license: apache-2.0
base_model: sentence-transformers/all-mpnet-base-v2
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: IKT_classifier_conditional_best
results: []
widget:
- text: >-
Brick Kilns. Enforcement and Improved technology use. Residential and
Commercial. Enhanced use of energy- efficient appliances in household and
commercial buildings. F-Gases. Implement Montreal Protocol targets.
Industry. Achieve 10% Energy efficiency in the Industry sub-sector through
measures according to the Energy Efficiency and Conservation Master Plan
(EECMP). Agriculture. Implementation of 5925 Nos. solar irrigation pumps
(generating 176.38MW) for agriculture. Brick Kilns. 14% emission reduction
through Banning Fixed Chimney kiln (FCK), encourage advanced technology
and non-fired brick use. Residential and Commercial.
example_title: UNCONDITIONAL
- text: >-
Achieve 20% Energy efficiency in the Industry sub-sector through measures
according to the Energy Efficiency and Conservation Master Plan (EECMP).
Promote green Industry. Promote carbon financing. Agriculture. Enhanced
use of solar energy in Agriculture. Agriculture. Implementation of 4102
Nos. solar irrigation pumps (generating 164 MW) for agriculture. Brick
Kilns. Enforcement and Improved technology use. Brick Kilns. 47% emission
reduction through Banning Fixed Chimney kiln (FCK), encourage advanced
technology and non-fired brick use. Residential and Commercial.
example_title: CONDITIONAL
- text: >-
The GHG emission reductions from Cairo metro network includes the
rehabilitation of existing lines 1, 2, and 3. • The development of
Alexandria Metro (Abu Qir – Alexandria railway line) and rehabilitation of
the Raml tram line.
example_title: CONDITIONAL
IKT_classifier_conditional_best
This model is a fine-tuned version of sentence-transformers/all-mpnet-base-v2 on the GIZ/policy_qa_v0_1 dataset. It achieves the following results on the evaluation set:
- Loss: 0.5371
- Precision Macro: 0.8714
- Precision Weighted: 0.8713
- Recall Macro: 0.8711
- Recall Weighted: 0.8712
- F1-score: 0.8712
- Accuracy: 0.8712
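The macro and weighted figures above differ only in how per-class scores are averaged. As a minimal sketch (the confusion counts below are hypothetical and are not taken from this model's evaluation set), the two averaging schemes can be computed as follows. Weighted recall equals accuracy by construction, which is why those two values coincide in this card. The card does not state which averaging the reported F1-score uses.

```python
def summarize(confusion):
    """Compute accuracy plus macro- and weighted-averaged precision/recall.

    `confusion[true_label][predicted_label]` holds a count of passages.
    """
    labels = list(confusion)
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[label][label] for label in labels)

    precisions, recalls, supports = [], [], []
    for label in labels:
        tp = confusion[label][label]
        predicted = sum(confusion[other][label] for other in labels)  # tp + fp
        support = sum(confusion[label].values())                      # tp + fn
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / support if support else 0.0)
        supports.append(support)

    def macro(values):
        # Every class counts equally, regardless of how many examples it has.
        return sum(values) / len(values)

    def weighted(values):
        # Each class is weighted by its share of the true labels.
        return sum(v * s for v, s in zip(values, supports)) / total

    return {
        "accuracy": correct / total,
        "precision_macro": macro(precisions),
        "precision_weighted": weighted(precisions),
        "recall_macro": macro(recalls),
        "recall_weighted": weighted(recalls),
    }


# Hypothetical, deliberately imbalanced counts for illustration only.
example = {
    "Unconditional": {"Unconditional": 120, "Conditional": 30},
    "Conditional": {"Unconditional": 10, "Conditional": 40},
}
metrics = summarize(example)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With balanced classes the macro and weighted averages converge, which is consistent with the near-identical pairs reported above.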
Model description
The model is a binary text classifier built on 'sentence-transformers/all-mpnet-base-v2' and fine-tuned on text sourced from national climate policy documents.
Intended uses & limitations
The classifier assigns a class of 'Unconditional' or 'Conditional' to denote the strength of commitments as portrayed in extracted passages from the documents. The intended use is for climate policy researchers and analysts seeking to automate the process of reviewing lengthy, non-standardized PDF documents to produce summaries and reports.
Due to inconsistencies in the training data, the classifier's performance leaves room for improvement. The classifier exhibits reasonably good training metrics (F1 ~ 0.85), balanced between precise identification of true positives (precision ~ 0.85) and a wide net that captures as many true positives as possible (recall ~ 0.85). When tested on real-world unseen test data, performance was suboptimal for a binary classifier (F1 ~ 0.5). However, testing was based on a small out-of-sample dataset containing its own inconsistencies, so classification may prove more robust in practice.
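Given the modest out-of-sample performance, one reasonable pattern is to route low-confidence predictions to manual review. The sketch below is an illustration, not a documented workflow for this model: `MODEL_ID` is a placeholder that must be replaced with the real Hub repository ID, `triage` and the `min_score` threshold of 0.8 are hypothetical, and the label strings the model actually emits depend on its configuration.

```python
# Placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "your-namespace/IKT_classifier_conditional_best"


def load_classifier(model_id=MODEL_ID):
    """Build a Hugging Face text-classification pipeline (downloads weights)."""
    # Imported lazily so that `triage` can be used and tested without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def triage(passages, classify, min_score=0.8):
    """Label each passage and flag predictions below `min_score` for review.

    `classify` is any callable that maps a list of strings to a list of
    {"label": str, "score": float} dicts, as a text-classification
    pipeline does.
    """
    predictions = classify(passages)
    results = []
    for passage, prediction in zip(passages, predictions):
        results.append(
            {
                "text": passage,
                "label": prediction["label"],
                "score": prediction["score"],
                "needs_review": prediction["score"] < min_score,
            }
        )
    return results


# Example usage (requires network access and the real model ID):
#   classifier = load_classifier()
#   rows = triage(passages, lambda texts: classifier(texts, truncation=True))
```

Passing `truncation=True` keeps long extracted passages within the encoder's maximum input length. The threshold should be tuned on labelled passages from the target documents rather than taken from this sketch.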
Training and evaluation data
The dataset comprises data from two sources:
- ClimateWatch NDC Sector data
- IKI TraCS Climate Strategies for Transport Tracker implemented by GIZ and funded by the International Climate Initiative (IKI) of the German Federal Ministry for Economic Affairs and Climate Action (BMWK).
From the first source, we take
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4.112924307850544e-05
- train_batch_size: 3
- eval_batch_size: 3
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 400.0
- num_epochs: 5
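To make the scheduler settings concrete, here is a minimal sketch of how the learning rate would evolve, assuming the `linear` scheduler follows the usual shape of linear warmup from zero followed by linear decay to zero. The total of 3490 optimizer steps is taken from the final row of the training results table (5 epochs of 698 steps); the function name is hypothetical.

```python
BASE_LR = 4.112924307850544e-05  # learning_rate from the list above
WARMUP_STEPS = 400               # lr_scheduler_warmup_steps
TOTAL_STEPS = 3490               # 5 epochs x 698 steps, per the results table


def linear_schedule_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
                       total_steps=TOTAL_STEPS):
    """Learning rate at a given optimizer step under warmup + linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr at the end of warmup to 0 at the end.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 200, 400, 1945, 3490):
    print(f"step {step:>4}: lr = {linear_schedule_lr(step):.3e}")
```

Under these assumptions the peak rate is reached at step 400, a little over halfway through the first epoch.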
Training results
Training Loss | Epoch | Step | Validation Loss | Precision Macro | Precision Weighted | Recall Macro | Recall Weighted | F1-score | Accuracy |
---|---|---|---|---|---|---|---|---|---|
0.6658 | 1.0 | 698 | 0.7196 | 0.7391 | 0.7381 | 0.7102 | 0.7124 | 0.7028 | 0.7124 |
0.6301 | 2.0 | 1396 | 0.4965 | 0.8073 | 0.8075 | 0.8071 | 0.8069 | 0.8069 | 0.8069 |
0.5252 | 3.0 | 2094 | 0.5307 | 0.8300 | 0.8297 | 0.8279 | 0.8283 | 0.8279 | 0.8283 |
0.3513 | 4.0 | 2792 | 0.5261 | 0.8626 | 0.8627 | 0.8626 | 0.8627 | 0.8626 | 0.8627 |
0.2979 | 5.0 | 3490 | 0.5371 | 0.8714 | 0.8713 | 0.8711 | 0.8712 | 0.8712 | 0.8712 |
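In the table, validation loss is lowest at epoch 2 (0.4965) while the F1-score keeps rising through epoch 5, and training loss continues to fall throughout. The short sketch below, which copies a few columns from the table and uses hypothetical variable names, shows how the choice of checkpoint depends on the selection criterion. It does not describe how the published checkpoint was actually selected.

```python
# (epoch, training loss, validation loss, F1-score), copied from the table above.
history = [
    (1, 0.6658, 0.7196, 0.7028),
    (2, 0.6301, 0.4965, 0.8069),
    (3, 0.5252, 0.5307, 0.8279),
    (4, 0.3513, 0.5261, 0.8626),
    (5, 0.2979, 0.5371, 0.8712),
]

# Checkpoint preferred by each criterion.
best_by_val_loss = min(history, key=lambda row: row[2])
best_by_f1 = max(history, key=lambda row: row[3])

print(f"lowest validation loss: epoch {best_by_val_loss[0]} ({best_by_val_loss[2]})")
print(f"highest F1-score:       epoch {best_by_f1[0]} ({best_by_f1[3]})")

# Gap between validation and training loss; a growing gap can hint at overfitting.
gaps = {epoch: round(val - train, 4) for epoch, train, val, _ in history}
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: validation - training loss = {gap:+.4f}")
```

The widening gap in the last two epochs, alongside still-improving F1, suggests the model grows more confident on its errors even as its decisions improve, which may be worth keeping in mind when interpreting prediction scores.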
Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.13.1
- Tokenizers 0.13.3