library_name: transformers
license: apache-2.0
base_model: pszemraj/tFINE-850m-24x24-v0.4-flan_aug
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5
    results: []

tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5

This model is a fine-tuned version of pszemraj/tFINE-850m-24x24-v0.4-flan_aug; the training dataset was not recorded by the trainer. It achieves the following results on the evaluation set at the final logged checkpoint (step 12000, epoch 0.9680):

Loss: 1.1526
Rouge1: 40.1804
Rouge2: 23.1008
Rougel: 32.3484
Rougelsum: 38.2103
Gen Len: 422.225
Num Input Tokens Seen: 421585440
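
For intuition, the evaluation loss can be converted to a perplexity. This is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy in nats, which is the usual convention for this kind of trainer but is not stated on the card.

```python
import math

eval_loss = 1.1526  # final evaluation loss reported above

# Assumption: eval_loss is the mean token-level cross-entropy, measured in nats.
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```

If the loss were averaged differently (for example per sequence), this number would not be a per-token perplexity.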

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 4
eval_batch_size: 4
seed: 776444
gradient_accumulation_steps: 32
total_train_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: constant_with_warmup
lr_scheduler_warmup_ratio: 0.05
num_epochs: 1.0
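
The batch-size and scheduler entries above fit together as sketched below. The sketch assumes a single training device (the card does not state the device count, but 4 x 32 = 128 is consistent with one) and assumes `constant_with_warmup` means a linear ramp from zero to the peak rate followed by a flat rate. The total step count is an estimate derived from the last row of the results table (step 12000 at epoch 0.9680), not a logged value, and `lr_at_step` is a hypothetical helper written for illustration.

```python
import math

# Values copied from the hyperparameter list.
learning_rate = 3e-05
train_batch_size = 4
gradient_accumulation_steps = 32
warmup_ratio = 0.05

# Assumption: one device, so effective batch = per-device batch x accumulation steps.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Assumption: total optimizer steps estimated from the last logged row
# (step 12000 at epoch 0.9680); the true count is not given on the card.
estimated_total_steps = round(12000 / 0.9680)
estimated_warmup_steps = math.ceil(estimated_total_steps * warmup_ratio)


def lr_at_step(step: int, warmup_steps: int = estimated_warmup_steps) -> float:
    """Linear warmup from 0 to learning_rate, then constant (hypothetical helper)."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    return learning_rate


print(total_train_batch_size)   # effective batch size
print(estimated_warmup_steps)   # estimated length of the warmup phase
print(lr_at_step(0), lr_at_step(estimated_warmup_steps), lr_at_step(12000))
```

Under these assumptions, warmup covers roughly the first 5% of the run, so every row of the results table was logged at the constant peak rate.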

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len	Input Tokens Seen
1.8808	0.0807	1000	1.7883	24.1946	12.2099	20.4185	22.251	636.465	35147692
1.6545	0.1613	2000	1.5985	28.9492	15.3233	23.871	26.9919	577.04	70510224
1.5522	0.2420	3000	1.4907	30.4033	16.1354	24.7244	28.5037	537.77	105707144
1.5059	0.3227	4000	1.4204	34.0294	19.2608	27.9322	32.3166	522.495	140722844
1.4346	0.4034	5000	1.3636	34.4104	19.4149	28.1022	32.7299	494.68	175639924
1.3912	0.4840	6000	1.3159	36.5059	21.2447	30.116	34.7303	469.885	210409328
1.3148	0.5647	7000	1.2807	37.0123	21.3666	30.11	35.0891	458.28	245601908
1.2859	0.6454	8000	1.2492	37.05	21.0468	29.7988	35.1882	452.495	280866724
1.298	0.7260	9000	1.2211	36.6966	20.8189	29.7115	34.7528	464.37	316042068
1.2834	0.8067	10000	1.1979	37.7181	20.9926	30.3857	35.8681	446.26	351056548
1.2577	0.8874	11000	1.1752	39.3539	23.0123	31.9005	37.4941	424.445	386471860
1.193	0.9680	12000	1.1526	40.1804	23.1008	32.3484	38.2103	422.225	421585440
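
A few derived quantities make the table easier to read. The sketch below uses only the first and last rows, copied from the table. The per-example token figure assumes every optimizer step consumed the full effective batch of 128 examples, which the card does not confirm.

```python
# First and last logged rows: step, validation loss, ROUGE-1, input tokens seen.
first = {"step": 1000, "val_loss": 1.7883, "rouge1": 24.1946, "tokens": 35147692}
last = {"step": 12000, "val_loss": 1.1526, "rouge1": 40.1804, "tokens": 421585440}

total_train_batch_size = 128  # from the hyperparameter list

# Average input tokens consumed per optimizer step over the whole run.
tokens_per_step = last["tokens"] / last["step"]

# Assumption: each step processed exactly total_train_batch_size examples.
tokens_per_example = tokens_per_step / total_train_batch_size

# Change between the first and last evaluation.
val_loss_drop = first["val_loss"] - last["val_loss"]
rouge1_gain = last["rouge1"] - first["rouge1"]

print(f"{tokens_per_step:.0f} input tokens per step")
print(f"{tokens_per_example:.1f} input tokens per example")
print(f"validation loss drop: {val_loss_drop:.4f}")
print(f"ROUGE-1 gain: {rouge1_gain:.4f}")
```

The average of a few hundred input tokens per example sits well below the 1024 in the model name, if that number refers to the maximum input length.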

Framework versions

Transformers 4.45.1
PyTorch 2.4.1+cu124
Datasets 3.0.1
Tokenizers 0.20.0