---
library_name: transformers
license: apache-2.0
base_model: pszemraj/tFINE-850m-24x24-v0.4-flan_aug
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5
    results: []
---

# tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5

This model is a fine-tuned version of [pszemraj/tFINE-850m-24x24-v0.4-flan_aug](https://huggingface.co/pszemraj/tFINE-850m-24x24-v0.4-flan_aug) on a dataset not recorded in the card metadata (the model name suggests the English T2T split of Infinity-Instruct-7M). It achieves the following results on the evaluation set (a usage sketch follows the metric list):

- Loss: 1.1526
- Rouge1: 40.1804
- Rouge2: 23.1008
- RougeL: 32.3484
- RougeLsum: 38.2103
- Gen Len: 422.225
- Num Input Tokens Seen: 421585440
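For orientation, here is a minimal inference sketch using the standard `transformers` seq2seq API. It assumes the checkpoint is published under the author's namespace with the name above; the prompt and generation settings are illustrative placeholders, not values from the training run.

```python
# Minimal inference sketch for this seq2seq checkpoint.
# Assumptions: the repo id below matches where the model is hosted;
# the prompt and max_new_tokens are illustrative placeholders.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "pszemraj/tFINE-850m-24x24-v0.4-flan_aug-infinity-instruct-7m-T2T_en-1024-v5"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

prompt = "Explain the difference between supervised and unsupervised learning."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```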

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 776444
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: constant_with_warmup
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1.0
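For reproducibility, here is a sketch of how the values above map onto `transformers` `Seq2SeqTrainingArguments`. Only the listed values come from the run; `output_dir` is a placeholder, and any setting not shown is left at its default.

```python
# Sketch: the listed hyperparameters expressed as Seq2SeqTrainingArguments.
# output_dir is a placeholder; all other values mirror the list above.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./tfine-850m-ii-t2t_en",  # placeholder path
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=776444,
    gradient_accumulation_steps=32,  # 4 per device x 32 steps = 128 effective
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.05,
    num_train_epochs=1.0,
)
```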

### Training results

| Training Loss | Epoch  | Step  | Validation Loss | Rouge1  | Rouge2  | RougeL  | RougeLsum | Gen Len | Input Tokens Seen |
|:-------------:|:------:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|:-----------------:|
| 1.8808        | 0.0807 | 1000  | 1.7883          | 24.1946 | 12.2099 | 20.4185 | 22.251    | 636.465 | 35147692          |
| 1.6545        | 0.1613 | 2000  | 1.5985          | 28.9492 | 15.3233 | 23.871  | 26.9919   | 577.04  | 70510224          |
| 1.5522        | 0.2420 | 3000  | 1.4907          | 30.4033 | 16.1354 | 24.7244 | 28.5037   | 537.77  | 105707144         |
| 1.5059        | 0.3227 | 4000  | 1.4204          | 34.0294 | 19.2608 | 27.9322 | 32.3166   | 522.495 | 140722844         |
| 1.4346        | 0.4034 | 5000  | 1.3636          | 34.4104 | 19.4149 | 28.1022 | 32.7299   | 494.68  | 175639924         |
| 1.3912        | 0.4840 | 6000  | 1.3159          | 36.5059 | 21.2447 | 30.116  | 34.7303   | 469.885 | 210409328         |
| 1.3148        | 0.5647 | 7000  | 1.2807          | 37.0123 | 21.3666 | 30.11   | 35.0891   | 458.28  | 245601908         |
| 1.2859        | 0.6454 | 8000  | 1.2492          | 37.05   | 21.0468 | 29.7988 | 35.1882   | 452.495 | 280866724         |
| 1.298         | 0.7260 | 9000  | 1.2211          | 36.6966 | 20.8189 | 29.7115 | 34.7528   | 464.37  | 316042068         |
| 1.2834        | 0.8067 | 10000 | 1.1979          | 37.7181 | 20.9926 | 30.3857 | 35.8681   | 446.26  | 351056548         |
| 1.2577        | 0.8874 | 11000 | 1.1752          | 39.3539 | 23.0123 | 31.9005 | 37.4941   | 424.445 | 386471860         |
| 1.193         | 0.9680 | 12000 | 1.1526          | 40.1804 | 23.1008 | 32.3484 | 38.2103   | 422.225 | 421585440         |
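The ROUGE columns are on a 0-100 scale, i.e. the fractional scores from the `evaluate` library multiplied by 100 (a common convention in `Trainer` metric callbacks; assumed here, not stated in the card). A minimal sketch of that computation, with dummy strings standing in for eval-set predictions and references:

```python
# Sketch of computing ROUGE scores like those reported above.
# The predictions/references are dummy stand-ins, not eval-set data.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["the cat sat on the mat"]
references = ["a cat sat on the mat"]
scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions in [0, 1]; the table reports them scaled by 100.
print({k: round(v * 100, 4) for k, v in scores.items()})
```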

### Framework versions

- Transformers 4.45.1
- Pytorch 2.4.1+cu124
- Datasets 3.0.1
- Tokenizers 0.20.0