metadata
library_name: transformers
license: mit
base_model: facebook/m2m100_1.2B
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: model_output
    results: []
datasets:
  - ArielUW/jobtitles

model_output

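This model is a fine-tuned version of facebook/m2m100_1.2B on the ArielUW/jobtitles dataset (the dataset listed in the metadata). It achieves the following results on the evaluation set, which correspond to the epoch 6 (step 228) row of the training results: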

  • Loss: 1.7263
  • Bleu: 93.9441
  • Gen Len: 36.358

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

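The metadata lists ArielUW/jobtitles as the dataset. More information needed on the train/evaluation split and preprocessing.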

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adafactor (OptimizerNames.ADAFACTOR), with no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 7
  • mixed_precision_training: Native AMP
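As a rough sketch of how the values above fit together, the snippet below collects them under the matching `Seq2SeqTrainingArguments` parameter names, checks the effective batch size, and evaluates a linear learning-rate decay. Several things here are assumptions not stated in the card: a single training device, `fp16=True` as the source of "Native AMP", zero warmup steps, and 259 total optimizer steps (the last step logged in the training results). The helper names are hypothetical.

```python
# Hyperparameters from the list above, keyed by the corresponding
# Seq2SeqTrainingArguments parameter names (a plain dict, nothing is trained here).
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "optim": "adafactor",
    "lr_scheduler_type": "linear",
    "num_train_epochs": 7,
    "fp16": True,  # assumption: "Native AMP" was produced by fp16 training
}

NUM_DEVICES = 1    # assumption: the card does not state the device count
WARMUP_STEPS = 0   # assumption: no warmup is listed in the card
TOTAL_STEPS = 259  # assumption: last step logged in the training results


def effective_batch_size(kwargs, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer step."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Linear warmup to base_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


print(effective_batch_size(training_kwargs))  # 128, matching total_train_batch_size
print(linear_lr(0), linear_lr(TOTAL_STEPS))
```

Under these assumptions, 32 examples per batch times 4 accumulation steps on one device reproduces the reported total_train_batch_size of 128.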

Training results

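| Training Loss | Epoch  | Step | Validation Loss | Bleu    | Gen Len |
|---------------|--------|------|-----------------|---------|---------|
| 23.051        | 1.0    | 38   | 4.3445          | 89.5045 | 35.746  |
| 15.9099       | 2.0    | 76   | 3.5044          | 91.9617 | 36.366  |
| 12.7846       | 3.0    | 114  | 2.8211          | 92.7676 | 36.22   |
| 10.3083       | 4.0    | 152  | 2.3006          | 93.675  | 36.284  |
| 8.4622        | 5.0    | 190  | 1.9316          | 93.6498 | 36.348  |
| 7.3015        | 6.0    | 228  | 1.7263          | 93.9441 | 36.358  |
| 6.8211        | 6.8212 | 259  | 1.6685          | 93.7274 | 36.306  |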
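The headline metrics near the top of the card match the step 228 row rather than the final row. The short sketch below, using only the numbers from the training results, shows that step 228 has the highest BLEU while step 259 has the lowest validation loss. One plausible reading is that the reported checkpoint was selected by BLEU, but the card does not say so; the field names are chosen here for illustration.

```python
# Rows copied from the training results:
# (training_loss, epoch, step, validation_loss, bleu, gen_len)
ROWS = [
    (23.051, 1.0, 38, 4.3445, 89.5045, 35.746),
    (15.9099, 2.0, 76, 3.5044, 91.9617, 36.366),
    (12.7846, 3.0, 114, 2.8211, 92.7676, 36.22),
    (10.3083, 4.0, 152, 2.3006, 93.675, 36.284),
    (8.4622, 5.0, 190, 1.9316, 93.6498, 36.348),
    (7.3015, 6.0, 228, 1.7263, 93.9441, 36.358),
    (6.8211, 6.8212, 259, 1.6685, 93.7274, 36.306),
]
FIELDS = ("training_loss", "epoch", "step", "validation_loss", "bleu", "gen_len")

records = [dict(zip(FIELDS, row)) for row in ROWS]

# Checkpoint with the highest BLEU versus the one with the lowest validation loss.
best_bleu = max(records, key=lambda r: r["bleu"])
lowest_val_loss = min(records, key=lambda r: r["validation_loss"])

print(best_bleu["step"], best_bleu["bleu"])                          # 228 93.9441
print(lowest_val_loss["step"], lowest_val_loss["validation_loss"])   # 259 1.6685
```

The two criteria disagree by one evaluation point, so which checkpoint counts as "best" depends on whether BLEU or validation loss is the selection metric.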

Framework versions

  • Transformers 4.47.1
  • PyTorch 2.5.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0