llamantino7b_2_10_question-answering

This model is a fine-tuned version of swap-uniba/LLaMAntino-2-7b-hf-ITA. The training dataset is not specified (the auto-generated card records it as None). The model achieves the following results on the evaluation set:

  • Loss: 1.9768
  • Rouge1: 19.35
  • Rouge2: 10.89
  • Rougel: 18.01
  • Rougelsum: 18.19
  • R: 16.06
  • Gen Len: 1.0
  • R@1: 0.0
  • R@3: 0.0
  • R@5: 0.0
  • R@10: 0.0
  • R@20: 0.0
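
As a rough illustration of what the Rouge1 figure measures, the sketch below computes a unigram-overlap F1 score on the same 0 to 100 scale used above. It assumes lowercased whitespace tokenization; the evaluation script behind this card is not documented and may tokenize or stem differently. The function name `rouge1_f1` and the example strings are hypothetical.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 on a 0-100 scale (simplified ROUGE-1 sketch)."""
    # Assumption: lowercased whitespace tokens; real ROUGE tooling may differ.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0

    # Clipped overlap: each token counts at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical prediction/reference pair, not taken from the evaluation set.
score = rouge1_f1("il gatto nero", "il gatto bianco")
print(f"Rouge1 F1: {score:.2f}")
```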

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 10
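
To make the scheduler settings concrete, the sketch below computes the learning rate at a given optimizer step for a linear schedule with warmup. It assumes 230 total steps (10 epochs of 23 steps, as in the training results), that the warmup step count is rounded up from `total_steps * warmup_ratio`, and that the decay reaches zero at the last step. The function name `linear_warmup_lr` is hypothetical, and this is an approximation of the schedule rather than the exact training code.

```python
import math


def linear_warmup_lr(step: int,
                     base_lr: float = 2e-4,
                     total_steps: int = 230,
                     warmup_ratio: float = 0.03) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    # Assumption: warmup steps are rounded up (0.03 * 230 = 6.9 -> 7).
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for s in (0, 7, 115, 230):
    print(s, linear_warmup_lr(s))
```

Under these assumptions the rate peaks at 0.0002 after 7 steps, which is well within the first epoch.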

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | R | Gen Len | R@1 | R@3 | R@5 | R@10 | R@20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.7082 | 1.0 | 23 | 1.4263 | 16.0 | 5.83 | 14.16 | 14.7 | 11.97 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 1.1279 | 2.0 | 46 | 1.3286 | 17.65 | 8.25 | 16.23 | 16.85 | 14.02 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.6508 | 3.0 | 69 | 1.3710 | 18.49 | 9.03 | 16.79 | 17.44 | 14.74 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.3175 | 4.0 | 92 | 1.4644 | 19.18 | 9.83 | 17.33 | 17.52 | 15.42 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.1619 | 5.0 | 115 | 1.5859 | 18.93 | 10.18 | 17.36 | 17.62 | 15.47 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0925 | 6.0 | 138 | 1.6658 | 19.06 | 10.78 | 17.65 | 17.86 | 15.81 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0571 | 7.0 | 161 | 1.7225 | 18.85 | 10.28 | 17.28 | 17.54 | 15.45 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0279 | 8.0 | 184 | 1.8907 | 18.94 | 10.96 | 17.61 | 17.74 | 15.82 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0172 | 9.0 | 207 | 1.9558 | 19.32 | 10.79 | 18.01 | 18.18 | 16.02 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
| 0.0127 | 10.0 | 230 | 1.9768 | 19.35 | 10.89 | 18.01 | 18.19 | 16.06 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
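
The per-epoch figures above can be checked programmatically. The sketch below copies the training loss, validation loss, and Rouge1 columns from the results and looks up the epoch with the lowest validation loss and the epoch with the highest Rouge1. The variable names are hypothetical; the numbers are taken directly from the training results.

```python
# (epoch, training_loss, validation_loss, rouge1) copied from the results above.
results = [
    (1, 1.7082, 1.4263, 16.00),
    (2, 1.1279, 1.3286, 17.65),
    (3, 0.6508, 1.3710, 18.49),
    (4, 0.3175, 1.4644, 19.18),
    (5, 0.1619, 1.5859, 18.93),
    (6, 0.0925, 1.6658, 19.06),
    (7, 0.0571, 1.7225, 18.85),
    (8, 0.0279, 1.8907, 18.94),
    (9, 0.0172, 1.9558, 19.32),
    (10, 0.0127, 1.9768, 19.35),
]

# Epoch with the lowest validation loss.
best_loss_epoch, _, best_val_loss, _ = min(results, key=lambda row: row[2])

# Epoch with the highest Rouge1 score.
best_rouge_epoch, _, _, best_rouge1 = max(results, key=lambda row: row[3])

print(f"Lowest validation loss: {best_val_loss} at epoch {best_loss_epoch}")
print(f"Highest Rouge1: {best_rouge1} at epoch {best_rouge_epoch}")
```

Validation loss is lowest at epoch 2 (1.3286) and rises in every later epoch while training loss keeps falling toward 0.0127, a pattern usually associated with overfitting. Rouge1 nevertheless peaks at epoch 10, so the two criteria would select different checkpoints.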

Framework versions

  • PEFT 0.8.2
  • Transformers 4.38.0.dev0
  • Pytorch 2.0.1+cu117
  • Datasets 2.16.1
  • Tokenizers 0.15.2
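
Because the card lists PEFT among the framework versions, the weights are presumably distributed as an adapter on top of the base model. The sketch below shows one way such an adapter might be loaded. It assumes the base model loads as a causal language model and that the adapter lives in the repository `lvcalucioli/llamantino7b_2_10_question-answering`; neither assumption is confirmed by the card, and the helper name `load_adapter` is hypothetical.

```python
BASE_ID = "swap-uniba/LLaMAntino-2-7b-hf-ITA"
# Assumption: repository id of the adapter, not confirmed by the card text.
ADAPTER_ID = "lvcalucioli/llamantino7b_2_10_question-answering"


def load_adapter(base_id: str = BASE_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the PEFT adapter (downloads weights)."""
    # Local imports so that defining this helper does not require the libraries.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # Assumption: the base checkpoint is a causal language model.
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_adapter()` downloads several gigabytes of weights, so it is left uncalled here.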

Model files (Safetensors)

  • Model size: 3.6B params
  • Tensor types: F32, FP16, U8
