---
license: apache-2.0
tags:
- summarization
datasets:
- philschmid/prompted-germanquad
widget:
- text: >
    Philipp ist 26 Jahre alt und lebt in Nürnberg, Deutschland. Derzeit
    arbeitet er als Machine Learning Engineer und Tech Lead bei Hugging Face,
    um künstliche Intelligenz durch Open Source und Open Science zu
    demokratisieren.
    Welches Ziel hat Hugging Face?
metrics:
- rouge
model-index:
- name: mt5-small-prompted-germanquad-1
  results: []
---
# mt5-small-prompted-germanquad-1
This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the [philschmid/prompted-germanquad](https://huggingface.co/datasets/philschmid/prompted-germanquad) dataset, a prompted dataset built with the BigScience [PromptSource](https://github.com/bigscience-workshop/promptsource) library. The dataset is a copy of germanquad with the SQuAD prompt template applied and translated to German.
This is a first test of whether mt5 models can be fine-tuned to solve tasks similar to BigScience's T0, but for the German language.
It achieves the following results on the evaluation set:
- Loss: 1.6835
- Rouge1: 27.7309
- Rouge2: 18.7311
- Rougel: 27.4704
- Rougelsum: 27.4818
## Model description
More information needed
## Intended uses & limitations
More information needed
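While the card does not yet document intended uses in detail, the sketch below shows one way to query the checkpoint with the prompt format from the widget above, using the `transformers` pipeline API. The repository id `philschmid/mt5-small-prompted-germanquad-1` and the generation settings are assumptions, not stated in this card.

```python
from transformers import pipeline

# Repository id is an assumption based on the model name above; adjust as needed.
qa = pipeline("text2text-generation", model="philschmid/mt5-small-prompted-germanquad-1")

# prompted-germanquad format: German context followed by the question.
prompt = (
    "Philipp ist 26 Jahre alt und lebt in Nürnberg, Deutschland. Derzeit "
    "arbeitet er als Machine Learning Engineer und Tech Lead bei Hugging Face, "
    "um künstliche Intelligenz durch Open Source und Open Science zu demokratisieren.\n"
    "Welches Ziel hat Hugging Face?"
)

print(qa(prompt, max_length=64)[0]["generated_text"])
```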
## Training and evaluation data
More information needed
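The dataset referenced above can be inspected directly from the Hub. A minimal sketch, assuming the default split layout; split and column names should be checked against the dataset card rather than taken from this example:

```python
from datasets import load_dataset

# Load the prompted GermanQuAD copy referenced in this card.
ds = load_dataset("philschmid/prompted-germanquad")

# Print the split layout and one example; column names are assumptions,
# so inspect the output rather than relying on specific fields.
print(ds)
first_split = next(iter(ds.values()))
print(first_split[0])
```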
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5.6e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 7
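As a rough reconstruction, the hyperparameters above correspond approximately to the following `Seq2SeqTrainingArguments`; `output_dir`, `predict_with_generate`, and anything not listed above are assumptions, not taken from the card:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mt5-small-prompted-germanquad-1",  # assumption: not stated in the card
    learning_rate=5.6e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=7,
    predict_with_generate=True,  # assumption: needed to compute ROUGE at evaluation time
)
```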
### Training results
| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|
| 3.3795        | 1.0   | 17496  | 2.0693          | 15.8652 | 9.2569  | 15.6237 | 15.6142   |
| 2.3582        | 2.0   | 34992  | 1.9057          | 21.9348 | 14.0057 | 21.6769 | 21.6825   |
| 2.1809        | 3.0   | 52488  | 1.8143          | 24.3401 | 16.0354 | 24.0862 | 24.0914   |
| 2.0721        | 4.0   | 69984  | 1.7563          | 25.8672 | 17.2442 | 25.5854 | 25.6051   |
| 2.0004        | 5.0   | 87480  | 1.7152          | 27.0275 | 18.0548 | 26.7561 | 26.7685   |
| 1.9531        | 6.0   | 104976 | 1.6939          | 27.4702 | 18.5156 | 27.2027 | 27.2107   |
| 1.9218        | 7.0   | 122472 | 1.6835          | 27.7309 | 18.7311 | 27.4704 | 27.4818   |
### Framework versions
- Transformers 4.14.1
- Pytorch 1.10.1+cu102
- Datasets 1.16.1
- Tokenizers 0.10.3