Graphcore/roberta-base-squad

Question Answering

Generated from Trainer

jimypbr committed on Mar 23, 2022

Commit d5da0f8 · 1 Parent(s): 3e7eb9c

Update README.md

Files changed (1)

README.md +47 -2

@@ -26,10 +26,41 @@ More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -48,7 +79,21 @@ The following hyperparameters were used during training:
 ### Training results
 ### Framework versions

 ## Training and evaluation data
+Trained and evaluated on the [squad dataset](https://huggingface.co/datasets/squad).
 ## Training procedure
+Trained on 16 Graphcore Mk2 IPUs using [optimum-graphcore](https://github.com/huggingface/optimum-graphcore).
+Command line:
+```
+python examples/question-answering/run_qa.py \
+  --ipu_config_name Graphcore/roberta-base-ipu \
+  --model_name_or_path roberta-base \
+  --dataset_name squad \
+  --do_train \
+  --do_eval \
+  --num_train_epochs 2 \
+  --per_device_train_batch_size 4 \
+  --per_device_eval_batch_size 2 \
+  --pod_type pod16 \
+  --learning_rate 6e-5 \
+  --max_seq_length 384 \
+  --doc_stride 128 \
+  --seed 1984 \
+  --lr_scheduler_type linear \
+  --loss_scaling 64 \
+  --weight_decay 0.01 \
+  --warmup_ratio 0.25 \
+  --logging_steps 1 \
+  --save_steps -1 \
+  --dataloader_num_workers 64 \
+  --output_dir squad_roberta_base \
+  --overwrite_output_dir \
+  --push_to_hub
+```
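Review note: the flags `--learning_rate 6e-5`, `--lr_scheduler_type linear`, and `--warmup_ratio 0.25` in the command above describe a learning rate that ramps up linearly over the first quarter of training and then decays linearly to zero. The sketch below illustrates that shape, assuming the IPU trainer follows the standard `transformers` linear schedule. `linear_schedule_lr` and `TOTAL_STEPS` are hypothetical names and values chosen for illustration; they are not taken from this commit.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr=6e-5, warmup_ratio=0.25):
    """Learning rate at `step` for linear warmup followed by linear decay.

    Assumption: mirrors the usual transformers linear schedule, where the
    warmup length is ceil(total_steps * warmup_ratio).
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale from peak_lr down to 0 at total_steps.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


TOTAL_STEPS = 1000  # hypothetical step count, not the value from this run

if __name__ == "__main__":
    for step in (0, 125, 250, 625, 1000):
        print(step, linear_schedule_lr(step, TOTAL_STEPS))
```

With these illustrative numbers the peak of 6e-5 is reached at step 250, and the rate is back to half the peak at step 625.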
 ### Training hyperparameters
 The following hyperparameters were used during training:
 ### Training results
+```
+***** train metrics *****
+  epoch                    =        2.0
+  train_loss               =     1.2528
+  train_runtime            = 0:02:14.50
+  train_samples            =      88568
+  train_samples_per_second =   1316.952
+  train_steps_per_second   =       5.13
+***** eval metrics *****
+  epoch            =     2.0
+  eval_exact_match = 85.2696
+  eval_f1          = 91.7455
+  eval_samples     =   10790
+```
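Review note: the train metrics above can be cross-checked against each other. The sketch below recomputes throughput from `train_samples`, `epoch`, and `train_runtime`, and derives the number of samples consumed per step implied by the two reported rates. All input values are copied from the log; the variable names are illustrative, and the per-step figure is only approximate because `train_steps_per_second` is rounded to two decimals.

```python
# Values copied from the "train metrics" block in this diff.
train_samples = 88568
epochs = 2.0
runtime_s = 2 * 60 + 14.50  # train_runtime 0:02:14.50 in seconds
reported_samples_per_s = 1316.952
reported_steps_per_s = 5.13

# Throughput recomputed from sample count, epochs, and wall-clock time.
derived_samples_per_s = train_samples * epochs / runtime_s

# Samples per training step implied by the two reported rates (approximate).
samples_per_step = reported_samples_per_s / reported_steps_per_s

if __name__ == "__main__":
    print(f"derived samples/s: {derived_samples_per_s:.1f}")
    print(f"implied samples/step: {samples_per_step:.1f}")
```

The recomputed throughput agrees with the reported `train_samples_per_second` to within rounding of the runtime, which suggests the logged figures are internally consistent.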
 ### Framework versions