Update README.md
README.md
CHANGED
@@ -26,10 +26,41 @@ More information needed
 
 ## Training and evaluation data
 
-
+Trained and evaluated on the [squad dataset](https://huggingface.co/datasets/squad).
 
 ## Training procedure
 
+Trained on 16 Graphcore Mk2 IPUs using [optimum-graphcore](https://github.com/huggingface/optimum-graphcore).
+
+Command line:
+
+```
+python examples/question-answering/run_qa.py \
+--ipu_config_name Graphcore/roberta-base-ipu \
+--model_name_or_path roberta-base \
+--dataset_name squad \
+--do_train \
+--do_eval \
+--num_train_epochs 2 \
+--per_device_train_batch_size 4 \
+--per_device_eval_batch_size 2 \
+--pod_type pod16 \
+--learning_rate 6e-5 \
+--max_seq_length 384 \
+--doc_stride 128 \
+--seed 1984 \
+--lr_scheduler_type linear \
+--loss_scaling 64 \
+--weight_decay 0.01 \
+--warmup_ratio 0.25 \
+--logging_steps 1 \
+--save_steps -1 \
+--dataloader_num_workers 64 \
+--output_dir squad_roberta_base \
+--overwrite_output_dir \
+--push_to_hub
+```
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -48,7 +79,21 @@ The following hyperparameters were used during training:
 
 ### Training results
 
-
+```
+***** train metrics *****
+epoch = 2.0
+train_loss = 1.2528
+train_runtime = 0:02:14.50
+train_samples = 88568
+train_samples_per_second = 1316.952
+train_steps_per_second = 5.13
+
+***** eval metrics *****
+epoch = 2.0
+eval_exact_match = 85.2696
+eval_f1 = 91.7455
+eval_samples = 10790
+```
 
 ### Framework versions
 
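As a sanity check on the train metrics added in this diff, the reported throughput can be roughly reproduced from the other logged values. The sketch below is not part of the commit; it assumes that `train_samples_per_second` counts samples over all epochs (so `train_samples * epoch / train_runtime`), and the helper name `runtime_to_seconds` is hypothetical.

```python
def runtime_to_seconds(runtime: str) -> float:
    """Convert an 'H:MM:SS.ss' runtime string into seconds."""
    hours, minutes, seconds = runtime.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Values copied from the "train metrics" block in the diff.
train_samples = 88568
epochs = 2.0
runtime_s = runtime_to_seconds("0:02:14.50")

# Assumption: throughput is measured over every epoch, not a single pass.
samples_per_second = train_samples * epochs / runtime_s

print(f"runtime: {runtime_s} s")
print(f"estimated samples/s: {samples_per_second:.1f}")
```

The estimate lands within a fraction of a sample per second of the logged `1316.952`; the small gap is consistent with the runtime being rounded to two decimals in the log.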