<!---
Copyright 2021 The Google Flax Team Authors and HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
# Question Answering examples

Based on the script [`run_qa.py`](https://github.com/huggingface/transformers/blob/main/examples/flax/question-answering/run_qa.py).
**Note:** This script only works with models that have a fast tokenizer (backed by the 🤗 Tokenizers library), as it
uses special features of those tokenizers. You can check whether your favorite model has a fast tokenizer in
[this table](https://huggingface.co/transformers/index.html#supported-frameworks); if it doesn't, you can still use the old version
of the script.
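
If you prefer to check programmatically, a fast tokenizer exposes `is_fast = True`. A minimal sketch, assuming `bert-base-uncased` is the model you want to verify:

```python
from transformers import AutoTokenizer

# Loads the fast (Rust-backed) tokenizer when one is available for this model.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# True for fast tokenizers; the script requires this to be True.
print(tokenizer.is_fast)
```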

The following example fine-tunes BERT on SQuAD:

```bash
python run_qa.py \
  --model_name_or_path bert-base-uncased \
  --dataset_name squad \
  --do_train \
  --do_eval \
  --max_seq_length 384 \
  --doc_stride 128 \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --per_device_train_batch_size 12 \
  --output_dir ./bert-qa-squad \
  --eval_steps 1000 \
  --push_to_hub
```

Using the command above, the script will train for 2 epochs and run an evaluation after each epoch.
Metrics and hyperparameters are stored as TensorBoard event files in `--output_dir`.
You can see the results by running `tensorboard` in that directory:

```bash
$ tensorboard --logdir .
```

or directly on the hub under *Training metrics*.

Training with the previously defined hyper-parameters yields the following results:

```bash
f1 = 88.62
exact_match = 81.34
```

Sample metrics: [tensorboard.dev](https://tensorboard.dev/experiment/6gU75Hx8TGCnc6tr4ZgI9Q)

Here is an example of training on 4 TITAN RTX GPUs with the Bert Whole Word Masking uncased model to reach an F1 > 93 on SQuAD1.1:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
python run_qa.py \
  --model_name_or_path bert-large-uncased-whole-word-masking \
  --dataset_name squad \
  --do_train \
  --do_eval \
  --per_device_train_batch_size 6 \
  --learning_rate 3e-5 \
  --num_train_epochs 2 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir ./wwm_uncased_finetuned_squad/ \
  --eval_steps 1000 \
  --push_to_hub
```

Training with the previously defined hyper-parameters yields the following results:

```bash
f1 = 93.31
exact_match = 87.04
```

### Usage notes

Note that when contexts are long, they may be split into multiple overlapping training features (controlled by `--max_seq_length` and `--doc_stride`), and not all of those features will contain the answer span; see the sketch below.
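
This splitting relies on the fast tokenizer's overflow support, which is also why a fast tokenizer is required. A minimal sketch of the idea (parameter values mirror the commands above; the long context here is made up purely for illustration):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

question = "What does the script do?"
context = "The run_qa.py script fine-tunes a model on extractive question answering. " * 50

# One long (question, context) pair becomes several overlapping features.
encoded = tokenizer(
    question,
    context,
    truncation="only_second",        # only truncate the context, never the question
    max_length=384,                  # matches --max_seq_length
    stride=128,                      # matches --doc_stride (overlap between chunks)
    return_overflowing_tokens=True,  # keep the extra chunks instead of dropping them
    return_offsets_mapping=True,     # needed later to map answers back to the text
)

print(len(encoded["input_ids"]))  # number of features produced for this single example
```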

As-is, the example script will train on SQuAD or any other question-answering dataset formatted the same way, and can handle user-provided
input files as well.
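
If you bring your own files, they need to follow the SQuAD layout. A hypothetical sketch of writing one training record (the file name `my_train.json` and the wrapping `"data"` key follow the SQuAD convention; double-check against what your version of the script expects for its `--train_file`/`--validation_file` arguments):

```python
import json

# One SQuAD-style training example: a context, a question, and the answer
# text together with its character start offset inside the context.
context = "The Flax question answering example fine-tunes models on SQuAD."
answer = "SQuAD"
example = {
    "id": "0",
    "title": "Example",
    "context": context,
    "question": "What dataset does the example fine-tune on?",
    "answers": {"text": [answer], "answer_start": [context.index(answer)]},
}

with open("my_train.json", "w") as f:
    json.dump({"data": [example]}, f)
```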

### Memory usage and data loading

One thing to note is that all data is loaded into memory in this script. Most question-answering datasets are small
enough that this is not an issue, but if you have a very large dataset you will need to modify the script to handle
data streaming.
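
One possible starting point for very large datasets is the streaming mode of 🤗 Datasets, which iterates over examples without loading the full dataset into memory. A minimal sketch of the API (not what the script does today, just an illustration):

```python
from datasets import load_dataset

# streaming=True returns an IterableDataset: examples are fetched lazily
# instead of being downloaded and loaded into memory up front.
streamed = load_dataset("squad", split="train", streaming=True)

for i, example in enumerate(streamed):
    print(example["question"])
    if i == 2:  # just peek at the first few examples
        break
```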