Commit 503bdb3 by julien-c (HF staff) · 1 parent: e9cc714

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/elgeish/cs224n-squad2.0-albert-xxlarge-v1/README.md

Files changed (1)
  1. README.md +95 -0
README.md ADDED
@@ -0,0 +1,95 @@
+ ---
+ tags:
+ - exbert
+ ---
+
+ ## CS224n SQuAD2.0 Project Dataset
+ The goal of this model is to save CS224n students GPU time when establishing
+ baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
+ The training set used to fine-tune this model is the same as
+ the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
+ evaluation and model selection were performed using roughly half of the official
+ dev set, 6078 examples, picked at random. The data files (Winter 2020 version)
+ can be found at <https://github.com/elgeish/squad/tree/master/data>.
+ Given that the official SQuAD2.0 dev set contains the project's test
+ set, students must make sure not to use the official SQuAD2.0 dev set in any way.
+ This includes models fine-tuned on the official SQuAD2.0 data, since those
+ models used the official SQuAD2.0 dev set for model selection.
+
+ <a href="https://huggingface.co/exbert/?model=elgeish/cs224n-squad2.0-albert-xxlarge-v1">
+ <img width="300px" src="https://cdn-media.huggingface.co/exbert/button.png">
+ </a>
+
+ ## Results
+ ```json
+ {
+     "exact": 85.93287265547877,
+     "f1": 88.91258331187983,
+     "total": 6078,
+     "HasAns_exact": 84.36426116838489,
+     "HasAns_f1": 90.58786301361013,
+     "HasAns_total": 2910,
+     "NoAns_exact": 87.37373737373737,
+     "NoAns_f1": 87.37373737373737,
+     "NoAns_total": 3168,
+     "best_exact": 85.93287265547877,
+     "best_exact_thresh": 0.0,
+     "best_f1": 88.91258331187993,
+     "best_f1_thresh": 0.0
+ }
+ ```
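As a sanity check on the figures above, the overall `exact` and `f1` scores should equal the averages of the `HasAns` and `NoAns` scores, weighted by the number of examples in each subset. The sketch below recomputes them from the reported values. The helper name `weighted_average` is illustrative and is not taken from the original evaluation script.

```python
# Values copied from the Results block above.
results = {
    "exact": 85.93287265547877,
    "f1": 88.91258331187983,
    "total": 6078,
    "HasAns_exact": 84.36426116838489,
    "HasAns_f1": 90.58786301361013,
    "HasAns_total": 2910,
    "NoAns_exact": 87.37373737373737,
    "NoAns_f1": 87.37373737373737,
    "NoAns_total": 3168,
}


def weighted_average(has_ans_score, no_ans_score, has_ans_total, no_ans_total):
    """Combine two subset percentages into one, weighting by subset size."""
    total = has_ans_total + no_ans_total
    return (has_ans_score * has_ans_total + no_ans_score * no_ans_total) / total


overall_exact = weighted_average(
    results["HasAns_exact"], results["NoAns_exact"],
    results["HasAns_total"], results["NoAns_total"],
)
overall_f1 = weighted_average(
    results["HasAns_f1"], results["NoAns_f1"],
    results["HasAns_total"], results["NoAns_total"],
)

print(f"recomputed exact: {overall_exact:.6f} (reported {results['exact']:.6f})")
print(f"recomputed f1:    {overall_f1:.6f} (reported {results['f1']:.6f})")
```

For the `NoAns` subset, `exact` and `f1` coincide because a prediction on an unanswerable question is either correctly empty or not, with no partial credit.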
+
+ ## Notable Arguments
+ ```json
+ {
+     "do_lower_case": true,
+     "doc_stride": 128,
+     "fp16": false,
+     "fp16_opt_level": "O1",
+     "gradient_accumulation_steps": 24,
+     "learning_rate": 3e-05,
+     "max_answer_length": 30,
+     "max_grad_norm": 1,
+     "max_query_length": 64,
+     "max_seq_length": 512,
+     "model_name_or_path": "albert-xxlarge-v1",
+     "model_type": "albert",
+     "num_train_epochs": 4,
+     "per_gpu_train_batch_size": 1,
+     "save_steps": 1000,
+     "seed": 42,
+     "train_batch_size": 1,
+     "version_2_with_negative": true,
+     "warmup_steps": 814,
+     "weight_decay": 0
+ }
+ ```
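With a `train_batch_size` of 1 and `gradient_accumulation_steps` of 24, gradients from 24 forward/backward passes are accumulated before each optimizer update. A minimal sketch of that arithmetic follows. It assumes `train_batch_size` is the number of examples per forward pass across all devices, which the card does not state, and the function name `effective_batch_size` is hypothetical.

```python
# Values copied from the Notable Arguments block above.
args = {
    "per_gpu_train_batch_size": 1,
    "train_batch_size": 1,
    "gradient_accumulation_steps": 24,
}


def effective_batch_size(train_batch_size, gradient_accumulation_steps):
    """Number of examples contributing to a single optimizer update.

    Assumption: train_batch_size counts examples per forward pass
    across all devices (not stated in the model card).
    """
    return train_batch_size * gradient_accumulation_steps


examples_per_update = effective_batch_size(
    args["train_batch_size"], args["gradient_accumulation_steps"]
)
print(f"examples per optimizer update: {examples_per_update}")
```

Accumulating gradients this way trades wall-clock time for memory, which is one way to fit a 512-token sequence through a model of this size on a 16 GB GPU.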
+
+ ## Environment Setup
+ ```json
+ {
+     "transformers": "2.5.1",
+     "pytorch": "1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0",
+     "python": "3.6.5=hc3d631a_2",
+     "os": "Linux 4.15.0-1060-aws #62-Ubuntu SMP Tue Feb 11 21:23:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux",
+     "gpu": "Tesla V100-SXM2-16GB"
+ }
+ ```
+
+ ## How to Cite
+ ```BibTeX
+ @misc{elgeish2020gestalt,
+     title={Gestalt: a Stacking Ensemble for SQuAD2.0},
+     author={Mohamed El-Geish},
+     journal={arXiv e-prints},
+     archivePrefix={arXiv},
+     eprint={2004.07067},
+     year={2020},
+ }
+ ```
+
+ ## Related Models
+ * [elgeish/cs224n-squad2.0-albert-base-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-base-v2)
+ * [elgeish/cs224n-squad2.0-albert-large-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-large-v2)
+ * [elgeish/cs224n-squad2.0-distilbert-base-uncased](https://huggingface.co/elgeish/cs224n-squad2.0-distilbert-base-uncased)
+ * [elgeish/cs224n-squad2.0-roberta-base](https://huggingface.co/elgeish/cs224n-squad2.0-roberta-base)