---
license: mit
datasets:
- squad_v2
language:
- en
library_name: transformers
pipeline_tag: question-answering
tags:
- deberta
- deberta-v3
- question-answering
model-index:
- name: sjrhuschlee/deberta-v3-base-squad2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 85.648
      name: Exact Match
    - type: f1
      value: 88.728
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 87.862
      name: Exact Match
    - type: f1
      value: 93.924
      name: F1
---

# deberta-v3-base for QA

This is the [deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) model, fine-tuned on the [SQuAD2.0](https://huggingface.co/datasets/squad_v2) dataset. It was trained on question-answer pairs, including unanswerable questions, for extractive question answering.

## Overview
**Language model:** deberta-v3-base  
**Language:** English  
**Downstream-task:** Extractive QA  
**Training data:** SQuAD 2.0  
**Eval data:** SQuAD 2.0  
**Infrastructure:** 1x NVIDIA 3070  

## Model Usage
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline
model_name = "sjrhuschlee/deberta-v3-base-squad2"

# a) Using pipelines
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
qa_input = {
    'question': 'Where do I live?',
    'context': 'My name is Sarah and I live in London'
}
res = nlp(qa_input)  # dict with 'score', 'start', 'end', and 'answer' keys

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

## Metrics

```bash
# SQuAD 2.0
{
    "eval_HasAns_exact": 82.72604588394061,
    "eval_HasAns_f1": 88.89430905100325,
    "eval_HasAns_total": 5928,
    "eval_NoAns_exact": 88.56181665264928,
    "eval_NoAns_f1": 88.56181665264928,
    "eval_NoAns_total": 5945,
    "eval_best_exact": 85.64810915522614,
    "eval_best_exact_thresh": 0.0,
    "eval_best_f1": 88.72782481717712,
    "eval_best_f1_thresh": 0.0,
    "eval_exact": 85.64810915522614,
    "eval_f1": 88.72782481717726,
    "eval_runtime": 219.6226,
    "eval_samples": 11951,
    "eval_samples_per_second": 54.416,
    "eval_steps_per_second": 2.268,
    "eval_total": 11873
}

# SQuAD 1.1
{
    "eval_exact_match": 87.86187322611164,
    "eval_f1": 93.92373735474943,
    "eval_runtime": 195.2115,
    "eval_samples": 10618,
    "eval_samples_per_second": 54.392,
    "eval_steps_per_second": 2.269
}
```
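
As a quick sanity check on the SQuAD 2.0 numbers above, the overall `eval_exact` and `eval_f1` scores should equal the average of the `HasAns` and `NoAns` scores weighted by their question counts. The snippet below is a minimal sketch of that arithmetic using only the values reported in this card; the helper name `weighted_score` is hypothetical.

```python
def weighted_score(has_ans: float, has_total: int,
                   no_ans: float, no_total: int) -> float:
    """Combine per-subset scores into an overall score, weighted by subset size."""
    return (has_ans * has_total + no_ans * no_total) / (has_total + no_total)


# Values copied from the SQuAD 2.0 metrics reported above.
HAS_TOTAL, NO_TOTAL = 5928, 5945

overall_exact = weighted_score(82.72604588394061, HAS_TOTAL,
                               88.56181665264928, NO_TOTAL)
overall_f1 = weighted_score(88.89430905100325, HAS_TOTAL,
                            88.56181665264928, NO_TOTAL)

print(f"exact: {overall_exact:.3f}, f1: {overall_f1:.3f}")
```

Both results agree with the reported `eval_exact` and `eval_f1` to three decimal places.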

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 4.0
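
To make the relationship between these settings concrete, the sketch below recomputes the total train batch size and traces a linear schedule with warmup in plain Python. It assumes a single device (matching the 1x GPU listed under Infrastructure) and a warmup length of `ceil(total_steps * warmup_ratio)`; `lr_at` and the `total_steps=1000` figure are hypothetical and chosen only for illustration, since the actual number of optimizer steps is not stated in this card.

```python
import math

# Hyperparameters listed above.
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 8
NUM_DEVICES = 1          # assumption: single GPU
LEARNING_RATE = 5e-06
WARMUP_RATIO = 0.1

# Effective batch size per optimizer step.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step: int, total_steps: int,
          base_lr: float = LEARNING_RATE,
          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at a given optimizer step for linear warmup + linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length, for illustration only.
total_steps = 1000
schedule = [lr_at(s, total_steps) for s in (0, 50, 100, 550, 1000)]
```

With these settings the rate peaks at `5e-06` once warmup ends (step 100 in this hypothetical run) and decays linearly to zero by the final step.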

### Framework versions

- Transformers 4.30.0.dev0
- PyTorch 2.0.1+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3