Commit 90d24b3 by philippelaban (1 parent: 3ab36de)

Update README.md

  1. README.md +56 -0
README.md CHANGED
@@ -1,3 +1,59 @@
---
license: apache-2.0
---

# QA Consolidation Model
Model card for the QA Consolidation model (step 3) of the Discord Questions framework (Findings of EMNLP 2022).

The model is RoBERTa-large, fine-tuned on the [MOCHA dataset](https://arxiv.org/abs/2010.03636) and a 5-point version of the [Answer Equivalence](https://arxiv.org/abs/2202.07654v1) dataset. Given a (question, answer1, answer2) tuple, the model outputs an answer similarity score on a 1-5 scale, where 5 is most similar. The score is read directly from the model's first logit, so it can fall slightly outside this range (the example output includes values just below 1).

Example usage of the model:
```py
import itertools

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Checkpoint location as given in the original example (a local path).
model_path = "/export/share/plaban/models/qa_consolidation/"
ae_tokenizer = AutoTokenizer.from_pretrained(model_path)
ae_model = AutoModelForSequenceClassification.from_pretrained(model_path).eval()

question = "When will the recession happen?"
answers = ["probably next January", "never", "we're already in a recession", "it won't happen", "it's going on right now", "not before next year", "upcoming January-March"]

# One input per unordered answer pair: "question <sep> answer1 <sep> answer2".
dataset = [{"a1": a1, "a2": a2, "input": "%s <sep> %s <sep> %s" % (question, a1, a2)} for a1, a2 in itertools.combinations(answers, 2)]

input_ids = ae_tokenizer.batch_encode_plus([d["input"] for d in dataset], add_special_tokens=False, padding=True, return_tensors="pt")["input_ids"]

# The first logit is the similarity score; gradients are not needed for inference.
with torch.no_grad():
    scores = ae_model(input_ids=input_ids)["logits"][:, 0].tolist()
for d, score in zip(dataset, scores):
    d["score"] = score

# Print the pairs from most to least similar.
for d in sorted(dataset, key=lambda d: -d["score"]):
    print("[Score: %.3f] %s" % (d["score"], d["input"]))
```

The output then looks like:
```
[Score: 4.980] When will the recession happen? <sep> never <sep> it won't happen
[Score: 3.831] When will the recession happen? <sep> probably next January <sep> upcoming January-March
[Score: 3.366] When will the recession happen? <sep> we're already in a recession <sep> it's going on right now
[Score: 2.302] When will the recession happen? <sep> never <sep> not before next year
[Score: 1.899] When will the recession happen? <sep> probably next January <sep> not before next year
[Score: 1.290] When will the recession happen? <sep> it won't happen <sep> not before next year
[Score: 1.230] When will the recession happen? <sep> we're already in a recession <sep> it won't happen
[Score: 1.187] When will the recession happen? <sep> not before next year <sep> upcoming January-March
[Score: 1.126] When will the recession happen? <sep> it won't happen <sep> it's going on right now
[Score: 1.108] When will the recession happen? <sep> never <sep> we're already in a recession
[Score: 1.099] When will the recession happen? <sep> we're already in a recession <sep> not before next year
[Score: 1.091] When will the recession happen? <sep> probably next January <sep> it's going on right now
[Score: 1.084] When will the recession happen? <sep> never <sep> it's going on right now
[Score: 1.048] When will the recession happen? <sep> probably next January <sep> we're already in a recession
[Score: 1.023] When will the recession happen? <sep> probably next January <sep> it won't happen
[Score: 1.017] When will the recession happen? <sep> probably next January <sep> never
[Score: 1.006] When will the recession happen? <sep> it's going on right now <sep> not before next year
[Score: 0.994] When will the recession happen? <sep> we're already in a recession <sep> upcoming January-March
[Score: 0.917] When will the recession happen? <sep> it's going on right now <sep> upcoming January-March
[Score: 0.903] When will the recession happen? <sep> it won't happen <sep> upcoming January-March
[Score: 0.896] When will the recession happen? <sep> never <sep> upcoming January-March
```

In the paper, we find that a threshold of `T=2.75` achieves the highest F1 score on the validation portions of the two datasets. In the example above, only the first three pairs (scores 4.980, 3.831, and 3.366) would be classified as equivalent answers; all remaining pairs would be labeled as non-equivalent.
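
As a minimal sketch of how the threshold could be applied to scored pairs, the snippet below labels each pair as equivalent or not. The helper name `label_pairs` is hypothetical, and treating a score exactly equal to the threshold as equivalent (`>=`) is an assumption, since the text above does not say how ties are handled. The sample entries are copied from the example output.

```python
# Threshold reported above as giving the best validation F1.
THRESHOLD = 2.75


def label_pairs(scored_pairs, threshold=THRESHOLD):
    """Return copies of the pairs with an added boolean `equivalent` field.

    Hypothetical helper; `>=` at the boundary is an assumption.
    """
    return [dict(d, equivalent=d["score"] >= threshold) for d in scored_pairs]


# A few scored pairs copied from the example output.
scored = [
    {"a1": "never", "a2": "it won't happen", "score": 4.980},
    {"a1": "probably next January", "a2": "upcoming January-March", "score": 3.831},
    {"a1": "we're already in a recession", "a2": "it's going on right now", "score": 3.366},
    {"a1": "never", "a2": "not before next year", "score": 2.302},
]

labeled = label_pairs(scored)
for d in labeled:
    tag = "equivalent" if d["equivalent"] else "non-equivalent"
    print("[%.3f] %s | %s -> %s" % (d["score"], d["a1"], d["a2"], tag))
```

With the full `dataset` list from the usage example, the same call would be `label_pairs(dataset)`, since each entry already carries a `score` field.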