Update Space (evaluate main: e51c679b)
- README.md +13 -13
- exact_match.py +12 -13
README.md
CHANGED
@@ -29,7 +29,7 @@ The exact match score of a set of predictions is the sum of all of the individua
 ## How to Use
 At minimum, this metric takes as input predictions and references:
 ```python
->>> from
+>>> from evaluate import load
 >>> exact_match_metric = load("exact_match")
 >>> results = exact_match_metric.compute(predictions=predictions, references=references)
 ```
@@ -47,10 +47,10 @@ At minimum, this metric takes as input predictions and references:
 This metric outputs a dictionary with one value: the average exact match score.
 
 ```python
-{'exact_match':
+{'exact_match': 1.0}
 ```
 
-This metric's range is 0-
+This metric's range is 0-1, inclusive. Here, 0.0 means no prediction/reference pairs were matches, while 1.0 means they all were.
 
 #### Values from Popular Papers
 The exact match metric is often included in other metrics, such as SQuAD. For example, the [original SQuAD paper](https://nlp.stanford.edu/pubs/rajpurkar2016squad.pdf) reported an Exact Match score of 40.0%. They also report that the human performance Exact Match score on the dataset was 80.3%.
@@ -62,8 +62,8 @@ Without including any regexes to ignore:
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds)
->>> print(round(results["exact_match"],
-25
+>>> print(round(results["exact_match"], 2))
+0.25
 ```
 
 Ignoring regexes "the" and "yell", as well as ignoring case and punctuation:
@@ -72,8 +72,8 @@ Ignoring regexes "the" and "yell", as well as ignoring case and punctuation:
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell"], ignore_case=True, ignore_punctuation=True)
->>> print(round(results["exact_match"],
-
+>>> print(round(results["exact_match"], 2))
+0.5
 ```
 Note that in the example above, because the regexes are ignored before the case is normalized, "yell" from "YELLING" is not deleted.
 
@@ -83,8 +83,8 @@ Ignoring "the", "yell", and "YELL", as well as ignoring case and punctuation:
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True)
->>> print(round(results["exact_match"],
-75
+>>> print(round(results["exact_match"], 2))
+0.75
 ```
 
 Ignoring "the", "yell", and "YELL", as well as ignoring case, punctuation, and numbers:
@@ -93,8 +93,8 @@ Ignoring "the", "yell", and "YELL", as well as ignoring case, punctuation, and n
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True, ignore_numbers=True)
->>> print(round(results["exact_match"],
-
+>>> print(round(results["exact_match"], 2))
+1.0
 ```
 
 An example that includes sentences:
@@ -103,8 +103,8 @@ An example that includes sentences:
 >>> refs = ["The cat sat on the mat.", "Theaters are great.", "It's like comparing oranges and apples."]
 >>> preds = ["The cat sat on the mat?", "Theaters are great.", "It's like comparing apples and oranges."]
 >>> results = exact_match.compute(references=refs, predictions=preds)
->>> print(round(results["exact_match"],
-33
+>>> print(round(results["exact_match"], 2))
+0.33
 ```
 
 
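The README note in this diff says regexes are removed before case is normalized, which is why "yell" leaves "YELLING" untouched. A minimal sketch of that ordering follows. It is an illustration only, not the module's code: `normalize` is a hypothetical helper, and only the regex-then-lowercase order is taken from the README.

```python
import re


def normalize(text, regexes_to_ignore=(), ignore_case=False):
    """Hypothetical helper: strip patterns first, then lowercase."""
    # Step 1: remove every ignored pattern (matching is case-sensitive here).
    for pattern in regexes_to_ignore:
        text = re.sub(pattern, "", text)
    # Step 2: only now fold case, so "yell" never sees "YELL".
    if ignore_case:
        text = text.lower()
    return text


# "YELLING" survives the "yell" pattern, while "yelling" loses its prefix.
print(normalize("YELLING", ["the ", "yell"], ignore_case=True))
print(normalize("yelling", ["the ", "yell"], ignore_case=True))
# Adding "YELL" makes both sides normalize to the same string.
print(normalize("YELLING", ["the ", "yell", "YELL"], ignore_case=True))
```

Under this ordering the reference and prediction differ in the first case and agree in the second, consistent with the README scores moving from 0.5 to 0.75 once "YELL" is added.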
exact_match.py
CHANGED
@@ -40,44 +40,43 @@ Args:
 ignore_numbers: Boolean, defaults to False. If true, removes all digits before
 comparing predictions and references.
 Returns:
-exact_match: Dictionary containing exact_match rate. Possible values are between 0.0 and
+exact_match: Dictionary containing exact_match rate. Possible values are between 0.0 and 1.0, inclusive.
 Examples:
 >>> exact_match = evaluate.load("exact_match")
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds)
->>> print(round(results["exact_match"],
-25
+>>> print(round(results["exact_match"], 2))
+0.25
 
 >>> exact_match = evaluate.load("exact_match")
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell"], ignore_case=True, ignore_punctuation=True)
->>> print(round(results["exact_match"],
-
+>>> print(round(results["exact_match"], 2))
+0.5
 
 
 >>> exact_match = evaluate.load("exact_match")
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True)
->>> print(round(results["exact_match"],
-75
+>>> print(round(results["exact_match"], 2))
+0.75
 
 >>> exact_match = evaluate.load("exact_match")
 >>> refs = ["the cat", "theater", "YELLING", "agent007"]
 >>> preds = ["cat?", "theater", "yelling", "agent"]
 >>> results = exact_match.compute(references=refs, predictions=preds, regexes_to_ignore=["the ", "yell", "YELL"], ignore_case=True, ignore_punctuation=True, ignore_numbers=True)
->>> print(round(results["exact_match"],
-
+>>> print(round(results["exact_match"], 2))
+1.0
 
 >>> exact_match = evaluate.load("exact_match")
 >>> refs = ["The cat sat on the mat.", "Theaters are great.", "It's like comparing oranges and apples."]
 >>> preds = ["The cat sat on the mat?", "Theaters are great.", "It's like comparing apples and oranges."]
 >>> results = exact_match.compute(references=refs, predictions=preds)
->>> print(round(results["exact_match"],
-33
-
+>>> print(round(results["exact_match"], 2))
+0.33
 """
 
 _CITATION = """
@@ -134,4 +133,4 @@ class ExactMatch(evaluate.EvaluationModule):
 
 score_list = predictions == references
 
-return {"exact_match": np.mean(score_list)
+return {"exact_match": np.mean(score_list)}
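The last hunk returns the mean of an elementwise comparison, which is what puts the score in the 0.0 to 1.0 range. A small standalone sketch of that step is below. It assumes `predictions` and `references` are equal-length NumPy string arrays at that point, which the diff does not show, and the sample data is borrowed from the docstring examples.

```python
import numpy as np

# Assumed inputs: equal-length NumPy arrays of strings.
predictions = np.asarray(["cat?", "theater", "yelling", "agent"])
references = np.asarray(["the cat", "theater", "YELLING", "agent007"])

# Elementwise equality yields a boolean array, one entry per pair.
score_list = predictions == references

# Averaging booleans treats True as 1 and False as 0, giving the match rate.
score = float(np.mean(score_list))
print({"exact_match": score})  # {'exact_match': 0.25}
```

Only "theater" matches exactly, so one pair in four counts, in line with the 0.25 shown in the first docstring example.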