Commit e37802e
1 Parent(s): 06a006a
Add evaluation results on the 3.0.0 config and test split of cnn_dailymail
Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋!\
Your model has been evaluated on the 3.0.0 config and test split of the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset by @jayeeap, using the predictions stored [here](https://huggingface.co/datasets/autoevaluate/autoeval-eval-cnn_dailymail-3.0.0-73237a-43943145136).\
Accept this pull request to see the results displayed on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=cnn_dailymail).\
Evaluate your model on more datasets [here](https://huggingface.co/spaces/autoevaluate/model-evaluator?dataset=cnn_dailymail).
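The README.md diff in this commit records ROUGE scores for two datasets: the existing summarize_from_feedback entry (train split, `comparisons` args) and the new cnn_dailymail entry (3.0.0 config, test split). As a rough illustration of how the two sets of numbers relate, the sketch below copies the values from the diff and computes the per-metric gap. The dictionary layout, the normalized metric keys, and the `score_gap` helper are assumptions made for this example, not part of the Hub or of the evaluator. Because the two entries use different datasets and splits, the gap is not a like-for-like comparison.

```python
# ROUGE values copied from the README.md diff in this commit.
# The layout and the lowercase metric keys are illustrative assumptions.
SCORES = {
    "summarize_from_feedback": {
        "rouge1": 30.2401,
        "rouge2": 11.4916,
        "rougeL": 24.6485,
        "rougeLsum": 26.1801,
    },
    "cnn_dailymail": {
        "rouge1": 23.0407,
        "rouge2": 8.5384,
        "rougeL": 17.6719,
        "rougeLsum": 20.9526,
    },
}


def score_gap(reference: str, other: str) -> dict:
    """Return `reference` minus `other` for each shared metric, rounded to 4 places."""
    ref, oth = SCORES[reference], SCORES[other]
    return {name: round(ref[name] - oth[name], 4) for name in ref if name in oth}


if __name__ == "__main__":
    gaps = score_gap("summarize_from_feedback", "cnn_dailymail")
    for metric, gap in gaps.items():
        print(f"{metric}: {gap:+.4f}")
```

Every metric is lower on cnn_dailymail than on summarize_from_feedback, with ROUGE-1 showing the largest difference.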
README.md
CHANGED
@@ -6,12 +6,13 @@ datasets:
 6 |   - summarize_from_feedback
 7 |   metrics:
 8 |   - rouge
 9 |   model-index:
10 |   - name: flan-t5-large-finetuned-openai-summarize_from_feedback
11 |     results:
12 |     - task:
13 | -       name: Sequence-to-sequence Language Modeling
14 |         type: text2text-generation
15 |       dataset:
16 |         name: summarize_from_feedback
17 |         type: summarize_from_feedback
@@ -19,19 +20,57 @@ model-index:
19 |         split: train
20 |         args: comparisons
21 |       metrics:
22 | -     -
23 | -       type: rouge
24 |         value: 30.2401
25 | -
26 | -
27 |         value: 11.4916
28 | -
29 | -
30 |         value: 24.6485
31 | -
32 | -
33 |         value: 26.1801
34 | -
35 |   ---
36 |
37 |   <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 6 |   - summarize_from_feedback
 7 |   metrics:
 8 |   - rouge
 9 | + pipeline_tag: summarization
10 |   model-index:
11 |   - name: flan-t5-large-finetuned-openai-summarize_from_feedback
12 |     results:
13 |     - task:
14 |         type: text2text-generation
15 | +       name: Sequence-to-sequence Language Modeling
16 |       dataset:
17 |         name: summarize_from_feedback
18 |         type: summarize_from_feedback

20 |         split: train
21 |         args: comparisons
22 |       metrics:
23 | +     - type: rouge
24 |         value: 30.2401
25 | +       name: Rouge1
26 | +     - type: rouge
27 |         value: 11.4916
28 | +       name: Rouge2
29 | +     - type: rouge
30 |         value: 24.6485
31 | +       name: RougeL
32 | +     - type: rouge
33 |         value: 26.1801
34 | +       name: RougeLSum
35 | +   - task:
36 | +       type: summarization
37 | +       name: Summarization
38 | +     dataset:
39 | +       name: cnn_dailymail
40 | +       type: cnn_dailymail
41 | +       config: 3.0.0
42 | +       split: test
43 | +     metrics:
44 | +     - type: rouge
45 | +       value: 23.0407
46 | +       name: ROUGE-1
47 | +       verified: true
48 | +       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmViMTI3Mzg3ZDlkMDJlMjBiODU1NTIyNmVmOWFjMzIzZDVkNTc4NWI0MGIzYTJmYmUyMDM4ZTk2Y2Q5YzVjNSIsInZlcnNpb24iOjF9.oyqMHGGnGCE1f3JUNBg9c2ThycvlecuoZVWcGXvOcm0SbenpBobLEnczlFb4qx3ySwDUsL7uVtFW7F46Lz_CAg
49 | +     - type: rouge
50 | +       value: 8.5384
51 | +       name: ROUGE-2
52 | +       verified: true
53 | +       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOWZmOGYzODgyN2VjNjlmMWZiYzY3ZDIwNjAyZGExNDc3ZmZmMzIwMzk2NGVlNTZiMWIwZDUxM2IwNWU4OWQ5ZSIsInZlcnNpb24iOjF9.dCSVXAMFQQASe6fpPEOJu3Cfsd6Adm1L53xF0Job6W2Qd78M91wfl0715sUiFpsEKWKN9Z9bnhGA7d-SScVSAA
54 | +     - type: rouge
55 | +       value: 17.6719
56 | +       name: ROUGE-L
57 | +       verified: true
58 | +       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjkwOTA5YzNmYTc2YmRhNTFmYzc4MmE1ZjI5YmE1NDJiNjliNWRmNWU3MTg3MDgxY2RjZTZkMGZkZDY0MTIzNyIsInZlcnNpb24iOjF9.8-0VXD5ZGKIAGhjvuiBAchDZxyVWKczwiBxWDIQEItT3egSjYefGN8eOo9Z7R7sToX_li7IPeajVl3PbrQgPCg
59 | +     - type: rouge
60 | +       value: 20.9526
61 | +       name: ROUGE-LSUM
62 | +       verified: true
63 | +       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGNkYjdiMzE4MWQxZmIyMjYzM2Y0MDU3YmFlZDg1ZmE5NTBiZTVhM2NkYTRjMmU1ZWE5MmUwMmI2NmM5ODZkMyIsInZlcnNpb24iOjF9.qjFjlfIVHew80u5t_U44n9J6_PufNyv3faHaqML_pgo3VYBrbZWHX75jnBHThueWSK2hQhhmwaxSmR4ZYndRBA
64 | +     - type: loss
65 | +       value: 2.6858959197998047
66 | +       name: loss
67 | +       verified: true
68 | +       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNWEwMzJiODEzY2Y0OTE3NzAyMzM1MjAxYzJjN2Y2ZGU3MGQ0MjFmZDg5ZGFlNGQ1YjJlN2I5MTFhNDE3NjZkYyIsInZlcnNpb24iOjF9.cPkbHIU3UQYMF7gUZx9Iqu-265jgv7pcgedRdVEsvxq2gfgdlyFDROQK9KI2cfk4GbQogsXEca91NdKohWFtCA
69 | +     - type: gen_len
70 | +       value: 18.9249
71 | +       name: gen_len
72 | +       verified: true
73 | +       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTM3M2M0YjIzMTNmNTg2OTVmNDJhZjEwODEzNTBkODk0N2E0ZTZjNjg4MWY1OTk1ZGMzZTRmNzVkN2Y2ZDE4NyIsInZlcnNpb24iOjF9.Qo6HhXLr-j-aPKRL3ZVdMJMwTCQAUhAPUcLlN-2lGSS9tAoxVJEYr0O8SttMWbBDw3owivQdduxVre9SGUKuBQ
74 |   ---
75 |
76 |   <!-- This model card has been generated automatically according to the information the Trainer had access to. You
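Each `verifyToken` value added in this diff has the three dot-separated base64url segments of a JWT-style token. The sketch below decodes the header and payload of the ROUGE-1 token using only the Python standard library, to show what fields it carries. It does not verify the signature, and the helper names `b64url_decode` and `inspect_token` are made up for this example. What the `hash` field is computed over is not stated in this commit, so the sketch treats it as an opaque string.

```python
import base64
import json

# verifyToken of the ROUGE-1 entry, copied from the README.md diff.
TOKEN = (
    "eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9."
    "eyJoYXNoIjoiZmViMTI3Mzg3ZDlkMDJlMjBiODU1NTIyNmVmOWFjMzIzZDVkNTc4NWI0MGIz"
    "YTJmYmUyMDM4ZTk2Y2Q5YzVjNSIsInZlcnNpb24iOjF9."
    "oyqMHGGnGCE1f3JUNBg9c2ThycvlecuoZVWcGXvOcm0SbenpBobLEnczlFb4qx3ySwDUsL7u"
    "VtFW7F46Lz_CAg"
)


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring any stripped '=' padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def inspect_token(token: str):
    """Return the decoded (header, payload) dicts of a three-part token.

    The signature segment is ignored: this only inspects the contents and
    says nothing about whether the token is authentic.
    """
    header_b64, payload_b64, _signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    return header, payload


if __name__ == "__main__":
    header, payload = inspect_token(TOKEN)
    print("header:", header)
    print("payload keys:", sorted(payload))
```

The decoded header names EdDSA as the signing algorithm, and the payload holds a `hash` string and a `version` number. Checking the signature would require the signer's public key, which is not part of this commit.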