autoevaluator HF staff committed on
Commit
e37802e
1 Parent(s): 06a006a

Add evaluation results on the 3.0.0 config and test split of cnn_dailymail


Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋!\
Your model has been evaluated on the 3.0.0 config and test split of the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset by @jayeeap, using the predictions stored [here](https://huggingface.co/datasets/autoevaluate/autoeval-eval-cnn_dailymail-3.0.0-73237a-43943145136).\
Accept this pull request to see the results displayed on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=cnn_dailymail).\
Evaluate your model on more datasets [here](https://huggingface.co/spaces/autoevaluate/model-evaluator?dataset=cnn_dailymail).
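
For readers who want to compare the two result sets this pull request leaves in the `model-index` metadata, the following is a minimal sketch in Python. It assumes the metadata has already been loaded into plain lists and dicts; the `MODEL_INDEX` constant and the `metrics_for` helper are hypothetical names introduced for illustration, and the `verifyToken` fields are omitted for brevity. The metric values are the ones shown in the diff.

```python
# Illustrative copy of the `model-index` metadata after this PR, as plain Python data.
# Assumption: only the keys needed for the lookup are kept; verifyToken is omitted.
MODEL_INDEX = [
    {
        "name": "flan-t5-large-finetuned-openai-summarize_from_feedback",
        "results": [
            {
                "task": {"type": "text2text-generation"},
                "dataset": {"type": "summarize_from_feedback", "split": "train"},
                "metrics": [
                    {"type": "rouge", "name": "Rouge1", "value": 30.2401},
                    {"type": "rouge", "name": "Rouge2", "value": 11.4916},
                    {"type": "rouge", "name": "RougeL", "value": 24.6485},
                    {"type": "rouge", "name": "RougeLSum", "value": 26.1801},
                ],
            },
            {
                "task": {"type": "summarization"},
                "dataset": {"type": "cnn_dailymail", "config": "3.0.0", "split": "test"},
                "metrics": [
                    {"type": "rouge", "name": "ROUGE-1", "value": 23.0407},
                    {"type": "rouge", "name": "ROUGE-2", "value": 8.5384},
                    {"type": "rouge", "name": "ROUGE-L", "value": 17.6719},
                    {"type": "rouge", "name": "ROUGE-LSUM", "value": 20.9526},
                    {"type": "loss", "name": "loss", "value": 2.6858959197998047},
                    {"type": "gen_len", "name": "gen_len", "value": 18.9249},
                ],
            },
        ],
    }
]


def metrics_for(model_index, dataset_type, split):
    """Return {metric name: value} for the first result matching dataset and split."""
    for model in model_index:
        for result in model["results"]:
            dataset = result["dataset"]
            if dataset["type"] == dataset_type and dataset["split"] == split:
                return {m["name"]: m["value"] for m in result["metrics"]}
    return {}  # no matching result entry


if __name__ == "__main__":
    # Print the verified cnn_dailymail results added by this PR.
    for name, value in metrics_for(MODEL_INDEX, "cnn_dailymail", "test").items():
        print(f"{name}: {value}")
```

The two result entries use different metric names (`Rouge1` versus `ROUGE-1`), so a lookup keyed on dataset and split, rather than on metric name alone, keeps them apart.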

Files changed (1)
  1. README.md +49 -10
README.md CHANGED
@@ -6,12 +6,13 @@ datasets:
 - summarize_from_feedback
 metrics:
 - rouge
+pipeline_tag: summarization
 model-index:
 - name: flan-t5-large-finetuned-openai-summarize_from_feedback
   results:
   - task:
-      name: Sequence-to-sequence Language Modeling
       type: text2text-generation
+      name: Sequence-to-sequence Language Modeling
     dataset:
       name: summarize_from_feedback
       type: summarize_from_feedback
@@ -19,19 +20,57 @@ model-index:
       split: train
       args: comparisons
     metrics:
-    - name: Rouge1
-      type: rouge
+    - type: rouge
       value: 30.2401
-    - name: Rouge2
-      type: rouge
+      name: Rouge1
+    - type: rouge
       value: 11.4916
-    - name: RougeL
-      type: rouge
+      name: Rouge2
+    - type: rouge
       value: 24.6485
-    - name: RougeLSum
-      type: rouge
+      name: RougeL
+    - type: rouge
       value: 26.1801
-pipeline_tag: summarization
+      name: RougeLSum
+  - task:
+      type: summarization
+      name: Summarization
+    dataset:
+      name: cnn_dailymail
+      type: cnn_dailymail
+      config: 3.0.0
+      split: test
+    metrics:
+    - type: rouge
+      value: 23.0407
+      name: ROUGE-1
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmViMTI3Mzg3ZDlkMDJlMjBiODU1NTIyNmVmOWFjMzIzZDVkNTc4NWI0MGIzYTJmYmUyMDM4ZTk2Y2Q5YzVjNSIsInZlcnNpb24iOjF9.oyqMHGGnGCE1f3JUNBg9c2ThycvlecuoZVWcGXvOcm0SbenpBobLEnczlFb4qx3ySwDUsL7uVtFW7F46Lz_CAg
+    - type: rouge
+      value: 8.5384
+      name: ROUGE-2
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOWZmOGYzODgyN2VjNjlmMWZiYzY3ZDIwNjAyZGExNDc3ZmZmMzIwMzk2NGVlNTZiMWIwZDUxM2IwNWU4OWQ5ZSIsInZlcnNpb24iOjF9.dCSVXAMFQQASe6fpPEOJu3Cfsd6Adm1L53xF0Job6W2Qd78M91wfl0715sUiFpsEKWKN9Z9bnhGA7d-SScVSAA
+    - type: rouge
+      value: 17.6719
+      name: ROUGE-L
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjkwOTA5YzNmYTc2YmRhNTFmYzc4MmE1ZjI5YmE1NDJiNjliNWRmNWU3MTg3MDgxY2RjZTZkMGZkZDY0MTIzNyIsInZlcnNpb24iOjF9.8-0VXD5ZGKIAGhjvuiBAchDZxyVWKczwiBxWDIQEItT3egSjYefGN8eOo9Z7R7sToX_li7IPeajVl3PbrQgPCg
+    - type: rouge
+      value: 20.9526
+      name: ROUGE-LSUM
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGNkYjdiMzE4MWQxZmIyMjYzM2Y0MDU3YmFlZDg1ZmE5NTBiZTVhM2NkYTRjMmU1ZWE5MmUwMmI2NmM5ODZkMyIsInZlcnNpb24iOjF9.qjFjlfIVHew80u5t_U44n9J6_PufNyv3faHaqML_pgo3VYBrbZWHX75jnBHThueWSK2hQhhmwaxSmR4ZYndRBA
+    - type: loss
+      value: 2.6858959197998047
+      name: loss
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNWEwMzJiODEzY2Y0OTE3NzAyMzM1MjAxYzJjN2Y2ZGU3MGQ0MjFmZDg5ZGFlNGQ1YjJlN2I5MTFhNDE3NjZkYyIsInZlcnNpb24iOjF9.cPkbHIU3UQYMF7gUZx9Iqu-265jgv7pcgedRdVEsvxq2gfgdlyFDROQK9KI2cfk4GbQogsXEca91NdKohWFtCA
+    - type: gen_len
+      value: 18.9249
+      name: gen_len
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTM3M2M0YjIzMTNmNTg2OTVmNDJhZjEwODEzNTBkODk0N2E0ZTZjNjg4MWY1OTk1ZGMzZTRmNzVkN2Y2ZDE4NyIsInZlcnNpb24iOjF9.Qo6HhXLr-j-aPKRL3ZVdMJMwTCQAUhAPUcLlN-2lGSS9tAoxVJEYr0O8SttMWbBDw3owivQdduxVre9SGUKuBQ
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You