Add evaluation results on the 3.0.0 config and test split of cnn_dailymail

Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋!\
Your model has been evaluated on the 3.0.0 config and test split of the [cnn_dailymail](https://huggingface.co/datasets/cnn_dailymail) dataset by @jayeeap, using the predictions stored [here](https://huggingface.co/datasets/autoevaluate/autoeval-eval-cnn_dailymail-3.0.0-73237a-43943145136).\
Accept this pull request to see the results displayed on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=cnn_dailymail).\
Evaluate your model on more datasets [here](https://huggingface.co/spaces/autoevaluate/model-evaluator?dataset=cnn_dailymail).

Files changed (1)

README.md +49 -10

@@ -6,12 +6,13 @@ datasets:
 - summarize_from_feedback
 metrics:
 - rouge
 model-index:
 - name: flan-t5-large-finetuned-openai-summarize_from_feedback
   results:
   - task:
-      name: Sequence-to-sequence Language Modeling
       type: text2text-generation
     dataset:
       name: summarize_from_feedback
       type: summarize_from_feedback
@@ -19,19 +20,57 @@ model-index:
       split: train
       args: comparisons
     metrics:
-    - name: Rouge1
-      type: rouge
       value: 30.2401
-    - name: Rouge2
-      type: rouge
       value: 11.4916
-    - name: RougeL
-      type: rouge
       value: 24.6485
-    - name: RougeLSum
-      type: rouge
       value: 26.1801
-pipeline_tag: summarization
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You

 - summarize_from_feedback
 metrics:
 - rouge
+pipeline_tag: summarization
 model-index:
 - name: flan-t5-large-finetuned-openai-summarize_from_feedback
   results:
   - task:
       type: text2text-generation
+      name: Sequence-to-sequence Language Modeling
     dataset:
       name: summarize_from_feedback
       type: summarize_from_feedback
       split: train
       args: comparisons
     metrics:
+    - type: rouge
       value: 30.2401
+      name: Rouge1
+    - type: rouge
       value: 11.4916
+      name: Rouge2
+    - type: rouge
       value: 24.6485
+      name: RougeL
+    - type: rouge
       value: 26.1801
+      name: RougeLSum
+  - task:
+      type: summarization
+      name: Summarization
+    dataset:
+      name: cnn_dailymail
+      type: cnn_dailymail
+      config: 3.0.0
+      split: test
+    metrics:
+    - type: rouge
+      value: 23.0407
+      name: ROUGE-1
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmViMTI3Mzg3ZDlkMDJlMjBiODU1NTIyNmVmOWFjMzIzZDVkNTc4NWI0MGIzYTJmYmUyMDM4ZTk2Y2Q5YzVjNSIsInZlcnNpb24iOjF9.oyqMHGGnGCE1f3JUNBg9c2ThycvlecuoZVWcGXvOcm0SbenpBobLEnczlFb4qx3ySwDUsL7uVtFW7F46Lz_CAg
+    - type: rouge
+      value: 8.5384
+      name: ROUGE-2
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOWZmOGYzODgyN2VjNjlmMWZiYzY3ZDIwNjAyZGExNDc3ZmZmMzIwMzk2NGVlNTZiMWIwZDUxM2IwNWU4OWQ5ZSIsInZlcnNpb24iOjF9.dCSVXAMFQQASe6fpPEOJu3Cfsd6Adm1L53xF0Job6W2Qd78M91wfl0715sUiFpsEKWKN9Z9bnhGA7d-SScVSAA
+    - type: rouge
+      value: 17.6719
+      name: ROUGE-L
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjkwOTA5YzNmYTc2YmRhNTFmYzc4MmE1ZjI5YmE1NDJiNjliNWRmNWU3MTg3MDgxY2RjZTZkMGZkZDY0MTIzNyIsInZlcnNpb24iOjF9.8-0VXD5ZGKIAGhjvuiBAchDZxyVWKczwiBxWDIQEItT3egSjYefGN8eOo9Z7R7sToX_li7IPeajVl3PbrQgPCg
+    - type: rouge
+      value: 20.9526
+      name: ROUGE-LSUM
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGNkYjdiMzE4MWQxZmIyMjYzM2Y0MDU3YmFlZDg1ZmE5NTBiZTVhM2NkYTRjMmU1ZWE5MmUwMmI2NmM5ODZkMyIsInZlcnNpb24iOjF9.qjFjlfIVHew80u5t_U44n9J6_PufNyv3faHaqML_pgo3VYBrbZWHX75jnBHThueWSK2hQhhmwaxSmR4ZYndRBA
+    - type: loss
+      value: 2.6858959197998047
+      name: loss
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNWEwMzJiODEzY2Y0OTE3NzAyMzM1MjAxYzJjN2Y2ZGU3MGQ0MjFmZDg5ZGFlNGQ1YjJlN2I5MTFhNDE3NjZkYyIsInZlcnNpb24iOjF9.cPkbHIU3UQYMF7gUZx9Iqu-265jgv7pcgedRdVEsvxq2gfgdlyFDROQK9KI2cfk4GbQogsXEca91NdKohWFtCA
+    - type: gen_len
+      value: 18.9249
+      name: gen_len
+      verified: true
+      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTM3M2M0YjIzMTNmNTg2OTVmNDJhZjEwODEzNTBkODk0N2E0ZTZjNjg4MWY1OTk1ZGMzZTRmNzVkN2Y2ZDE4NyIsInZlcnNpb24iOjF9.Qo6HhXLr-j-aPKRL3ZVdMJMwTCQAUhAPUcLlN-2lGSS9tAoxVJEYr0O8SttMWbBDw3owivQdduxVre9SGUKuBQ
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
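
Each verified metric added in this diff carries a `verifyToken` made of three dot-separated base64url segments, which is the shape of a JWT. The sketch below decodes only the header and payload of the ROUGE-1 token so the readable fields can be inspected. It does not check the signature, and what the `hash` value covers is not stated in this pull request, so treat any interpretation of it as an assumption. The helper names `decode_segment` and `inspect_verify_token` are hypothetical and not part of any Hugging Face library.

```python
import base64
import json


def decode_segment(segment: str) -> dict:
    """Decode one base64url-encoded JSON segment of a JWT-style token."""
    # base64url segments in JWTs omit padding; restore it before decoding.
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def inspect_verify_token(token: str) -> tuple[dict, dict]:
    """Return the decoded (header, payload); the signature is left untouched."""
    header_b64, payload_b64, _signature_b64 = token.split(".")
    return decode_segment(header_b64), decode_segment(payload_b64)


# verifyToken of the ROUGE-1 entry, copied from the diff above.
ROUGE1_TOKEN = ".".join([
    "eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9",
    "eyJoYXNoIjoiZmViMTI3Mzg3ZDlkMDJlMjBiODU1NTIyNmVmOWFjMzIzZDVkNTc4NWI0"
    "MGIzYTJmYmUyMDM4ZTk2Y2Q5YzVjNSIsInZlcnNpb24iOjF9",
    "oyqMHGGnGCE1f3JUNBg9c2ThycvlecuoZVWcGXvOcm0SbenpBobLEnczlFb4qx3ySwDUsL7u"
    "VtFW7F46Lz_CAg",
])

header, payload = inspect_verify_token(ROUGE1_TOKEN)
print(header)           # signing algorithm and token type
print(sorted(payload))  # names of the fields carried in the payload
```

Because the tokens are signed, editing a `value` by hand presumably invalidates its `verified: true` flag, so the metric entries are best left exactly as the evaluator wrote them.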