---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- summarize_from_feedback
metrics:
- rouge
pipeline_tag: summarization
model-index:
- name: flan-t5-large-finetuned-openai-summarize_from_feedback
  results:
  - task:
      type: text2text-generation
      name: Sequence-to-sequence Language Modeling
    dataset:
      name: summarize_from_feedback
      type: summarize_from_feedback
      config: comparisons
      split: train
      args: comparisons
    metrics:
    - type: rouge
      value: 30.2401
      name: Rouge1
    - type: rouge
      value: 11.4916
      name: Rouge2
    - type: rouge
      value: 24.6485
      name: RougeL
    - type: rouge
      value: 26.1801
      name: RougeLSum
  - task:
      type: summarization
      name: Summarization
    dataset:
      name: cnn_dailymail
      type: cnn_dailymail
      config: 3.0.0
      split: test
    metrics:
    - type: rouge
      value: 23.0407
      name: ROUGE-1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZmViMTI3Mzg3ZDlkMDJlMjBiODU1NTIyNmVmOWFjMzIzZDVkNTc4NWI0MGIzYTJmYmUyMDM4ZTk2Y2Q5YzVjNSIsInZlcnNpb24iOjF9.oyqMHGGnGCE1f3JUNBg9c2ThycvlecuoZVWcGXvOcm0SbenpBobLEnczlFb4qx3ySwDUsL7uVtFW7F46Lz_CAg
    - type: rouge
      value: 8.5384
      name: ROUGE-2
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOWZmOGYzODgyN2VjNjlmMWZiYzY3ZDIwNjAyZGExNDc3ZmZmMzIwMzk2NGVlNTZiMWIwZDUxM2IwNWU4OWQ5ZSIsInZlcnNpb24iOjF9.dCSVXAMFQQASe6fpPEOJu3Cfsd6Adm1L53xF0Job6W2Qd78M91wfl0715sUiFpsEKWKN9Z9bnhGA7d-SScVSAA
    - type: rouge
      value: 17.6719
      name: ROUGE-L
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjkwOTA5YzNmYTc2YmRhNTFmYzc4MmE1ZjI5YmE1NDJiNjliNWRmNWU3MTg3MDgxY2RjZTZkMGZkZDY0MTIzNyIsInZlcnNpb24iOjF9.8-0VXD5ZGKIAGhjvuiBAchDZxyVWKczwiBxWDIQEItT3egSjYefGN8eOo9Z7R7sToX_li7IPeajVl3PbrQgPCg
    - type: rouge
      value: 20.9526
      name: ROUGE-LSUM
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZGNkYjdiMzE4MWQxZmIyMjYzM2Y0MDU3YmFlZDg1ZmE5NTBiZTVhM2NkYTRjMmU1ZWE5MmUwMmI2NmM5ODZkMyIsInZlcnNpb24iOjF9.qjFjlfIVHew80u5t_U44n9J6_PufNyv3faHaqML_pgo3VYBrbZWHX75jnBHThueWSK2hQhhmwaxSmR4ZYndRBA
    - type: loss
      value: 2.6858959197998047
      name: loss
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNWEwMzJiODEzY2Y0OTE3NzAyMzM1MjAxYzJjN2Y2ZGU3MGQ0MjFmZDg5ZGFlNGQ1YjJlN2I5MTFhNDE3NjZkYyIsInZlcnNpb24iOjF9.cPkbHIU3UQYMF7gUZx9Iqu-265jgv7pcgedRdVEsvxq2gfgdlyFDROQK9KI2cfk4GbQogsXEca91NdKohWFtCA
    - type: gen_len
      value: 18.9249
      name: gen_len
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTM3M2M0YjIzMTNmNTg2OTVmNDJhZjEwODEzNTBkODk0N2E0ZTZjNjg4MWY1OTk1ZGMzZTRmNzVkN2Y2ZDE4NyIsInZlcnNpb24iOjF9.Qo6HhXLr-j-aPKRL3ZVdMJMwTCQAUhAPUcLlN-2lGSS9tAoxVJEYr0O8SttMWbBDw3owivQdduxVre9SGUKuBQ
---
# flan-t5-large-finetuned-openai-summarize_from_feedback
This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on the summarize_from_feedback dataset.
It achieves the following results on the evaluation set:
- Loss: 2.3118
- Rouge1: 30.2401
- Rouge2: 11.4916
- RougeL: 24.6485
- RougeLSum: 26.1801
- Gen Len: 18.8428
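The ROUGE-1 score above is the unigram-overlap F1, reported on a 0–100 scale. A minimal pure-Python sketch of that computation (the reported numbers come from the standard `rouge` metric implementation, which additionally applies tokenization and stemming that this sketch omits):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between two whitespace-tokenized strings, scaled to 0-100."""
    pred, ref = Counter(prediction.split()), Counter(reference.split())
    overlap = sum((pred & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(pred.values())
    recall = overlap / sum(ref.values())
    return 100.0 * 2.0 * precision * recall / (precision + recall)
```

ROUGE-2 and ROUGE-L follow the same F1 structure but count bigram overlaps and longest-common-subsequence length, respectively.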
## Model description
This model is [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) fine-tuned for abstractive summarization on OpenAI's `summarize_from_feedback` dataset, the human comparison data released with *Learning to Summarize from Human Feedback* (Stiennon et al., 2020). The base model's encoder-decoder architecture and tokenizer are unchanged; only the weights were updated during fine-tuning.
## Intended uses & limitations
The model is intended for abstractive summarization of English text, particularly short Reddit-style posts like those in the training data. Generated summaries are short (roughly 19 tokens on average at evaluation time). Like other abstractive summarizers, it can produce fluent summaries that are factually inconsistent with the source, and it has not been evaluated for factuality or bias beyond the ROUGE scores reported above.
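A minimal inference sketch using the `transformers` Auto classes is shown below; the generation settings (`max_new_tokens`, `max_length`) are illustrative, not the values used for evaluation:

```python
def summarize(post: str,
              ckpt: str = "mrm8488/flan-t5-large-finetuned-openai-summarize_from_feedback") -> str:
    """Generate a short summary of `post` with the fine-tuned checkpoint."""
    # Imports are kept inside the helper so it only needs transformers/torch when called.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(ckpt)
    model = AutoModelForSeq2SeqLM.from_pretrained(ckpt)
    inputs = tokenizer(post, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# First call downloads the model weights (~3 GB):
# print(summarize("SUBREDDIT: r/relationships\nPOST: ..."))
```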
## Training and evaluation data
The model was fine-tuned on the `comparisons` configuration of the `summarize_from_feedback` dataset (train split), with the ROUGE scores above computed on a held-out evaluation set. Scores on the `cnn_dailymail` test set, on which the model was not trained, are also reported in the model index.
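Each record in the `comparisons` configuration pairs a post with two candidate summaries and a human preference (`choice`). The exact preprocessing used for this run is not documented here; one plausible mapping to seq2seq pairs is sketched below (the field names follow the dataset's schema, but taking the preferred summary as the target and the `"summarize: "` prefix are assumptions):

```python
def to_seq2seq_example(record: dict) -> tuple:
    """Map one `comparisons` record to an (input, target) pair for seq2seq training."""
    source = "summarize: " + record["info"]["post"]
    # Use the human-preferred candidate as the supervision target (assumption).
    target = record["summaries"][record["choice"]]["text"]
    return source, target

# Toy record mirroring the dataset's structure:
example = {
    "info": {"post": "My roommate keeps eating my leftovers ..."},
    "summaries": [{"text": "Roommate eats my food."},
                  {"text": "Roommate keeps taking my leftovers; what should I do?"}],
    "choice": 1,
}
```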
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 6
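With `lr_scheduler_type: linear`, the learning rate decays linearly from 5e-05 toward 0 over the training run. A sketch of that schedule (assuming zero warmup steps, since none are listed above):

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero."""
    return base_lr * max(0.0, 1.0 - step / total_steps)
```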
### Training results
See [Tensorboard](https://huggingface.co/mrm8488/flan-t5-large-finetuned-openai-summarize_from_feedback/tensorboard)
### Citation
```
@misc {manuel_romero_2023,
author = { {Manuel Romero} },
title = { flan-t5-large-finetuned-openai-summarize_from_feedback (Revision 51666f9) },
year = 2023,
url = { https://huggingface.co/mrm8488/flan-t5-large-finetuned-openai-summarize_from_feedback },
doi = { 10.57967/hf/0266 },
publisher = { Hugging Face }
}
```
### Framework versions
- Transformers 4.25.1
- Pytorch 1.13.0+cu116
- Datasets 2.8.0
- Tokenizers 0.13.2