---
license: apache-2.0
datasets:
- ambrosfitz/cnn-daily-grammar
language:
- en
base_model:
- google-t5/t5-base
pipeline_tag: summarization
---
|
# T5-CNN-Grammar-Enhanced |
|
|
|
## Model Description |
|
A T5-base model fine-tuned on the CNN Daily Grammar dataset for enhanced summarization with grammatical structure awareness. |
|
|
|
## Usage |
|
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("username/t5-cnn-grammar-enhanced")
model = AutoModelForSeq2SeqLM.from_pretrained("username/t5-cnn-grammar-enhanced")
```
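
Once loaded, the model can be used for summarization through the standard T5 `summarize: ` task prefix. The helper below is a minimal sketch; the generation parameters (`num_beams`, `max_new_tokens`) are illustrative defaults, not values from this model's training:

```python
def summarize(model, tokenizer, article, max_new_tokens=128):
    # T5 checkpoints expect a task prefix; "summarize: " is the
    # conventional prefix for summarization.
    inputs = tokenizer(
        "summarize: " + article,
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=4,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```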
|
|
|
## Training Details |
|
- Base model: t5-base |
|
- Dataset: CNN Daily Grammar |
|
- Training type: Fine-tuning |
|
- Framework: PyTorch |
|
- Epochs: 10 |
|
- Batch size: 8 |
|
- Learning rate: 2e-5 |
|
- Loss: Focal Loss |
|
- Scheduler: Linear warmup |
|
- Best validation loss: 0.7759 |
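
The card lists focal loss but not its exact formulation. A common token-level variant for seq2seq training, sketched here as an assumption rather than the training code actually used, scales per-token cross-entropy by `(1 - p_t)^gamma` so that confidently-predicted tokens contribute less and training focuses on harder tokens:

```python
import torch
import torch.nn.functional as F

def focal_token_loss(logits, labels, gamma=2.0, ignore_index=-100):
    """Focal loss over decoder token logits.

    logits: (batch, seq_len, vocab_size); labels: (batch, seq_len).
    Down-weighting easy tokens is one rationale for using focal loss
    to improve retention of rare, detail-bearing tokens.
    """
    logits = logits.view(-1, logits.size(-1))
    labels = labels.view(-1)
    ce = F.cross_entropy(logits, labels, reduction="none",
                         ignore_index=ignore_index)
    pt = torch.exp(-ce)                  # model probability of the true token
    focal = (1.0 - pt) ** gamma * ce     # (1 - p_t)^gamma scaling
    mask = labels != ignore_index
    return focal[mask].mean()
```

With `gamma=0` this reduces to plain cross-entropy, which makes the scaling easy to sanity-check.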
|
|
|
## Model Architecture |
|
- Encoder-decoder transformer |
|
- Grammar-enhanced input structure |
|
- Focal loss for detail retention |
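
The card does not specify how grammatical structure is encoded in the input; one common pattern is to prepend annotations to the source text alongside the task prefix. The helper below is purely hypothetical, for illustration only; the actual format used by the cnn-daily-grammar dataset may differ:

```python
def build_model_input(article, pos_tags=None):
    # Hypothetical preprocessing sketch: prepend the T5 task prefix,
    # optionally with an inline grammar annotation block.
    prefix = "summarize: "
    if pos_tags:
        prefix += "[grammar] " + " ".join(pos_tags) + " [/grammar] "
    return prefix + article
```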
|
|
|
## Evaluation Results |
|
Final validation metrics: |
|
- Loss: 0.7759 |
|
- Observed strengths (qualitative): detail retention and factual accuracy
|
|
|
## Limitations |
|
- Limited to news article summarization |
|
- May omit specific numerical details |
|
- Best suited for formal news content |
|
|
|
## License |
|
Apache 2.0 |