license: apache-2.0
datasets:
  - ambrosfitz/cnn-daily-grammar
language:
  - en
base_model:
  - google-t5/t5-base
pipeline_tag: summarization

T5-CNN-Grammar-Enhanced

Model Description

A T5-base model fine-tuned on the CNN Daily Grammar dataset (ambrosfitz/cnn-daily-grammar) to summarize news articles from grammar-enhanced inputs.

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# "username" is a placeholder for the account that hosts the model
tokenizer = AutoTokenizer.from_pretrained("username/t5-cnn-grammar-enhanced")
model = AutoModelForSeq2SeqLM.from_pretrained("username/t5-cnn-grammar-enhanced")

Training Details

  • Base model: t5-base
  • Dataset: CNN Daily Grammar
  • Training type: Fine-tuning
  • Framework: PyTorch
  • Epochs: 10
  • Batch size: 8
  • Learning rate: 2e-5
  • Loss: Focal Loss
  • Scheduler: Linear warmup
  • Best validation loss: 0.7759
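
To illustrate the scheduler entry above, here is a minimal sketch of a linear warmup schedule in plain Python. Only the peak learning rate (2e-5) comes from this card. The warmup length, the total step count, and the linear decay after warmup are assumptions made for illustration, and the function name `linear_warmup_lr` is hypothetical.

```python
def linear_warmup_lr(step, peak_lr=2e-5, warmup_steps=500, total_steps=10_000):
    """Learning rate at a given optimizer step.

    peak_lr matches the learning rate listed in this card; warmup_steps,
    total_steps, and the linear decay phase are illustrative assumptions.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to peak_lr
        return peak_lr * step / warmup_steps
    # Assumed decay: ramp linearly from peak_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / (total_steps - warmup_steps)


# Inspect the schedule at a few points
for s in (0, 250, 500, 5_000, 10_000):
    print(s, linear_warmup_lr(s))
```

In a PyTorch training loop, a schedule of this shape would typically be attached to the optimizer rather than computed by hand.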

Model Architecture

  • Encoder-decoder transformer
  • Grammar-enhanced input structure
  • Focal loss for detail retention
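
For readers unfamiliar with focal loss, the sketch below shows the standard formulation, -(1 - p)^gamma * log(p), applied to the probability the model assigns to each reference token. It is written in plain Python for clarity and is not the training code used for this model. The focusing parameter `gamma=2.0` is an assumed value, since the card does not state it, and the function names are hypothetical.

```python
import math


def focal_loss(token_probs, gamma=2.0):
    """Mean focal loss over the probabilities of the reference tokens.

    gamma is an assumed value; the card does not report the one used.
    """
    losses = []
    for p in token_probs:
        # Clamp to avoid log(0)
        p = min(max(p, 1e-12), 1.0)
        # Down-weight tokens the model already predicts confidently
        losses.append(-((1.0 - p) ** gamma) * math.log(p))
    return sum(losses) / len(losses)


def cross_entropy(token_probs):
    """Ordinary cross-entropy is the gamma = 0 special case."""
    return focal_loss(token_probs, gamma=0.0)


easy, hard = [0.9], [0.1]
print("cross-entropy:", cross_entropy(easy), cross_entropy(hard))
print("focal loss:   ", focal_loss(easy), focal_loss(hard))
```

Compared with cross-entropy, the loss on well-predicted tokens shrinks much more than the loss on poorly predicted ones, so training concentrates on the harder tokens.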

Evaluation Results

Final validation metrics:

  • Loss: 0.7759
  • Qualitatively strong detail retention and factual accuracy (no ROUGE or other summary-quality scores are reported)

Limitations

  • Limited to news article summarization
  • May omit specific numerical details
  • Best suited for formal news content

License

Apache 2.0