T5-CNN-Grammar-Enhanced

Model Description

A T5-base model fine-tuned on the CNN Daily Grammar dataset for news summarization, using inputs augmented with grammatical structure.

Usage

# AutoModelForSeq2SeqLM is the transformers auto class for encoder-decoder models such as T5
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("ambrosfitz/t5-cnn-grammar-enhanced")
model = AutoModelForSeq2SeqLM.from_pretrained("ambrosfitz/t5-cnn-grammar-enhanced")
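
Once the tokenizer and model load, a summary could be generated roughly as sketched below. Several things here are assumptions, not facts from this card: the `summarize: ` task prefix is the stock T5 convention, and the card does not say whether the grammar-enhanced input format expects something different; the generation settings (`num_beams=4`, `max_new_tokens=128`, 512-token truncation) are illustrative; and `build_input` and `summarize` are hypothetical helper names.

```python
def build_input(article: str, prefix: str = "summarize: ") -> str:
    """Prepend a task prefix and collapse runs of whitespace.

    ASSUMPTION: "summarize: " is the standard T5 prefix. The card does not
    document the grammar-enhanced input format, so this may need adjusting.
    """
    return prefix + " ".join(article.split())


def summarize(article: str,
              repo_id: str = "ambrosfitz/t5-cnn-grammar-enhanced",
              max_new_tokens: int = 128) -> str:
    """Load the model and return a generated summary for one article."""
    # Imported lazily so build_input works without transformers installed.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)

    # Truncate long articles to 512 tokens (an assumed limit, not from the card).
    inputs = tokenizer(build_input(article), return_tensors="pt",
                       truncation=True, max_length=512)

    # Beam search with illustrative settings; tune for your own content.
    output_ids = model.generate(**inputs, num_beams=4,
                                max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the weights on first use and returns the summary as a plain string.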

Training Details

  • Base model: t5-base
  • Dataset: CNN Daily Grammar
  • Training type: Fine-tuning
  • Framework: PyTorch
  • Epochs: 10
  • Batch size: 8
  • Learning rate: 2e-5
  • Loss: Focal Loss
  • Scheduler: Linear warmup
  • Best validation loss: 0.7759
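
The learning-rate schedule listed above can be illustrated with a small pure-Python sketch. The card gives only the peak learning rate (2e-5) and the phrase "Linear warmup", so the step counts below are made up for illustration, and the linear decay to zero after warmup is an assumption that mirrors the behaviour of `get_linear_schedule_with_warmup` in transformers; the actual training run may have differed.

```python
def linear_warmup_lr(step: int,
                     base_lr: float = 2e-5,
                     warmup_steps: int = 500,
                     total_steps: int = 10_000) -> float:
    """Learning rate at a given optimizer step.

    ASSUMPTIONS: warmup_steps and total_steps are illustrative values, and
    the post-warmup linear decay to zero is not confirmed by the card.
    """
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)
```

With these example values the rate climbs to 2e-5 at step 500 and then falls back to zero by step 10,000.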

Model Architecture

  • Encoder-decoder transformer
  • Grammar-enhanced input structure
  • Focal loss for detail retention
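
To make the focal-loss item concrete, here is a minimal numeric sketch of the usual formulation, FL(p) = -(1 - p)^gamma * log(p), where p is the probability the model assigns to the correct token. The card does not state the focusing parameter, class weighting, or how the loss was applied to token logits, so `gamma=2.0` and the function names are assumptions chosen for illustration. The intuition is that tokens the model already predicts confidently contribute little, leaving more of the training signal for hard tokens.

```python
import math


def focal_loss(p_correct: float, gamma: float = 2.0) -> float:
    """Focal loss for one token given the probability of the correct token.

    ASSUMPTION: gamma=2.0 is a common default, not a value from the card.
    With gamma=0 this reduces to ordinary cross-entropy, -log(p).
    """
    return -((1.0 - p_correct) ** gamma) * math.log(p_correct)


def mean_focal_loss(probs, gamma: float = 2.0) -> float:
    """Average focal loss over a sequence of per-token probabilities."""
    return sum(focal_loss(p, gamma) for p in probs) / len(probs)
```

For example, a token predicted with probability 0.9 has its cross-entropy scaled by (1 - 0.9)^2 = 0.01, while a token at probability 0.5 is scaled by only 0.25.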

Evaluation Results

Final validation metrics:

  • Loss: 0.7759
  • Strong detail retention and factual accuracy observed qualitatively; no metric other than validation loss is reported

Limitations

  • Limited to news article summarization
  • May omit specific numerical details
  • Best suited for formal news content
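
Because summaries may drop numerical details, a simple post-check can flag numbers that appear in the article but not in the generated summary. The sketch below is one possible approach, not part of the model: `missing_numbers` is a hypothetical helper, and the regular expression only handles plain digits with `.` or `,` separators (it ignores spelled-out numbers).

```python
import re

# Digits, optionally followed by groups like ".5" or ",400".
_NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def missing_numbers(article: str, summary: str) -> list[str]:
    """Return numbers found in the article but absent from the summary.

    Results keep the order of first appearance in the article and contain
    no duplicates. Matching is by exact string, so "3,400" and "3400"
    are treated as different numbers.
    """
    in_summary = set(_NUMBER.findall(summary))
    seen = set()
    missing = []
    for num in _NUMBER.findall(article):
        if num not in in_summary and num not in seen:
            seen.add(num)
            missing.append(num)
    return missing
```

A non-empty result suggests the summary should be reviewed against the source before use.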

License

Apache 2.0

Model Files

  • Format: Safetensors
  • Model size: 198M params
  • Tensor type: F32