---
language: en
license: apache-2.0
tags:
- t5
- summarization
- grammar-enhanced
datasets:
- ambrosfitz/grammar-summary
model-index:
- name: Grammar-Enhanced T5 Summarizer
  results:
  - task:
      name: Text Summarization
      type: summarization
    dataset:
      name: ambrosfitz/grammar-summary
      type: ambrosfitz/grammar-summary
    metrics:
    - name: Validation Loss
      type: loss
      value: 0.8700
    - name: Model Type
      type: metric
      value: T5-base
---
# Grammar-Enhanced T5 Summarizer
This model is a fine-tuned version of T5-base for text summarization with grammar-enhanced inputs. It was trained on summaries of historical texts, with each input accompanied by an explicit analysis of its grammatical structure.
## Model Description
- **Base Model**: T5-base
- **Task**: Text Summarization
- **Training Data**: Historical texts with grammar analysis
- **Input Format**: Structured text with grammar analysis (subjects, verbs, objects, relationships)
- **Output Format**: Concise summary
## Usage
```python
# T5Tokenizer relies on the sentencepiece package being installed
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load model and tokenizer
model = T5ForConditionalGeneration.from_pretrained("ambrosfitz/summarize-grammar")
tokenizer = T5Tokenizer.from_pretrained("ambrosfitz/summarize-grammar")

# Prepare input: T5 expects a task prefix in front of the text
text = "Your text here..."
input_text = f"summarize: {text}"

# Tokenize, truncating inputs longer than 512 tokens
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)

# Generate summary with beam search (up to 150 output tokens)
outputs = model.generate(**inputs, max_length=150, num_beams=4, length_penalty=2.0)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
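To summarize several texts at once, the same tokenizer and generation settings can be applied to small batches. The following is a minimal sketch, not part of the original model card: the helper names `batched` and `summarize_many` and the default `batch_size` of 8 are assumptions chosen for illustration, and `model` and `tokenizer` are expected to be the objects loaded in the snippet above.

```python
from typing import Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def summarize_many(texts: Sequence[str], model, tokenizer, batch_size: int = 8) -> List[str]:
    """Summarize each text, reusing the generation settings from the usage example.

    `model` and `tokenizer` are the objects returned by `from_pretrained`.
    The function name and `batch_size` default are illustrative assumptions.
    """
    summaries: List[str] = []
    for batch in batched(texts, batch_size):
        prompts = [f"summarize: {text}" for text in batch]
        # padding=True pads every prompt to the longest one in the batch
        inputs = tokenizer(
            prompts,
            return_tensors="pt",
            padding=True,
            max_length=512,
            truncation=True,
        )
        outputs = model.generate(**inputs, max_length=150, num_beams=4, length_penalty=2.0)
        summaries.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return summaries
```

Calling `summarize_many(["First text...", "Second text..."], model, tokenizer)` returns one summary per input, in the same order. A smaller `batch_size` reduces memory use at the cost of more generation calls.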
## Training Details
The model was fine-tuned on a dataset of historical texts with additional grammar analysis information. Each input includes:
- Main subjects
- Key verbs
- Objects
- Grammatical relationships
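The model card does not spell out how these four kinds of information are serialized alongside the text. The sketch below shows one way such an input could be assembled. The `GrammarAnalysis` class, the `build_input` function, the field labels, and the ` | ` separator are all hypothetical; the ambrosfitz/grammar-summary dataset should be consulted for the format actually used in training.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GrammarAnalysis:
    """Hypothetical container for the grammar information listed above."""
    subjects: List[str] = field(default_factory=list)
    verbs: List[str] = field(default_factory=list)
    objects: List[str] = field(default_factory=list)
    relationships: List[str] = field(default_factory=list)


def build_input(text: str, grammar: GrammarAnalysis) -> str:
    """Join the text and the grammar fields into a single prompt string.

    ASSUMPTION: the labels and the " | " separator are illustrative only
    and are not confirmed by the model card.
    """
    parts = [
        f"summarize: {text}",
        f"subjects: {', '.join(grammar.subjects)}",
        f"verbs: {', '.join(grammar.verbs)}",
        f"objects: {', '.join(grammar.objects)}",
        f"relationships: {', '.join(grammar.relationships)}",
    ]
    return " | ".join(parts)


example = GrammarAnalysis(
    subjects=["Congress"],
    verbs=["passed"],
    objects=["the act"],
    relationships=["Congress -> passed -> the act"],
)
print(build_input("Congress passed the act.", example))
```

If the training format turns out to differ, only `build_input` needs to change; the resulting string is tokenized in the same way as `input_text` in the usage example.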
The model achieved a validation loss of 0.8700 during training.
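A loss value is easier to interpret when converted to perplexity. The short sketch below assumes the reported figure is a mean per-token cross-entropy measured in nats, which the model card does not state explicitly; the function name `perplexity_from_loss` is illustrative.

```python
import math


def perplexity_from_loss(mean_token_loss: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_token_loss)


# ASSUMPTION: 0.8700 is a mean per-token cross-entropy in nats
print(round(perplexity_from_loss(0.8700), 2))  # 2.39
```

Under that assumption, a validation loss of 0.8700 corresponds to a perplexity of roughly 2.39 on the validation set.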
## Limitations
The model was fine-tuned on historical texts, so results on other kinds of input may be weaker. It works best with:
- Historical texts
- Formal writing
- English language content
- Texts that benefit from structural analysis
## Citation
If you use this model, please cite:
```
@misc{grammar-t5-summarizer,
  author = {repo_owner},
  title = {Grammar-Enhanced T5 Summarizer},
  year = {2024},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {https://huggingface.co/ambrosfitz/summarize-grammar}
}
```