|
# 🧠 TextSummarizerForInventoryReport-T5
|
|
|
A T5-based text summarization model fine-tuned on inventory report data. This model generates concise summaries of detailed inventory-related texts, making it useful for warehouse management, stock reporting, and supply chain documentation. |
|
|
|
## ✨ Model Highlights
|
|
|
- 🚀 Based on t5-small from Hugging Face 🤗

- 📊 Fine-tuned on structured inventory report data (report_text → summary_text pairs; see the example below)

- 📝 Generates meaningful and human-readable summaries

- ⚡ Supports a maximum input length of 512 tokens and output length of 128 tokens

- 🧠 Built using Hugging Face Transformers and PyTorch
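
For concreteness, one (report_text → summary_text) pair might look like the record below. The values are invented for illustration; note that the preprocessing snippet in the Usage section refers to the same fields as `full_text` and `summary`.

```python
# Illustrative training pair (values invented; field names mirror the
# report_text -> summary_text structure described above):
pair = {
    "report_text": (
        "Weekly report, Warehouse 7: received 1,240 units of SKU-1182; "
        "shipped 310 units to the Dallas DC; 45 units quarantined as damaged."
    ),
    "summary_text": "Warehouse 7 netted +885 units of SKU-1182; 45 units quarantined.",
}
```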
|
|
|
--- |
|
|
|
## 🧠 Intended Uses
|
|
|
- ✅ Inventory report summarization

- ✅ Warehouse/logistics management automation

- ✅ Business analytics and reporting dashboards
|
|
|
## 🚫 Limitations
|
|
|
- ❌ Not optimized for very long reports (>512 tokens); a chunking workaround is sketched after this list

- 🌍 Trained primarily on English-language technical/business reports

- 🧾 Performance may degrade on unstructured or noisy input text

- 🤖 Not designed for creative or narrative summarization
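
Inputs beyond the 512-token limit are silently truncated. One common workaround, sketched below under the assumption that you reuse the `summarize` helper and `tokenizer` from the Usage section, is to summarize a long report chunk by chunk and then summarize the concatenated partial summaries. `summarize_long_report` and `chunk_tokens` are hypothetical names introduced here, not part of the model's API.

```python
# Hypothetical helper for reports beyond 512 tokens (not part of this repo):
# summarize each token chunk, then summarize the joined partial summaries.
def summarize_long_report(text, model, tokenizer, chunk_tokens=480):
    # Leave headroom below 512 tokens for the "summarize: " task prefix.
    ids = tokenizer(text, truncation=False)["input_ids"]
    chunks = [
        tokenizer.decode(ids[i:i + chunk_tokens], skip_special_tokens=True)
        for i in range(0, len(ids), chunk_tokens)
    ]
    partial = [summarize(chunk, model, tokenizer) for chunk in chunks]
    return summarize(" ".join(partial), model, tokenizer)
```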
|
|
|
|
|
|
|
## 🏋️‍♂️ Training Details
|
|
|
| Attribute | Value | |
|
|-------------------|----------------------------------------| |
|
| Base Model | t5-small | |
|
| Dataset | Custom inventory reports | |
|
| Max Input Tokens | 512 | |
|
| Max Output Tokens | 128 | |
|
| Epochs | 3 | |
|
| Batch Size | 2 | |
|
| Optimizer | AdamW | |
|
| Loss Function     | CrossEntropyLoss (with -100 padding mask) |
|
| Framework | PyTorch + Hugging Face Transformers | |
|
| Hardware | CUDA-enabled GPU | |
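
The original training script is not included in this repository, but the configuration above maps directly onto Hugging Face `Trainer`. The sketch below is a minimal reconstruction under that assumption; the tiny inline dataset and the `full_text`/`summary` column names are placeholders, and the label positions set to -100 implement the padding mask referred to in the loss function row (positions with -100 are excluded from the cross-entropy loss).

```python
# Minimal fine-tuning sketch matching the table above (a reconstruction,
# not the original training script).
from datasets import Dataset
from transformers import (T5Tokenizer, T5ForConditionalGeneration,
                          Trainer, TrainingArguments)

tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Placeholder dataset; in practice this is the custom inventory report corpus.
train_data = Dataset.from_dict({
    "full_text": ["Warehouse 3 received 500 units of SKU-9 on Monday; 120 shipped."],
    "summary": ["Warehouse 3 net +380 units of SKU-9."],
})

def preprocess(example):
    enc = tokenizer("summarize: " + example["full_text"],
                    truncation=True, padding="max_length", max_length=512)
    labels = tokenizer(example["summary"], truncation=True,
                       padding="max_length", max_length=128)["input_ids"]
    # Mask padding positions with -100 so they are ignored by the loss.
    enc["labels"] = [t if t != tokenizer.pad_token_id else -100 for t in labels]
    return enc

tokenized = train_data.map(preprocess, remove_columns=train_data.column_names)

args = TrainingArguments(
    output_dir="./t5-inventory-summarizer",
    num_train_epochs=3,             # per the table above
    per_device_train_batch_size=2,  # per the table above
    learning_rate=5e-5,             # Trainer's default optimizer is AdamW
    logging_steps=50,
)

Trainer(model=model, args=args, train_dataset=tokenized).train()
```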
|
|
|
|
|
--- |
|
|
|
## 🚀 Usage
|
|
|
```python
from transformers import T5Tokenizer, T5ForConditionalGeneration
import torch

model_name = "AventIQ-AI/Text_Summarization_For_inventory_Report"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
model.eval()

def preprocess(example):
    # Preprocessing used for fine-tuning: prefix the T5 summarization task,
    # then tokenize inputs and targets to the documented maximum lengths.
    input_text = "summarize: " + example["full_text"]
    input_enc = tokenizer(input_text, truncation=True, padding="max_length", max_length=512)
    target_enc = tokenizer(example["summary"], truncation=True, padding="max_length", max_length=128)
    input_enc["labels"] = target_enc["input_ids"]
    return input_enc

def summarize(text, model, tokenizer, max_input_length=512, max_output_length=128):
    # Tokenize with the task prefix, generate with beam search, and decode.
    inputs = tokenizer("summarize: " + text, return_tensors="pt",
                       truncation=True, max_length=max_input_length)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_length=max_output_length,
                                    num_beams=4, early_stopping=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

long_text = (
    "Monthly inventory report: Warehouse 2 received 3,500 units of SKU-1182, "
    "shipped 2,750 units to regional distribution centers, and recorded 40 "
    "damaged units pending write-off."
)  # example input

# Generate summary
summary = summarize(long_text, model, tokenizer)
print("Summary:", summary)
```
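
The training table lists a CUDA GPU, but the snippet above runs on CPU as well. To run inference on a GPU when one is available, the standard PyTorch pattern is to move both the model and the tokenized inputs to the same device; this continues directly from the snippet above:

```python
# Optional: GPU inference (standard PyTorch pattern, continuing from above).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

inputs = tokenizer("summarize: " + long_text, return_tensors="pt",
                   truncation=True, max_length=512).to(device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```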
|
|
|
|
|
## Repository Structure |
|
|
|
```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Fine-tuned model weights
└── README.md            # Model documentation
```
|
|
|
|
|
## 🤝 Contributing
|
Contributions are welcome! |
|
Feel free to open an issue or submit a pull request if you have suggestions, improvements, or want to adapt the model to new domains. |
|
|
|
|
|
|
|
|
|
|
|
|