# 🧠 TextSummarizerForInventoryReport-T5

A T5-based text summarization model fine-tuned on inventory report data. This model generates concise summaries of detailed inventory-related texts, making it useful for warehouse management, stock reporting, and supply chain documentation.

## ✨ Model Highlights

- 📌 Based on t5-small from Hugging Face 🤗
- 🔍 Fine-tuned on structured inventory report data (report_text → summary_text)
- 📋 Generates meaningful and human-readable summaries
- ⚡ Supports a maximum input length of 512 tokens and an output length of 128 tokens (see the quick check below)
- 🧠 Built using Hugging Face Transformers and PyTorch
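
Because inputs beyond 512 tokens are truncated, it can be worth checking a report's token count before summarizing. A minimal sketch, assuming the repo id from this card; the report string is a placeholder:

```python
from transformers import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("AventIQ-AI/Text_Summarization_For_inventory_Report")

report = "Your detailed inventory report goes here."  # placeholder text
# Count tokens with the "summarize: " task prefix included, since it consumes budget too
n_tokens = len(tokenizer("summarize: " + report)["input_ids"])
print(f"{n_tokens} tokens (model input limit: 512)")
```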

---

## 🧠 Intended Uses

- ✅ Inventory report summarization
- ✅ Warehouse/logistics management automation
- ✅ Business analytics and reporting dashboards

## 🚫 Limitations

- ❌ Not optimized for very long reports (>512 tokens); see the chunking sketch below
- 🌍 Trained primarily on English-language technical/business reports
- 🧾 Performance may degrade on unstructured or noisy input text
- 🤔 Not designed for creative or narrative summarization
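
For the first limitation, one unofficial workaround is to split a long report into token-sized chunks, summarize each chunk, and join the partial summaries. A minimal sketch, assuming the `summarize` helper defined in the Usage section below; `summarize_long` and `chunk_tokens` are illustrative names, not part of the released model:

```python
def summarize_long(text, model, tokenizer, chunk_tokens=450):
    """Hypothetical helper: chunked summarization for reports over 512 tokens."""
    # Tokenize once, then split on token boundaries, leaving headroom for
    # the "summarize: " prefix that summarize() adds to each chunk
    ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    chunks = [ids[i:i + chunk_tokens] for i in range(0, len(ids), chunk_tokens)]
    partial_summaries = [
        summarize(tokenizer.decode(chunk), model, tokenizer)
        for chunk in chunks
    ]
    return " ".join(partial_summaries)
```

Chunking loses cross-chunk context, so treat the joined output as a rough digest rather than a faithful single-pass summary.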



## πŸ‹οΈβ€β™‚οΈ Training Details

| Attribute         | Value                                 |
|-------------------|----------------------------------------|
| Base Model        | t5-small                               |
| Dataset           | Custom inventory reports               |
| Max Input Tokens  | 512                                    |
| Max Output Tokens | 128                                    |
| Epochs            | 3                                      |
| Batch Size        | 2                                      |
| Optimizer         | AdamW                                  |
| Loss Function     |CrossEntropyLosS(with -100 padding mask)|
| Framework         | PyTorch + Hugging Face Transformers    |
| Hardware          | CUDA-enabled GPU                       |
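
The -100 padding mask in the loss row follows the Hugging Face convention that label positions set to -100 are ignored when computing cross-entropy. A minimal sketch of how target padding is typically masked during preprocessing (illustrative; the exact training script is not part of this card):

```python
def mask_padding(label_ids, pad_token_id):
    # Seq2seq models in Transformers skip label positions equal to -100,
    # so padded target tokens contribute nothing to the loss
    return [tok if tok != pad_token_id else -100 for tok in label_ids]

# e.g. labels = mask_padding(target_enc["input_ids"], tokenizer.pad_token_id)
```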


---

## 🚀 Usage

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

model_name = "AventIQ-AI/Text_Summarization_For_inventory_Report"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
model.eval()

def summarize(text, model, tokenizer, max_input_length=512, max_output_length=128):
    # T5 expects the "summarize: " task prefix before the input text
    inputs = tokenizer(
        "summarize: " + text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_length,
    )
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_length=max_output_length,
            num_beams=4,
            early_stopping=True,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

long_text = "Your detailed inventory report goes here."

# Generate summary
summary = summarize(long_text, model, tokenizer)
print("Summary:", summary)
```


## Repository Structure

```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Fine-tuned model weights
└── README.md            # Model documentation
```


## 🤝 Contributing

Contributions are welcome!
Feel free to open an issue or submit a pull request if you have suggestions, improvements, or want to adapt the model to new domains.