# 🧠 TextSummarizerForInventoryReport-T5

A T5-based text summarization model fine-tuned on inventory report data. This model generates concise summaries of detailed inventory-related texts, making it useful for warehouse management, stock reporting, and supply chain documentation.

---
## ✨ Model Highlights

- 🚀 Based on t5-small from Hugging Face 🤗
- 📊 Fine-tuned on structured inventory report data (report_text → summary_text)
- 📝 Generates meaningful and human-readable summaries
- ⚡ Supports a maximum input length of 512 tokens and a maximum output length of 128 tokens
- 🧠 Built using Hugging Face Transformers and PyTorch

---
## 🧠 Intended Uses

- ✅ Inventory report summarization
- ✅ Warehouse/logistics management automation
- ✅ Business analytics and reporting dashboards

---
## 🚫 Limitations

- ❌ Not optimized for very long reports (>512 tokens)
- 🌐 Trained primarily on English-language technical/business reports
- 🧾 Performance may degrade on unstructured or noisy input text
- 🤖 Not designed for creative or narrative summarization

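Since inputs beyond 512 tokens are truncated, one possible workaround (not part of this model card) is to split a long report into chunks, summarize each, and join the results. The `chunk_text` and `summarize_long` helpers below are hypothetical sketches, and the word count is only a rough proxy for the tokenizer's actual token count:

```python
def chunk_text(text, max_words=350):
    # ~350 words is a conservative proxy for staying under the 512-token limit
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def summarize_long(text, model, tokenizer, summarize):
    # Summarize each chunk independently, then join the partial summaries
    chunks = chunk_text(text)
    return " ".join(summarize(c, model, tokenizer) for c in chunks)
```

Note that joining per-chunk summaries can lose cross-chunk context, so quality may be lower than on reports that fit in a single pass.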
## 🏋️‍♂️ Training Details

| Attribute         | Value                                     |
|-------------------|-------------------------------------------|
| Base Model        | t5-small                                  |
| Dataset           | Custom inventory reports                  |
| Max Input Tokens  | 512                                       |
| Max Output Tokens | 128                                       |
| Epochs            | 3                                         |
| Batch Size        | 2                                         |
| Optimizer         | AdamW                                     |
| Loss Function     | CrossEntropyLoss (with -100 padding mask) |
| Framework         | PyTorch + Hugging Face Transformers       |
| Hardware          | CUDA-enabled GPU                          |
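The `-100` padding mask noted in the table works because PyTorch's cross-entropy skips any target equal to `ignore_index` (which defaults to `-100`), so padded label positions contribute nothing to the loss. A minimal standalone demonstration with toy tensors (this is not the model's actual training code):

```python
import torch
import torch.nn.functional as F

# Toy batch: 1 sequence, 4 positions, vocabulary of 5 tokens
logits = torch.randn(1, 4, 5)
labels = torch.tensor([[2, 4, -100, -100]])  # last two positions are padding

# F.cross_entropy ignores targets equal to ignore_index (-100 by default),
# so the loss averages only over the two real label positions
loss_masked = F.cross_entropy(logits.view(-1, 5), labels.view(-1))
loss_real_only = F.cross_entropy(logits[:, :2].reshape(-1, 5), labels[:, :2].reshape(-1))
```

Both losses are identical, confirming that padded positions are fully excluded.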
---
## 🚀 Usage

```python
import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

model_name = "AventIQ-AI/Text_Summarization_For_inventory_Report"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
model.eval()

# Preprocessing used during fine-tuning: prefix the task, tokenize input and target
def preprocess(example):
    input_text = "summarize: " + example["full_text"]
    input_enc = tokenizer(input_text, truncation=True, padding="max_length", max_length=512)
    target_enc = tokenizer(example["summary"], truncation=True, padding="max_length", max_length=64)
    input_enc["labels"] = target_enc["input_ids"]
    return input_enc

# Inference helper
def summarize(text, model, tokenizer):
    inputs = tokenizer("summarize: " + text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_length=128, num_beams=4, early_stopping=True)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Generate summary
long_text = "..."  # your inventory report text here
summary = summarize(long_text, model, tokenizer)
print("Summary:", summary)
```
## 📁 Repository Structure

```
.
├── model/               # Contains fine-tuned model files
├── tokenizer/           # Tokenizer config and vocab
├── config.json          # Model configuration
├── pytorch_model.bin    # Fine-tuned model weights
└── README.md            # Model card
```
## 🤝 Contributing

Contributions are welcome!
Feel free to open an issue or submit a pull request if you have suggestions, improvements, or want to adapt the model to new domains.