---
language: en
tags:
- t5
- product-classification
- category-prediction
license: mit
---

# T5 Product Category & Subcategory Classifier

This model is a fine-tuned version of T5-base for product category and subcategory classification.

## Model Description

- **Model Type:** T5 (Text-to-Text Transfer Transformer)
- **Language:** English
- **Task:** Product Classification
- **Training Data:** 10,172 categorized products
- **Input Format:** "Predict the product category and subcategory in the following format: 'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: {product_name}"
- **Output Format:** "Category: {category} | Subcategory: {subcategory}" (see the example below)
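
For illustration, here is what one filled-in input/output pair looks like. The product name and labels below are hypothetical placeholders, not outputs from this model; the actual labels depend on the training taxonomy:

```
Input:  Predict the product category and subcategory in the following format: 'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: Whole Wheat Sandwich Bread
Output: Category: Bakery | Subcategory: Bread
```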

## Usage

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned model and tokenizer (replace {repo_id} with this repository's ID)
model = T5ForConditionalGeneration.from_pretrained("{repo_id}")
tokenizer = T5Tokenizer.from_pretrained("{repo_id}")

def predict(text):
    # Build the prompt in the same format used during training
    prompt = f"Predict the product category and subcategory in the following format: 'Category: <CATEGORY> | Subcategory: <SUBCATEGORY>'. Product: {text}"
    inputs = tokenizer(prompt, return_tensors="pt", max_length=128, truncation=True)

    # Generate the prediction with beam search and decode it back to text
    outputs = model.generate(**inputs, max_length=32, num_beams=4)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example
result = predict("Pantene Suave & Liso Shampoo")
print(result)
```
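
The model returns a single string, so downstream code usually needs to split it back into its two fields. Below is a minimal sketch of such a parser; the helper name `parse_prediction` and the example string are illustrative, not part of this repository:

```python
def parse_prediction(prediction: str) -> dict:
    """Split 'Category: X | Subcategory: Y' into a dict (illustrative helper)."""
    parts = {}
    for field in prediction.split("|"):
        key, _, value = field.partition(":")
        parts[key.strip().lower()] = value.strip()
    return parts

# Example (labels shown are illustrative):
# parse_prediction("Category: Personal Care | Subcategory: Hair Care")
# -> {"category": "Personal Care", "subcategory": "Hair Care"}
```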

## Training Details

- **Base Model:** t5-base
- **Training Type:** Fine-tuning (a configuration sketch follows the list)
- **Epochs:** 5
- **Batch Size:** 8
- **Learning Rate:** 3e-5
- **Weight Decay:** 0.01
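
The original training script is not included in this repository. The snippet below is only a sketch of how the hyperparameters listed above could map onto Hugging Face `Seq2SeqTrainingArguments`; the output directory and the generation flag are assumptions, not values taken from this card:

```python
from transformers import Seq2SeqTrainingArguments

# Minimal sketch mapping the listed hyperparameters onto training arguments
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-product-classifier",  # assumed output directory
    num_train_epochs=5,
    per_device_train_batch_size=8,
    learning_rate=3e-5,
    weight_decay=0.01,
    predict_with_generate=True,          # assumed: decode with generate() during evaluation
)
```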

## Limitations

- The model works best with product names in English
- Performance may vary for products outside the training categories
- Requires clear and specific product descriptions