File size: 2,822 Bytes
2c517b8 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 |
# BERT-Base-Uncased Quantized Model for Twitter Tweet Sentiment Classification
This repository hosts a quantized version of the **T5-Base** model, fine-tuned for **Movie Script Writting**. The model is optimized for efficient deployment while maintaining high accuracy, making it suitable for resource-constrained environments such as mobile and edge devices.
## Model Details
- **Model Architecture:** T5-Base
- **Task:** Movie Script Writting
- **Dataset:** bookcorpus
- **Quantization:** Float16 (FP16)
- **Fine-tuning Framework:** Hugging Face Transformers
- **Inference Framework:** PyTorch
## Usage
### Installation
```sh
pip install transformers torch
```
### Loading the Model
```python
from transformers import BertForSequenceClassification, BertTokenizer
import torch
# Load quantized model
quantized_model_path = "path/to/bert_finetuned_fp16"
def generate_script(prompt):
device = torch.device("cuda" if torch.cuda.is_available() else "cpu") # Check available device
model.to(device) # Move model to the appropriate device
inputs = tokenizer(f"Generate a movie script: {prompt}", return_tensors="pt", truncation=True, padding="max_length", max_length=256)
inputs = {key: value.to(device) for key, value in inputs.items()} # Move inputs to same device as model
with torch.no_grad():
outputs = model.generate(**inputs, max_length=256, num_return_sequences=1)
return tokenizer.decode(outputs[0], skip_special_tokens=True)
# Test the script generator
prompt = "SCENE: EXT. DARK ALLEY - NIGHT"
print(generate_script(prompt))
## Performance Metrics
- **Accuracy:** 0.82
- **Inference Speed:** Faster due to FP16 quantization
## Fine-Tuning Details
### Dataset
### Training Configuration
- **Number of epochs:** 3
- **Batch size:** 8
- **Evaluation strategy:** Per epoch
- **Learning rate:** 2e-5
- **Optimizer:** AdamW
### Quantization
The model is quantized using **Post-Training Quantization (PTQ)** with **Float16 (FP16)**, which reduces model size and improves inference efficiency while maintaining accuracy.
## Repository Structure
```
.
βββ model/ # Contains the quantized model files
βββ tokenizer_config/ # Tokenizer configuration and vocabulary files
βββ model.safensors/ # Fine-tuned and quantized model
βββ README.md # Model documentation
```
## Limitations
- The model is optimized for English-language next-word prediction tasks.
- While quantization improves speed, minor accuracy degradation may occur.
- Performance on out-of-distribution text (e.g., highly technical or domain-specific data) may be limited.
## Contributing
Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.
``
|