# BERT-Base-Uncased Quantized Model for Student Feedback Sentiment Analysis

This repository hosts a quantized version of the BERT base uncased model, fine-tuned for sentiment classification of student feedback. The model has been optimized for efficient deployment while maintaining high accuracy, making it suitable for resource-constrained environments.
## Model Details

- **Model Architecture:** BERT Base Uncased
- **Task:** Sentiment Analysis of Student Feedback
- **Dataset:** Stanford Sentiment Treebank v2 (SST-2)
- **Quantization:** Float16
- **Fine-tuning Framework:** Hugging Face Transformers
## Usage

### Installation

```sh
pip install transformers torch
```
### Loading the Model

```python
from transformers import BertForSequenceClassification, BertTokenizer
import torch

# Load quantized model
quantized_model_path = "AventIQ-AI/sentiment-analysis-for-student-feedback-analysis"
quantized_model = BertForSequenceClassification.from_pretrained(quantized_model_path)
quantized_model.eval()  # Set to evaluation mode
quantized_model.half()  # Convert model to FP16

# Load tokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Define a test sentence
test_sentence = "Overall, the course was a valuable learning experience. The instructor was knowledgeable and always willing to answer questions, which made complex topics easier to understand. However, the lectures sometimes felt rushed, and there was not enough time allocated for in-depth discussions. The assignments were well-designed and helped reinforce the concepts, though some of them were a bit too lengthy for the given deadlines. I appreciated the feedback provided on my submissions, as it helped me identify areas for improvement. Despite a few issues, I feel more confident in the subject now than when I started."

# Tokenize input
inputs = tokenizer(test_sentence, return_tensors="pt", padding=True, truncation=True, max_length=128)

# Ensure input tensors are in the correct dtype
inputs["input_ids"] = inputs["input_ids"].long()  # Token IDs must stay int64 (long)
inputs["attention_mask"] = inputs["attention_mask"].long()

# Make prediction
with torch.no_grad():
    outputs = quantized_model(**inputs)

# Get predicted class
predicted_class = torch.argmax(outputs.logits, dim=1).item()
print(f"Predicted Class: {predicted_class}")

# Map class index to a human-readable label
label_mapping = {0: "very_negative", 1: "negative", 2: "neutral", 3: "positive", 4: "very_positive"}  # Example mapping
predicted_label = label_mapping[predicted_class]
print(f"Predicted Label: {predicted_label}")
```
## Performance Metrics

- **Accuracy:** 0.82
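
For reference, accuracy on a labeled evaluation set can be computed as in the minimal sketch below. This is not the original evaluation script: the example texts, gold label IDs, and the use of the model repository path above are illustrative assumptions only.

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Hypothetical held-out evaluation data: (text, gold_label_id) pairs
eval_examples = [
    ("The lectures were engaging and well paced.", 4),
    ("The course material was outdated and hard to follow.", 1),
]

model = BertForSequenceClassification.from_pretrained(
    "AventIQ-AI/sentiment-analysis-for-student-feedback-analysis"
).eval().half()
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

correct = 0
for text, gold in eval_examples:
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**inputs).logits
    correct += int(logits.argmax(dim=-1).item() == gold)

print(f"Accuracy: {correct / len(eval_examples):.2f}")
```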
## Fine-Tuning Details

### Dataset

The dataset used is the Stanford Sentiment Treebank v2 (SST-2), obtained from Kaggle.
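
As an illustration, SST-2 is also distributed through the GLUE benchmark on the Hugging Face Hub. The sketch below assumes the `datasets` library (not listed in the installation step above) rather than the Kaggle download used for fine-tuning.

```python
from datasets import load_dataset  # assumes `pip install datasets`

# SST-2 as distributed through the GLUE benchmark
sst2 = load_dataset("glue", "sst2")

print(sst2)              # train / validation / test splits
print(sst2["train"][0])  # {'sentence': ..., 'label': 0 or 1, 'idx': ...}
```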
### Training

- Number of epochs: 3
- Batch size: 8
- Evaluation strategy: epoch
- Learning rate: 2e-5
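
A minimal sketch of how these hyperparameters map to a Hugging Face `Trainer` setup is shown below. It is not the original training script: the output path is a placeholder, the data is loaded from the Hugging Face Hub rather than Kaggle, and `num_labels` should be set to match the label scheme actually used.

```python
from datasets import load_dataset
from transformers import (BertForSequenceClassification, BertTokenizer,
                          Trainer, TrainingArguments)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
dataset = load_dataset("glue", "sst2")

def tokenize(batch):
    return tokenizer(batch["sentence"], padding="max_length", truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True)

# SST-2 as distributed provides binary labels; adjust num_labels for other label schemes
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

training_args = TrainingArguments(
    output_dir="./results",            # placeholder output path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    evaluation_strategy="epoch",
    learning_rate=2e-5,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
)
trainer.train()
```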
### Quantization

Post-training quantization was applied using PyTorch's built-in half-precision (float16) conversion to reduce the model size and improve inference efficiency.
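
A sketch of how such a float16 conversion can be produced and saved is shown below; the input and output paths are placeholders, not files in this repository.

```python
import torch
from transformers import BertForSequenceClassification

# Load the full-precision fine-tuned checkpoint (placeholder path)
model = BertForSequenceClassification.from_pretrained("path/to/fine-tuned-model")

# Convert all floating-point weights to float16
model = model.half()

# Persist the reduced-size model (roughly half the original size on disk)
model.save_pretrained("path/to/quantized-model")

print(next(model.parameters()).dtype)  # torch.float16
```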
## Repository Structure

```
.
├── model/               # Contains the quantized model files
├── tokenizer_config/    # Tokenizer configuration and vocabulary files
├── model.safetensors    # Fine-tuned model weights
└── README.md            # Model documentation
```
## Limitations

- The model may not generalize well to domains outside the fine-tuning dataset.
- Quantization may result in minor accuracy degradation compared to full-precision models.
## Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request if you have suggestions or improvements.