|
---
datasets:
- chillies/course-review-multilabel-sentiment-analysis
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
---
|
|
|
# distilbert-course-review-classification |
|
|
|
[Model on the Hugging Face Hub](https://huggingface.co/username/distilbert-course-review-classification)
|
|
|
## Description |
|
|
|
**distilbert-course-review-classification** is a fine-tuned version of DistilBERT, specifically trained for sentiment analysis of online course reviews. This model categorizes reviews into the following classes: |
|
- Improvement Suggestions |
|
- Questions |
|
- Confusion |
|
- Support Request |
|
- Discussion |
|
- Course Comparison |
|
- Related Course Suggestions |
|
- Negative |
|
- Positive |
|
|
|
## Installation |
|
|
|
To use this model, you will need to install the following dependencies: |
|
|
|
```bash
pip install transformers
pip install torch  # or tensorflow, depending on your preference
```
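
With the dependencies in place, a quick way to sanity-check the setup is the `pipeline` API. This is a minimal sketch, assuming the placeholder `username` is replaced with the published repository id and that label names were stored in the model config:

```python
from transformers import pipeline

# Replace "username" with the actual repository id of the published model.
classifier = pipeline(
    "text-classification",
    model="username/distilbert-course-review-classification",
)

# Returns the top label and its score; the label names shown depend on the
# id2label mapping stored in the model configuration.
print(classifier("The course content is great, but I would like more examples."))
```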
|
|
|
## Usage |
|
|
|
Here is how you can load and use the model in your code: |
|
|
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("username/distilbert-course-review-classification")
model = AutoModelForSequenceClassification.from_pretrained("username/distilbert-course-review-classification")

# Example usage
review = "The course content is great, but I would like more examples."

inputs = tokenizer(review, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# The model returns logits; pick the highest-scoring class
predicted_class = outputs.logits.argmax(dim=-1).item()

class_labels = [
    'Improvement Suggestions', 'Questions', 'Confusion', 'Support Request',
    'Discussion', 'Course Comparison', 'Related Course Suggestions',
    'Negative', 'Positive'
]

print(f"Predicted class: {class_labels[predicted_class]}")
```
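
The referenced dataset is multi-label, so if the model was fine-tuned with a multi-label head (`problem_type="multi_label_classification"`), thresholding sigmoid scores is more appropriate than `argmax`. A minimal sketch, assuming a 0.5 threshold and reusing `outputs` and `class_labels` from above:

```python
import torch

# Multi-label variant: apply a sigmoid and keep every class above a threshold.
# The 0.5 threshold is an assumption; tune it on a validation split.
probs = torch.sigmoid(outputs.logits)[0]
predicted_labels = [label for label, p in zip(class_labels, probs) if p > 0.5]
print(f"Predicted labels: {predicted_labels}")
```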
|
|
|
### Inference |
|
|
|
The same pattern applies to any review, reusing the tokenizer, model, and `class_labels` loaded above:
|
|
|
```python
# Example inference (reuses the tokenizer, model, and class_labels defined above)
review = "I found the course material very confusing and hard to follow."

inputs = tokenizer(review, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

# The model returns logits; pick the highest-scoring class
predicted_class = outputs.logits.argmax(dim=-1).item()

print(f"Predicted class: {class_labels[predicted_class]}")
```
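
When classifying many reviews, it is usually faster to batch several texts through the tokenizer in one call. A minimal sketch (the example reviews are purely illustrative):

```python
# Batch inference: tokenize several reviews at once and classify them together.
reviews = [
    "Great course, the instructor explains everything clearly.",
    "Is there a follow-up course that covers more advanced topics?",
]

inputs = tokenizer(reviews, return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

predictions = outputs.logits.argmax(dim=-1)
for review, idx in zip(reviews, predictions):
    print(f"{class_labels[idx.item()]}: {review}")
```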
|
|
|
### Training |
|
|
|
The model can be fine-tuned further on your own labeled reviews using the `Trainer` API:
|
|
|
```python
# Example training code
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",  # named eval_strategy in recent transformers releases
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
)

# train_dataset and eval_dataset are assumed to be tokenized splits of the
# course-review dataset (see the preparation sketch below).
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

trainer.train()
```
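
One way the `train_dataset` and `eval_dataset` above might be prepared is sketched below; the column and split names (`text`, `train`, `test`) are assumptions, so check the dataset card for the actual schema:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Replace "username" with the actual repository id of the published model.
tokenizer = AutoTokenizer.from_pretrained("username/distilbert-course-review-classification")

dataset = load_dataset("chillies/course-review-multilabel-sentiment-analysis")

def tokenize(batch):
    # Column name "text" is an assumption about the dataset schema.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

tokenized = dataset.map(tokenize, batched=True)
train_dataset = tokenized["train"]
eval_dataset = tokenized["test"]  # split name "test" is an assumption
```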
|
|
|
## Training Details |
|
|
|
### Training Data |
|
|
|
The model was fine-tuned on a dataset of online course reviews, labeled with the following sentiment categories: |
|
- Improvement Suggestions |
|
- Questions |
|
- Confusion |
|
- Support Request |
|
- Discussion |
|
- Course Comparison |
|
- Related Course Suggestions |
|
- Negative |
|
- Positive |
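
The dataset named in the model card metadata can be pulled from the Hub for inspection; a minimal sketch (split names are assumptions, check the dataset card):

```python
from datasets import load_dataset

dataset = load_dataset("chillies/course-review-multilabel-sentiment-analysis")
print(dataset)              # available splits and columns
print(dataset["train"][0])  # one labeled example ("train" split is an assumption)
```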
|
|
|
### Training Procedure |
|
|
|
The model was fine-tuned using a standard training approach, optimizing for accurate sentiment classification. Training was conducted on [describe hardware, e.g., GPUs, TPUs] over [number of epochs] epochs with [any relevant hyperparameters]. |
|
|
|
## Evaluation |
|
|
|
### Metrics |
|
|
|
The model was evaluated using the following metrics: |
|
|
|
- **Accuracy**: X% |
|
- **Precision**: Y% |
|
- **Recall**: Z% |
|
- **F1 Score**: W% |
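
These figures can be reproduced from held-out predictions with standard metric implementations; a minimal sketch using scikit-learn, where `y_true` and `y_pred` are placeholder lists of class indices shown only for illustration:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Toy placeholder values; in practice these come from the evaluation split.
y_true = [0, 8, 7, 2]
y_pred = [0, 8, 7, 3]

accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted", zero_division=0
)
print(f"Accuracy: {accuracy:.3f}  Precision: {precision:.3f}  Recall: {recall:.3f}  F1: {f1:.3f}")
```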
|
|
|
### Comparison |
|
|
|
The performance of distilbert-course-review-classification was benchmarked against other sentiment analysis models and showed higher accuracy in classifying online course reviews.
|
|
|
## Limitations and Biases |
|
|
|
While distilbert-course-review-classification is highly effective, it may have limitations in the following areas: |
|
- It may not fully understand the context of complex reviews. |
|
- There may be biases present in the training data that could affect the classification results. |
|
|
|
## How to Contribute |
|
|
|
We welcome contributions! Please see our [contributing guidelines](link_to_contributing_guidelines) for more information on how to contribute to this project. |
|
|
|
## License |
|
|
|
This model is licensed under the [MIT License](LICENSE). |
|
|
|
## Acknowledgements |
|
|
|
We would like to thank the contributors and the creators of the datasets used for training this model. |
|
|
|
|