RO-Sentiment
This model is a fine-tuned version of readerbench/RoBERT-base on the Decathlon reviews and Cinemagia reviews datasets. It achieves the following results on the evaluation set:
- Loss: 0.3923
- Accuracy: 0.8307
- Precision: 0.8366
- Recall: 0.8959
- F1: 0.8652
- F1 Weighted: 0.8287
Output labels (see the usage sketch below):
- LABEL_0 = Negative Sentiment
- LABEL_1 = Positive Sentiment
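A minimal usage sketch with the standard Hugging Face `text-classification` pipeline; the model id `readerbench/ro-sentiment` is taken from this card, and the example sentences are illustrative only:

```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub.
classifier = pipeline("text-classification", model="readerbench/ro-sentiment")

# Illustrative Romanian reviews (not taken from the training data).
reviews = [
    "Produsul este excelent, îl recomand cu încredere.",  # "Excellent product, I recommend it."
    "Filmul a fost plictisitor și mult prea lung.",       # "The movie was boring and far too long."
]

for prediction in classifier(reviews):
    # Each prediction is a dict such as {"label": "LABEL_1", "score": 0.98};
    # LABEL_0 = negative sentiment, LABEL_1 = positive sentiment (see above).
    print(prediction["label"], round(prediction["score"], 3))
```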
Evaluation on other datasets
RO_SENT
|   | precision | recall | f1-score | support |
|---|---|---|---|---|
| Negative (0) | 0.79 | 0.83 | 0.81 | 11,675 |
| Positive (1) | 0.88 | 0.85 | 0.87 | 17,271 |
| Accuracy |  |  | 0.85 | 28,946 |
| Macro Avg | 0.84 | 0.84 | 0.84 | 28,946 |
| Weighted Avg | 0.85 | 0.85 | 0.85 | 28,946 |
LaRoSeDa
|   | precision | recall | f1-score | support |
|---|---|---|---|---|
| Negative (0) | 0.79 | 0.94 | 0.86 | 7,500 |
| Positive (1) | 0.93 | 0.75 | 0.83 | 7,500 |
| Accuracy |  |  | 0.85 | 15,000 |
| Macro Avg | 0.86 | 0.85 | 0.84 | 15,000 |
| Weighted Avg | 0.86 | 0.85 | 0.84 | 15,000 |
Model description
Fine-tuned Romanian BERT model for sentiment classification.
Trained on a mix of product reviews from the Decathlon retailer website and movie reviews from Cinemagia.
Intended uses & limitations
Sentiment classification for the Romanian language.
The model is biased towards product reviews.
There is no "neutral" sentiment label.
Training and evaluation data
Trained on:
- Decathlon reviews dataset (available on request)
- Cinemagia movie reviews (publicly available on Kaggle)
Evaluated on (see the evaluation sketch after this list):
- Holdout data from the training dataset
- RO_SENT dataset
- LaRoSeDa dataset
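Evaluation on an external test set can be scripted with the same pipeline plus scikit-learn. A minimal sketch, assuming the texts and their 0/1 labels have already been loaded from the RO_SENT or LaRoSeDa test split; the two example sentences below are placeholders:

```python
from transformers import pipeline
from sklearn.metrics import classification_report

# LABEL_0 = negative, LABEL_1 = positive.
classifier = pipeline("text-classification", model="readerbench/ro-sentiment")

# Placeholder examples; in practice these come from the RO_SENT or LaRoSeDa test split.
texts = ["Serviciu foarte bun.", "Calitate dezamăgitoare."]
labels = [1, 0]

# Map pipeline outputs ("LABEL_0" / "LABEL_1") back to integer class ids.
preds = [int(r["label"].split("_")[-1]) for r in classifier(texts)]

print(classification_report(labels, preds, target_names=["Negative (0)", "Positive (1)"]))
```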
Training procedure
Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 6e-05
- train_batch_size: 64
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 10 (early stopping triggered at epoch 3, best epoch 2)
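A minimal sketch of how these hyperparameters map onto `transformers.TrainingArguments`; the output directory, evaluation/save strategies, and best-model selection metric are assumptions not stated on this card (the listed Adam settings are the Transformers defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ro-sentiment",          # assumed placeholder
    learning_rate=6e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=128,
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
    num_train_epochs=10,
    evaluation_strategy="epoch",        # assumed: evaluate once per epoch
    save_strategy="epoch",              # assumed
    load_best_model_at_end=True,        # assumed: "best epoch 2" suggests best-model selection
    metric_for_best_model="eval_loss",  # assumed
)
# Early stopping at epoch 3 would then be added via transformers.EarlyStoppingCallback
# when constructing the Trainer.
```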
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 | F1 Weighted |
|---|---|---|---|---|---|---|---|---|
| 0.4198 | 1.0 | 1629 | 0.3983 | 0.8377 | 0.8791 | 0.8721 | 0.8756 | 0.8380 |
| 0.3861 | 2.0 | 3258 | 0.4312 | 0.8429 | 0.8963 | 0.8665 | 0.8812 | 0.8442 |
| 0.3189 | 3.0 | 4887 | 0.3923 | 0.8307 | 0.8366 | 0.8959 | 0.8652 | 0.8287 |
Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.3
- Tokenizers 0.13.3