scenario-NON-KD-SCR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual

This model is a fine-tuned version of microsoft/mdeberta-v3-base on the cardiffnlp/tweet_sentiment_multilingual dataset. At the final checkpoint (epoch 50, step 23000) it achieves the following results on the evaluation set:

Loss: 6.8615
Accuracy: 0.4904
F1: 0.4901
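
A minimal sketch of loading this checkpoint for inference, assuming it is published on the Hugging Face Hub under the repository ID used below and carries a standard sequence-classification head. The helper names load_classifier and classify are hypothetical, and the label names returned depend on the checkpoint's config, which this card does not document.

```python
# Repository ID as shown on the model page (owner/model-name).
MODEL_ID = (
    "haryoaw/"
    "scenario-NON-KD-SCR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual"
)


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline for the checkpoint.

    transformers is imported lazily so this file can be imported without it.
    Calling this function downloads the weights and needs network access.
    """
    from transformers import pipeline  # needs transformers and torch installed

    return pipeline("text-classification", model=model_id)


def classify(texts, classifier=None):
    """Return one {'label': ..., 'score': ...} dict per input text."""
    classifier = classifier or load_classifier()
    return classifier(list(texts))


if __name__ == "__main__":
    # Hypothetical inputs; the label strings are whatever the config defines.
    for result in classify(["I love this!", "Esto es terrible."]):
        print(result)
```

Given the evaluation accuracy of roughly 0.49 reported above, predictions from this checkpoint should be treated with caution.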

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 32
seed: 66
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
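
The hyperparameters above imply a learning-rate schedule and a training-set size that can be worked out by hand. The sketch below assumes zero warmup steps (the card lists none), a single device, and no gradient accumulation; the total of 23000 optimizer steps is taken from the last row of the training results table. The function name linear_lr is hypothetical and only mirrors the behaviour of a linear-decay schedule.

```python
BASE_LR = 5e-05          # learning_rate
TRAIN_BATCH_SIZE = 32    # train_batch_size
NUM_EPOCHS = 50          # num_epochs
TOTAL_STEPS = 23000      # final step in the training results table
WARMUP_STEPS = 0         # assumption: the card does not list any warmup


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 23000 steps over 50 epochs gives 460 optimizer steps per epoch.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# With batches of 32, 460 steps cover at most 14720 training examples
# (the last batch of an epoch may be smaller).
max_train_examples = steps_per_epoch * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    for step in (0, 500, 11500, 23000):
        print(step, linear_lr(step))
    print(steps_per_epoch, max_train_examples)
```

As a consistency check, 500 / 460 is about 1.0870, which matches the epoch value logged at step 500 in the training results.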

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.0475	1.0870	500	1.0371	0.4985	0.4949
0.7462	2.1739	1000	1.2759	0.5123	0.5122
0.421	3.2609	1500	1.6791	0.5139	0.5126
0.2321	4.3478	2000	2.1227	0.4946	0.4940
0.1534	5.4348	2500	2.4070	0.4958	0.4966
0.0987	6.5217	3000	2.8761	0.4904	0.4900
0.0734	7.6087	3500	2.8613	0.4911	0.4881
0.0697	8.6957	4000	3.5593	0.4969	0.4932
0.0586	9.7826	4500	3.4005	0.4900	0.4883
0.0462	10.8696	5000	3.6698	0.4861	0.4866
0.0321	11.9565	5500	4.1118	0.4877	0.4883
0.0267	13.0435	6000	4.1028	0.4965	0.4959
0.0257	14.1304	6500	4.3167	0.4842	0.4815
0.0185	15.2174	7000	4.3273	0.4923	0.4876
0.0178	16.3043	7500	4.7543	0.4958	0.4959
0.0149	17.3913	8000	4.3035	0.4927	0.4929
0.0125	18.4783	8500	4.5842	0.4904	0.4884
0.0116	19.5652	9000	5.3172	0.4853	0.4833
0.0114	20.6522	9500	4.8280	0.4857	0.4825
0.0036	21.7391	10000	5.6275	0.4850	0.4820
0.0094	22.8261	10500	5.1559	0.4842	0.4815
0.0054	23.9130	11000	5.3889	0.4846	0.4826
0.0085	25.0	11500	4.8587	0.4888	0.4861
0.0068	26.0870	12000	5.3553	0.4896	0.4881
0.0054	27.1739	12500	5.3446	0.4853	0.4845
0.0042	28.2609	13000	5.3437	0.4838	0.4832
0.003	29.3478	13500	5.9054	0.4796	0.4784
0.0032	30.4348	14000	5.7871	0.4884	0.4881
0.0038	31.5217	14500	5.9122	0.4803	0.4787
0.0041	32.6087	15000	5.4601	0.4834	0.4786
0.0025	33.6957	15500	5.1979	0.4884	0.4853
0.0018	34.7826	16000	5.5286	0.4896	0.4869
0.0006	35.8696	16500	5.7718	0.4877	0.4859
0.0015	36.9565	17000	6.0193	0.4834	0.4832
0.0003	38.0435	17500	6.2210	0.4838	0.4828
0.0004	39.1304	18000	6.3234	0.4880	0.4879
0.0002	40.2174	18500	6.3829	0.4888	0.4885
0.0001	41.3043	19000	6.5514	0.4892	0.4889
0.0001	42.3913	19500	6.6261	0.4892	0.4891
0.0003	43.4783	20000	6.6971	0.4861	0.4849
0.0013	44.5652	20500	6.7077	0.4865	0.4849
0.0001	45.6522	21000	6.7350	0.4911	0.4903
0.0001	46.7391	21500	6.7889	0.4896	0.4888
0.0002	47.8261	22000	6.8318	0.4900	0.4902
0.0006	48.9130	22500	6.8526	0.4904	0.4901
0.0001	50.0	23000	6.8615	0.4904	0.4901
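
The table shows training loss falling to 0.0001 while validation loss climbs from 1.0371 to 6.8615, a pattern consistent with overfitting, so the final checkpoint is not the strongest one logged. A small sketch of how the best rows could be picked out follows; it embeds only the first five rows and the last row of the table, and the names EvalRow and parse_rows are hypothetical.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


# Subset of the training results table: first five rows and the final row.
ROWS_TEXT = """
1.0475 1.0870 500 1.0371 0.4985 0.4949
0.7462 2.1739 1000 1.2759 0.5123 0.5122
0.421 3.2609 1500 1.6791 0.5139 0.5126
0.2321 4.3478 2000 2.1227 0.4946 0.4940
0.1534 5.4348 2500 2.4070 0.4958 0.4966
0.0001 50.0 23000 6.8615 0.4904 0.4901
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows into EvalRow records."""
    rows = []
    for line in text.strip().splitlines():
        tl, ep, st, vl, acc, f1 = line.split()
        rows.append(EvalRow(float(tl), float(ep), int(st),
                            float(vl), float(acc), float(f1)))
    return rows


rows = parse_rows(ROWS_TEXT)
best_by_val_loss = min(rows, key=lambda r: r.val_loss)   # lowest validation loss
best_by_accuracy = max(rows, key=lambda r: r.accuracy)   # highest accuracy
final_row = rows[-1]

if __name__ == "__main__":
    print("lowest validation loss at step", best_by_val_loss.step)
    print("highest accuracy at step", best_by_accuracy.step)
    print("final checkpoint accuracy", final_row.accuracy)
```

Over the full table the same selection gives step 500 for lowest validation loss and step 1500 for highest accuracy (0.5139) and F1 (0.5126). Whether the original run kept those checkpoints is not stated in this card; in the Transformers Trainer, options such as load_best_model_at_end and metric_for_best_model exist for that purpose.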

Framework versions

Transformers 4.44.2
PyTorch 2.1.1+cu121
Datasets 2.14.5
Tokenizers 0.19.1

haryoaw/scenario-NON-KD-SCR-COPY-CDF-ALL-D2_data-cardiffnlp_tweet_sentiment_multilingual
