agentlans committed on
Commit b177cf2
1 Parent(s): c861078

Upload 13 files

README.md CHANGED
@@ -4,58 +4,99 @@ license: mit
  base_model: agentlans/deberta-v3-xsmall-zyda-2
  tags:
  - generated_from_trainer
  model-index:
- - name: deberta-v3-xsmall-zyda-2-sentiment
  results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # deberta-v3-xsmall-zyda-2-sentiment

- This model is a fine-tuned version of [agentlans/deberta-v3-xsmall-zyda-2](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.0493
- - Mse: 0.0493

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 5e-05
- - train_batch_size: 64
- - eval_batch_size: 8
- - seed: 42
- - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: linear
- - num_epochs: 3.0

- ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Mse |
- |:-------------:|:-----:|:----:|:---------------:|:------:|
- | 0.0627 | 1.0 | 3143 | 0.0665 | 0.0665 |
- | 0.0411 | 2.0 | 6286 | 0.0493 | 0.0493 |
- | 0.0321 | 3.0 | 9429 | 0.0524 | 0.0524 |

- ### Framework versions

- - Transformers 4.46.3
- - Pytorch 2.5.1+cu124
- - Datasets 3.1.0
- - Tokenizers 0.20.3
  base_model: agentlans/deberta-v3-xsmall-zyda-2
  tags:
  - generated_from_trainer
+ - sentiment-analysis
+ - twitter-sentiment
  model-index:
+ - name: deberta-v3-xsmall-zyda-2-transformed-sentiment-new
  results: []
  ---

+ # DeBERTa-v3-XSmall Sentiment Analysis Model
+
+ ## Model Overview
+
+ This model is a fine-tuned version of [agentlans/deberta-v3-xsmall-zyda-2](https://huggingface.co/agentlans/deberta-v3-xsmall-zyda-2) optimized for sentiment analysis on Twitter data. It achieves the following results on the evaluation set:
+
+ - Loss: 0.0656
+ - MSE: 0.0656
+
+ ## Dataset
+
+ The model was trained on the [Twitter Sentiment Meta-Analysis Dataset](https://huggingface.co/datasets/agentlans/twitter-sentiment-meta-analysis).
+
+ ### Dataset Description
+
+ This dataset contains sentiment analysis results for English tweets collected between September 2009 and January 2010. Each tweet was scored by 10 different sentiment classifiers, and the final sentiment score was derived from those classifier outputs by principal component analysis (PCA); a toy sketch of that reduction follows the list below.
+
+ - **Source**: Cheng-Caverlee-Lee Twitter Scrape (Sept 2009 - Jan 2010)
+ - **Size**: 138,690 tweets
+ - **Language**: English only (filtered using langdetect)
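+
+ As a toy illustration only (not the dataset's actual pipeline), the PCA reduction might look like the following, assuming one row of classifier scores per tweet:
+
+ ```python
+ # Hypothetical sketch: collapse the scores of several sentiment
+ # classifiers into one composite score via the first principal
+ # component. Shapes, scaling, and random data are assumptions.
+ import numpy as np
+ from sklearn.decomposition import PCA
+ from sklearn.preprocessing import StandardScaler
+
+ rng = np.random.default_rng(0)
+ scores = rng.random((1000, 10))  # placeholder: 1000 tweets x 10 classifiers
+
+ pca = PCA(n_components=1)
+ composite = pca.fit_transform(StandardScaler().fit_transform(scores))
+ print(composite.shape)  # (1000, 1): one composite sentiment score per tweet
+ ```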
+
+ ## Usage
+
+ Here's an example of how to use the model for sentiment prediction:
+
+ ```python
+ import torch
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ # Load model and tokenizer
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ model_name = "agentlans/deberta-v3-xsmall-zyda-2-sentiment"
+ model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=1).to(device)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ # Function to perform inference
+ def predict_score(text):
+     inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True).to(device)
+     with torch.no_grad():
+         logits = model(**inputs).logits
+     return logits.item()
+
+ # Example usage
+ input_text = "I accidentally the whole thing. Is that bad?"
+ score = predict_score(input_text)
+ print(f"Predicted score: {score}")
+ ```
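+
+ For scoring many texts at once, a batched variant can cut per-call overhead. This is a hypothetical extension (not part of the original card) that reuses the `model`, `tokenizer`, and `device` defined above:
+
+ ```python
+ # Hypothetical batched scoring helper; processes texts in chunks so
+ # padding stays short and GPU memory stays bounded.
+ def predict_scores(texts, batch_size=32):
+     results = []
+     for i in range(0, len(texts), batch_size):
+         batch = texts[i:i + batch_size]
+         inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True).to(device)
+         with torch.no_grad():
+             logits = model(**inputs).logits  # shape: (batch, 1)
+         results.extend(logits.squeeze(-1).tolist())
+     return results
+ ```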
+
+ ## Example Outputs
+
+ | Text | Sentiment |
+ |------|----------:|
+ | Nothing seems to go right, and I'm constantly frustrated. | -2.25 |
+ | Everything is falling apart, and I can't see any way out. | -2.02 |
+ | I feel completely overwhelmed by the challenges I face. | -1.62 |
+ | There are some minor improvements, but overall, things are still tough. | -0.81 |
+ | I can see a glimmer of hope amidst the difficulties I encounter. | 1.03 |
+ | Things are starting to look up, and I'm cautiously optimistic. | 2.06 |
+ | There are many good things happening, and I appreciate them. | 2.23 |
+ | I'm feeling more positive about my situation than I have in a while. | 2.39 |
+ | Every day brings new joy and possibilities; I feel truly blessed. | 2.54 |
+ | Life is full of opportunities, and I'm excited about the future. | 2.56 |
+
+ ## Training Procedure
+
+ ### Hyperparameters
+
+ The following hyperparameters were used during training (a configuration sketch follows the list):
+
+ - Learning rate: 5e-05
+ - Train batch size: 64
+ - Eval batch size: 8
+ - Seed: 42
+ - Optimizer: AdamW with betas=(0.9, 0.999) and epsilon=1e-08
+ - LR scheduler: Linear
+ - Number of epochs: 3.0
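+
+ As a rough sketch, these settings might map onto `transformers.TrainingArguments` as below; the output directory name and the dataset wiring are assumptions, not the card's verified training script:
+
+ ```python
+ # Hypothetical reconstruction of the training configuration from the
+ # hyperparameter list above; the AdamW betas and epsilon match the stated
+ # values because they are the adamw_torch defaults.
+ from transformers import TrainingArguments
+
+ args = TrainingArguments(
+     output_dir="deberta-v3-xsmall-zyda-2-sentiment",  # assumed name
+     learning_rate=5e-5,
+     per_device_train_batch_size=64,
+     per_device_eval_batch_size=8,
+     seed=42,
+     optim="adamw_torch",
+     lr_scheduler_type="linear",
+     num_train_epochs=3.0,
+ )
+ # Trainer(model=model, args=args, train_dataset=..., eval_dataset=...).train()
+ ```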
+
+ ### Training Results
+
+ | Training Loss | Epoch | Step | Validation Loss | MSE |
+ |:-------------:|:-----:|:----:|:---------------:|:------:|
+ | 0.0792 | 1.0 | 2011 | 0.0871 | 0.0871 |
+ | 0.0541 | 2.0 | 4022 | 0.0691 | 0.0691 |
+ | 0.0411 | 3.0 | 6033 | 0.0656 | 0.0656 |
+
+ ## Framework Versions
+
+ - Transformers: 4.46.3
+ - PyTorch: 2.5.1+cu124
+ - Datasets: 3.1.0
+ - Tokenizers: 0.20.3
all_results.json CHANGED
@@ -1,15 +1,15 @@
  {
  "epoch": 3.0,
- "eval_loss": 0.04927213117480278,
- "eval_mse": 0.049272132016595305,
- "eval_runtime": 10.4326,
  "eval_samples": 10000,
- "eval_samples_per_second": 958.536,
- "eval_steps_per_second": 119.817,
- "total_flos": 9935679003367680.0,
- "train_loss": 0.05866297316179509,
- "train_runtime": 1207.603,
- "train_samples": 201105,
- "train_samples_per_second": 499.597,
- "train_steps_per_second": 7.808
  }

  {
  "epoch": 3.0,
+ "eval_loss": 0.06556913256645203,
+ "eval_mse": 0.06556913494220615,
+ "eval_runtime": 13.1744,
  "eval_samples": 10000,
+ "eval_samples_per_second": 759.049,
+ "eval_steps_per_second": 94.881,
+ "total_flos": 6357984788759040.0,
+ "train_loss": 0.07220485827706652,
+ "train_runtime": 846.782,
+ "train_samples": 128690,
+ "train_samples_per_second": 455.926,
+ "train_steps_per_second": 7.125
  }
eval_results.json CHANGED
@@ -1,9 +1,9 @@
  {
  "epoch": 3.0,
- "eval_loss": 0.04927213117480278,
- "eval_mse": 0.049272132016595305,
- "eval_runtime": 10.4326,
  "eval_samples": 10000,
- "eval_samples_per_second": 958.536,
- "eval_steps_per_second": 119.817
  }

  {
  "epoch": 3.0,
+ "eval_loss": 0.06556913256645203,
+ "eval_mse": 0.06556913494220615,
+ "eval_runtime": 13.1744,
  "eval_samples": 10000,
+ "eval_samples_per_second": 759.049,
+ "eval_steps_per_second": 94.881
  }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:64bb0296f7b370ce35b0666cd7f26fb0bb06c64245a16e871885925e52a90f49
  size 283345892

  version https://git-lfs.github.com/spec/v1
+ oid sha256:2ca36fde7f77cd9138373636d634d704dc626ed3f64e5adca78c6790760099f0
  size 283345892
train_results.json CHANGED
@@ -1,9 +1,9 @@
  {
  "epoch": 3.0,
- "total_flos": 9935679003367680.0,
- "train_loss": 0.05866297316179509,
- "train_runtime": 1207.603,
- "train_samples": 201105,
- "train_samples_per_second": 499.597,
- "train_steps_per_second": 7.808
  }

  {
  "epoch": 3.0,
+ "total_flos": 6357984788759040.0,
+ "train_loss": 0.07220485827706652,
+ "train_runtime": 846.782,
+ "train_samples": 128690,
+ "train_samples_per_second": 455.926,
+ "train_steps_per_second": 7.125
  }
trainer_state.json CHANGED
@@ -1,178 +1,136 @@
  {
- "best_metric": 0.04927213117480278,
- "best_model_checkpoint": "deberta-v3-xsmall-zyda-2-sentiment/checkpoint-6286",
  "epoch": 3.0,
  "eval_steps": 500,
- "global_step": 9429,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
  {
- "epoch": 0.1590836780146357,
- "grad_norm": 1.8468247652053833,
- "learning_rate": 4.7348605366422736e-05,
- "loss": 0.2115,
  "step": 500
  },
  {
- "epoch": 0.3181673560292714,
- "grad_norm": 1.7370903491973877,
- "learning_rate": 4.4697210732845476e-05,
- "loss": 0.101,
  "step": 1000
  },
  {
- "epoch": 0.4772510340439071,
- "grad_norm": 1.7206146717071533,
- "learning_rate": 4.2045816099268216e-05,
- "loss": 0.0846,
  "step": 1500
  },
  {
- "epoch": 0.6363347120585428,
- "grad_norm": 1.1373802423477173,
- "learning_rate": 3.9394421465690956e-05,
- "loss": 0.0748,
  "step": 2000
  },
  {
- "epoch": 0.7954183900731785,
- "grad_norm": 0.9603880047798157,
- "learning_rate": 3.674302683211369e-05,
- "loss": 0.0691,
  "step": 2500
  },
  {
- "epoch": 0.9545020680878142,
- "grad_norm": 1.0165342092514038,
- "learning_rate": 3.409163219853643e-05,
- "loss": 0.0627,
  "step": 3000
  },
  {
- "epoch": 1.0,
- "eval_loss": 0.06652908027172089,
- "eval_mse": 0.06652908171153529,
- "eval_runtime": 10.5244,
- "eval_samples_per_second": 950.17,
- "eval_steps_per_second": 118.771,
- "step": 3143
- },
- {
- "epoch": 1.1135857461024499,
- "grad_norm": 0.9926055073738098,
- "learning_rate": 3.144023756495917e-05,
- "loss": 0.0522,
  "step": 3500
  },
  {
- "epoch": 1.2726694241170855,
- "grad_norm": 1.247205376625061,
- "learning_rate": 2.878884293138191e-05,
- "loss": 0.0485,
  "step": 4000
  },
  {
- "epoch": 1.4317531021317214,
- "grad_norm": 1.7589031457901,
- "learning_rate": 2.6137448297804644e-05,
- "loss": 0.0463,
  "step": 4500
  },
  {
- "epoch": 1.590836780146357,
- "grad_norm": 0.7484694719314575,
- "learning_rate": 2.3486053664227384e-05,
- "loss": 0.0443,
  "step": 5000
  },
  {
- "epoch": 1.7499204581609926,
- "grad_norm": 1.5068027973175049,
- "learning_rate": 2.083465903065012e-05,
- "loss": 0.0421,
  "step": 5500
  },
  {
- "epoch": 1.9090041361756285,
- "grad_norm": 0.832625150680542,
- "learning_rate": 1.818326439707286e-05,
  "loss": 0.0411,
  "step": 6000
  },
- {
- "epoch": 2.0,
- "eval_loss": 0.04927213117480278,
- "eval_mse": 0.049272132016595305,
- "eval_runtime": 11.3101,
- "eval_samples_per_second": 884.162,
- "eval_steps_per_second": 110.52,
- "step": 6286
- },
- {
- "epoch": 2.068087814190264,
- "grad_norm": 0.6708300709724426,
- "learning_rate": 1.5531869763495598e-05,
- "loss": 0.0387,
- "step": 6500
- },
- {
- "epoch": 2.2271714922048997,
- "grad_norm": 0.6490187644958496,
- "learning_rate": 1.2880475129918337e-05,
- "loss": 0.0337,
- "step": 7000
- },
- {
- "epoch": 2.3862551702195356,
- "grad_norm": 0.7127770185470581,
- "learning_rate": 1.0229080496341075e-05,
- "loss": 0.0324,
- "step": 7500
- },
- {
- "epoch": 2.545338848234171,
- "grad_norm": 0.6604452133178711,
- "learning_rate": 7.5776858627638146e-06,
- "loss": 0.0326,
- "step": 8000
- },
- {
- "epoch": 2.704422526248807,
- "grad_norm": 0.5042712092399597,
- "learning_rate": 4.926291229186552e-06,
- "loss": 0.0323,
- "step": 8500
- },
- {
- "epoch": 2.8635062042634427,
- "grad_norm": 0.573316752910614,
- "learning_rate": 2.2748965956092908e-06,
- "loss": 0.0321,
- "step": 9000
- },
  {
  "epoch": 3.0,
- "eval_loss": 0.05235280096530914,
- "eval_mse": 0.05235280389813637,
- "eval_runtime": 10.3984,
- "eval_samples_per_second": 961.689,
- "eval_steps_per_second": 120.211,
- "step": 9429
  },
  {
  "epoch": 3.0,
- "step": 9429,
- "total_flos": 9935679003367680.0,
- "train_loss": 0.05866297316179509,
- "train_runtime": 1207.603,
- "train_samples_per_second": 499.597,
- "train_steps_per_second": 7.808
  }
  ],
  "logging_steps": 500,
- "max_steps": 9429,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 3,
  "save_steps": 500,
@@ -188,7 +146,7 @@
  "attributes": {}
  }
  },
- "total_flos": 9935679003367680.0,
  "train_batch_size": 64,
  "trial_name": null,
  "trial_params": null

  {
+ "best_metric": 0.06556913256645203,
+ "best_model_checkpoint": "deberta-v3-xsmall-zyda-2-transformed-sentiment-new/checkpoint-6033",
  "epoch": 3.0,
  "eval_steps": 500,
+ "global_step": 6033,
  "is_hyper_param_search": false,
  "is_local_process_zero": true,
  "is_world_process_zero": true,
  "log_history": [
  {
+ "epoch": 0.2486325211337643,
+ "grad_norm": 2.0000367164611816,
+ "learning_rate": 4.5856124647770596e-05,
+ "loss": 0.2003,
  "step": 500
  },
  {
+ "epoch": 0.4972650422675286,
+ "grad_norm": 2.3387935161590576,
+ "learning_rate": 4.17122492955412e-05,
+ "loss": 0.1052,
  "step": 1000
  },
  {
+ "epoch": 0.7458975634012929,
+ "grad_norm": 1.853918194770813,
+ "learning_rate": 3.7568373943311785e-05,
+ "loss": 0.085,
  "step": 1500
  },
  {
+ "epoch": 0.9945300845350572,
+ "grad_norm": 1.7671293020248413,
+ "learning_rate": 3.342449859108238e-05,
+ "loss": 0.0792,
  "step": 2000
  },
  {
+ "epoch": 1.0,
+ "eval_loss": 0.08709739148616791,
+ "eval_mse": 0.08709739712527088,
+ "eval_runtime": 14.8419,
+ "eval_samples_per_second": 673.767,
+ "eval_steps_per_second": 84.221,
+ "step": 2011
+ },
+ {
+ "epoch": 1.2431626056688214,
+ "grad_norm": 1.0026581287384033,
+ "learning_rate": 2.928062323885298e-05,
+ "loss": 0.0594,
  "step": 2500
  },
  {
+ "epoch": 1.4917951268025857,
+ "grad_norm": 0.9303980469703674,
+ "learning_rate": 2.5136747886623573e-05,
+ "loss": 0.0594,
  "step": 3000
  },
  {
+ "epoch": 1.74042764793635,
+ "grad_norm": 1.7368980646133423,
+ "learning_rate": 2.0992872534394168e-05,
+ "loss": 0.0551,
  "step": 3500
  },
  {
+ "epoch": 1.9890601690701144,
+ "grad_norm": 0.6475295424461365,
+ "learning_rate": 1.684899718216476e-05,
+ "loss": 0.0541,
  "step": 4000
  },
  {
+ "epoch": 2.0,
+ "eval_loss": 0.06912554055452347,
+ "eval_mse": 0.06912553393413896,
+ "eval_runtime": 13.2293,
+ "eval_samples_per_second": 755.898,
+ "eval_steps_per_second": 94.487,
+ "step": 4022
+ },
+ {
+ "epoch": 2.2376926902038785,
+ "grad_norm": 0.6805059909820557,
+ "learning_rate": 1.2705121829935357e-05,
+ "loss": 0.0444,
  "step": 4500
  },
  {
+ "epoch": 2.486325211337643,
+ "grad_norm": 1.3735737800598145,
+ "learning_rate": 8.56124647770595e-06,
+ "loss": 0.043,
  "step": 5000
  },
  {
+ "epoch": 2.734957732471407,
+ "grad_norm": 0.9396611452102661,
+ "learning_rate": 4.417371125476545e-06,
+ "loss": 0.0422,
  "step": 5500
  },
  {
+ "epoch": 2.9835902536051715,
+ "grad_norm": 0.756208062171936,
+ "learning_rate": 2.7349577324714074e-07,
  "loss": 0.0411,
  "step": 6000
  },
  {
  "epoch": 3.0,
+ "eval_loss": 0.06556913256645203,
+ "eval_mse": 0.06556913494220615,
+ "eval_runtime": 13.2288,
+ "eval_samples_per_second": 755.924,
+ "eval_steps_per_second": 94.491,
+ "step": 6033
  },
  {
  "epoch": 3.0,
+ "step": 6033,
+ "total_flos": 6357984788759040.0,
+ "train_loss": 0.07220485827706652,
+ "train_runtime": 846.782,
+ "train_samples_per_second": 455.926,
+ "train_steps_per_second": 7.125
  }
  ],
  "logging_steps": 500,
+ "max_steps": 6033,
  "num_input_tokens_seen": 0,
  "num_train_epochs": 3,
  "save_steps": 500,
  "attributes": {}
  }
  },
+ "total_flos": 6357984788759040.0,
  "train_batch_size": 64,
  "trial_name": null,
  "trial_params": null
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:4a3b335a637ee1abd3e5da0d4f9e8ac74f5cd424a9c44c9efbe15e454133d934
  size 5368

  version https://git-lfs.github.com/spec/v1
+ oid sha256:9c472a73a883ba5245b32b70e114642c495e951ce29acca84c258c8a402b2a81
  size 5368