Initial Commit

Files changed (5)

README.md +116 -0
config.json +159 -0
eval_results_ml.json +1 -0
pytorch_model.bin +3 -0
training_args.bin +3 -0

README.md ADDED

	@@ -0,0 +1,116 @@

+---
+license: mit
+base_model: haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1
+tags:
+- generated_from_trainer
+datasets:
+- massive
+metrics:
+- accuracy
+- f1
+model-index:
+- name: scenario-KD-PO-MSV-CL-D2_data-cl-massive_all_1_166
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# scenario-KD-PO-MSV-CL-D2_data-cl-massive_all_1_166
+This model is a fine-tuned version of [haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1](https://huggingface.co/haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1) on the massive dataset.
+It achieves the following results on the evaluation set:
+- Loss: 6.0186
+- Accuracy: 0.6461
+- F1: 0.6134
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 32
+- eval_batch_size: 32
+- seed: 66
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 30
+### Training results
+| Training Loss | Epoch | Step   | Validation Loss | Accuracy | F1     |
+|:-------------:|:-----:|:------:|:---------------:|:--------:|:------:|
+| 2.2524        | 0.56  | 5000   | 5.6187          | 0.6299   | 0.5728 |
+| 1.3325        | 1.11  | 10000  | 5.4671          | 0.6450   | 0.5924 |
+| 1.2156        | 1.67  | 15000  | 6.0747          | 0.6250   | 0.5912 |
+| 0.8855        | 2.22  | 20000  | 5.8471          | 0.6355   | 0.5857 |
+| 0.8518        | 2.78  | 25000  | 6.2545          | 0.6303   | 0.5845 |
+| 0.6853        | 3.33  | 30000  | 6.0057          | 0.6408   | 0.6017 |
+| 0.6658        | 3.89  | 35000  | 6.0161          | 0.6423   | 0.6002 |
+| 0.5544        | 4.45  | 40000  | 6.0854          | 0.6392   | 0.6006 |
+| 0.5357        | 5.0   | 45000  | 6.2732          | 0.6283   | 0.5888 |
+| 0.4924        | 5.56  | 50000  | 6.4624          | 0.6277   | 0.5952 |
+| 0.4369        | 6.11  | 55000  | 6.2119          | 0.6354   | 0.5944 |
+| 0.4276        | 6.67  | 60000  | 6.2395          | 0.6425   | 0.6006 |
+| 0.3974        | 7.23  | 65000  | 6.6542          | 0.6264   | 0.5893 |
+| 0.404         | 7.78  | 70000  | 6.4174          | 0.6295   | 0.5975 |
+| 0.3763        | 8.34  | 75000  | 6.1405          | 0.6426   | 0.6025 |
+| 0.3719        | 8.89  | 80000  | 6.4745          | 0.6346   | 0.6024 |
+| 0.3428        | 9.45  | 85000  | 5.9964          | 0.6389   | 0.6030 |
+| 0.3288        | 10.0  | 90000  | 6.3213          | 0.6335   | 0.5988 |
+| 0.3192        | 10.56 | 95000  | 6.4269          | 0.6321   | 0.5937 |
+| 0.2934        | 11.12 | 100000 | 6.3224          | 0.6392   | 0.6039 |
+| 0.3054        | 11.67 | 105000 | 6.4531          | 0.6326   | 0.5989 |
+| 0.2841        | 12.23 | 110000 | 6.2824          | 0.6360   | 0.6075 |
+| 0.2915        | 12.78 | 115000 | 6.1928          | 0.6391   | 0.6039 |
+| 0.274         | 13.34 | 120000 | 6.1931          | 0.6401   | 0.6030 |
+| 0.2776        | 13.9  | 125000 | 6.2524          | 0.6384   | 0.6045 |
+| 0.2724        | 14.45 | 130000 | 5.9260          | 0.6456   | 0.6090 |
+| 0.2602        | 15.01 | 135000 | 6.3508          | 0.6347   | 0.6052 |
+| 0.2627        | 15.56 | 140000 | 6.1761          | 0.6421   | 0.6074 |
+| 0.2496        | 16.12 | 145000 | 6.1398          | 0.6391   | 0.6111 |
+| 0.253         | 16.67 | 150000 | 6.2431          | 0.6328   | 0.6014 |
+| 0.2451        | 17.23 | 155000 | 6.1746          | 0.6378   | 0.6048 |
+| 0.2369        | 17.79 | 160000 | 6.0915          | 0.6435   | 0.6103 |
+| 0.2332        | 18.34 | 165000 | 6.2138          | 0.6376   | 0.6071 |
+| 0.2325        | 18.9  | 170000 | 6.1176          | 0.6433   | 0.6073 |
+| 0.2239        | 19.45 | 175000 | 5.9650          | 0.6419   | 0.6068 |
+| 0.2229        | 20.01 | 180000 | 6.2025          | 0.6395   | 0.6072 |
+| 0.2241        | 20.56 | 185000 | 6.0510          | 0.6418   | 0.6088 |
+| 0.212         | 21.12 | 190000 | 5.9952          | 0.6438   | 0.6100 |
+| 0.218         | 21.68 | 195000 | 6.2810          | 0.6376   | 0.6073 |
+| 0.212         | 22.23 | 200000 | 5.9274          | 0.6454   | 0.6076 |
+| 0.2091        | 22.79 | 205000 | 6.1958          | 0.6367   | 0.6071 |
+| 0.2091        | 23.34 | 210000 | 5.9633          | 0.6463   | 0.6153 |
+| 0.2065        | 23.9  | 215000 | 6.0132          | 0.6458   | 0.6116 |
+| 0.2048        | 24.46 | 220000 | 5.9809          | 0.6451   | 0.6132 |
+| 0.1996        | 25.01 | 225000 | 6.1021          | 0.6389   | 0.6063 |
+| 0.1966        | 25.57 | 230000 | 5.9612          | 0.6448   | 0.6140 |
+| 0.1964        | 26.12 | 235000 | 6.0715          | 0.6434   | 0.6134 |
+| 0.1971        | 26.68 | 240000 | 6.0237          | 0.6442   | 0.6127 |
+| 0.1893        | 27.23 | 245000 | 6.0213          | 0.6418   | 0.6086 |
+| 0.1891        | 27.79 | 250000 | 6.0386          | 0.6445   | 0.6127 |
+| 0.1942        | 28.35 | 255000 | 6.0043          | 0.6428   | 0.6099 |
+| 0.1966        | 28.9  | 260000 | 5.9983          | 0.6440   | 0.6130 |
+| 0.1883        | 29.46 | 265000 | 6.0186          | 0.6461   | 0.6134 |
+### Framework versions
+- Transformers 4.33.3
+- Pytorch 2.1.1+cu121
+- Datasets 2.14.5
+- Tokenizers 0.13.3
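
The README above lists a linear scheduler, a peak learning rate of 5e-05 and 30 epochs, and its results table logs step 90000 at epoch 10.0, which suggests roughly 9,000 optimizer steps per epoch. The sketch below works through that arithmetic in plain Python. It assumes no warmup (none is listed) and a decay to zero over the whole run; `linear_lr` and the constant names are hypothetical helpers, not part of the training code.

```python
# Values copied from the model card; everything else here is illustrative.
BASE_LR = 5e-05
NUM_EPOCHS = 30

# The results table logs step 90000 at epoch 10.0, so one epoch is about 9000 steps.
STEPS_PER_EPOCH = round(90000 / 10.0)
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption: the model card does not list a warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 90000, 265000):
        print(step, linear_lr(step))
```

Under these assumptions the last logged row (step 265000) sits close to the end of the estimated 270000-step schedule, so the learning rate there would already be near zero.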

config.json ADDED

	@@ -0,0 +1,159 @@

+{
+  "_name_or_path": "haryoaw/scenario-MDBT-TCR_data-cl-massive_all_1_1",
+  "architectures": [
+    "DebertaForSequenceClassificationKD"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
+  "id2label": {
+    "0": "LABEL_0",
+    "1": "LABEL_1",
+    "2": "LABEL_2",
+    "3": "LABEL_3",
+    "4": "LABEL_4",
+    "5": "LABEL_5",
+    "6": "LABEL_6",
+    "7": "LABEL_7",
+    "8": "LABEL_8",
+    "9": "LABEL_9",
+    "10": "LABEL_10",
+    "11": "LABEL_11",
+    "12": "LABEL_12",
+    "13": "LABEL_13",
+    "14": "LABEL_14",
+    "15": "LABEL_15",
+    "16": "LABEL_16",
+    "17": "LABEL_17",
+    "18": "LABEL_18",
+    "19": "LABEL_19",
+    "20": "LABEL_20",
+    "21": "LABEL_21",
+    "22": "LABEL_22",
+    "23": "LABEL_23",
+    "24": "LABEL_24",
+    "25": "LABEL_25",
+    "26": "LABEL_26",
+    "27": "LABEL_27",
+    "28": "LABEL_28",
+    "29": "LABEL_29",
+    "30": "LABEL_30",
+    "31": "LABEL_31",
+    "32": "LABEL_32",
+    "33": "LABEL_33",
+    "34": "LABEL_34",
+    "35": "LABEL_35",
+    "36": "LABEL_36",
+    "37": "LABEL_37",
+    "38": "LABEL_38",
+    "39": "LABEL_39",
+    "40": "LABEL_40",
+    "41": "LABEL_41",
+    "42": "LABEL_42",
+    "43": "LABEL_43",
+    "44": "LABEL_44",
+    "45": "LABEL_45",
+    "46": "LABEL_46",
+    "47": "LABEL_47",
+    "48": "LABEL_48",
+    "49": "LABEL_49",
+    "50": "LABEL_50",
+    "51": "LABEL_51",
+    "52": "LABEL_52",
+    "53": "LABEL_53",
+    "54": "LABEL_54",
+    "55": "LABEL_55",
+    "56": "LABEL_56",
+    "57": "LABEL_57",
+    "58": "LABEL_58",
+    "59": "LABEL_59"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 3072,
+  "label2id": {
+    "LABEL_0": 0,
+    "LABEL_1": 1,
+    "LABEL_10": 10,
+    "LABEL_11": 11,
+    "LABEL_12": 12,
+    "LABEL_13": 13,
+    "LABEL_14": 14,
+    "LABEL_15": 15,
+    "LABEL_16": 16,
+    "LABEL_17": 17,
+    "LABEL_18": 18,
+    "LABEL_19": 19,
+    "LABEL_2": 2,
+    "LABEL_20": 20,
+    "LABEL_21": 21,
+    "LABEL_22": 22,
+    "LABEL_23": 23,
+    "LABEL_24": 24,
+    "LABEL_25": 25,
+    "LABEL_26": 26,
+    "LABEL_27": 27,
+    "LABEL_28": 28,
+    "LABEL_29": 29,
+    "LABEL_3": 3,
+    "LABEL_30": 30,
+    "LABEL_31": 31,
+    "LABEL_32": 32,
+    "LABEL_33": 33,
+    "LABEL_34": 34,
+    "LABEL_35": 35,
+    "LABEL_36": 36,
+    "LABEL_37": 37,
+    "LABEL_38": 38,
+    "LABEL_39": 39,
+    "LABEL_4": 4,
+    "LABEL_40": 40,
+    "LABEL_41": 41,
+    "LABEL_42": 42,
+    "LABEL_43": 43,
+    "LABEL_44": 44,
+    "LABEL_45": 45,
+    "LABEL_46": 46,
+    "LABEL_47": 47,
+    "LABEL_48": 48,
+    "LABEL_49": 49,
+    "LABEL_5": 5,
+    "LABEL_50": 50,
+    "LABEL_51": 51,
+    "LABEL_52": 52,
+    "LABEL_53": 53,
+    "LABEL_54": 54,
+    "LABEL_55": 55,
+    "LABEL_56": 56,
+    "LABEL_57": 57,
+    "LABEL_58": 58,
+    "LABEL_59": 59,
+    "LABEL_6": 6,
+    "LABEL_7": 7,
+    "LABEL_8": 8,
+    "LABEL_9": 9
+  },
+  "layer_norm_eps": 1e-07,
+  "max_position_embeddings": 512,
+  "max_relative_positions": -1,
+  "model_type": "deberta-v2",
+  "norm_rel_ebd": "layer_norm",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 6,
+  "pad_token_id": 0,
+  "pooler_dropout": 0,
+  "pooler_hidden_act": "gelu",
+  "pooler_hidden_size": 768,
+  "pos_att_type": [
+    "p2c",
+    "c2p"
+  ],
+  "position_biased_input": false,
+  "position_buckets": 256,
+  "relative_attention": true,
+  "share_att_key": true,
+  "torch_dtype": "float32",
+  "transformers_version": "4.33.3",
+  "type_vocab_size": 0,
+  "vocab_size": 251000
+}
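
The `id2label` and `label2id` maps in the config above carry 60 placeholder names (`LABEL_0` to `LABEL_59`); `label2id` only looks shuffled because its keys are sorted as strings. A small consistency check can confirm that the two maps are inverses of each other. The sketch below runs on a dict rebuilt in the same shape rather than on the real file, and `check_label_maps` is a hypothetical helper.

```python
import json


def check_label_maps(config):
    """Return the number of labels if id2label and label2id are exact inverses."""
    # JSON object keys are always strings, so convert the ids back to integers.
    id2label = {int(k): v for k, v in config["id2label"].items()}
    label2id = config["label2id"]
    if len(id2label) != len(label2id):
        raise ValueError("label maps differ in size")
    for idx, name in id2label.items():
        if label2id.get(name) != idx:
            raise ValueError(f"mismatch for {name!r}")
    return len(id2label)


# Rebuild the two maps in the same shape as config.json, including the
# string-sorted key order of label2id (LABEL_0, LABEL_1, LABEL_10, ...).
example_config = json.loads(json.dumps({
    "id2label": {str(i): f"LABEL_{i}" for i in range(60)},
    "label2id": {f"LABEL_{i}": i for i in sorted(range(60), key=str)},
}))

num_labels = check_label_maps(example_config)
print(num_labels)
```

To run the same check on the committed file, the dict could instead be read with `json.load` from a local copy of `config.json`.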

eval_results_ml.json ADDED

	@@ -0,0 +1 @@

+ {"ar-SA": {"f1": 0.7720010663747154, "accuracy": 0.8264963012777404}, "he-IL": {"f1": 0.5745824369450304, "accuracy": 0.6610625420309347}, "pt-PT": {"f1": 0.8324502237245487, "accuracy": 0.875252185608608}, "fi-FI": {"f1": 0.8154091746380266, "accuracy": 0.8607935440484197}, "kn-IN": {"f1": 0.5780343670831116, "accuracy": 0.6506388702084734}, "ca-ES": {"f1": 0.7358548975506386, "accuracy": 0.7780766644250168}, "da-DK": {"f1": 0.737668214092637, "accuracy": 0.8029589778076665}, "ro-RO": {"f1": 0.6477681422714716, "accuracy": 0.7246133154001345}, "pl-PL": {"f1": 0.6621571909069583, "accuracy": 0.7609280430396772}, "en-US": {"f1": 0.8506893531708493, "accuracy": 0.8930733019502354}, "de-DE": {"f1": 0.8190904756196555, "accuracy": 0.8705447209145931}, "ms-MY": {"f1": 0.6936443014436624, "accuracy": 0.7589105581708138}, "jv-ID": {"f1": 0.8272182809538818, "accuracy": 0.8597848016139878}, "ta-IN": {"f1": 0.5849528463090181, "accuracy": 0.6792199058507061}, "hu-HU": {"f1": 0.8235842095940068, "accuracy": 0.8742434431741762}, "id-ID": {"f1": 0.8277902807365753, "accuracy": 0.8772696704774714}, "th-TH": {"f1": 0.7601694366157717, "accuracy": 0.7881640887693342}, "ko-KR": {"f1": 0.8258340149702332, "accuracy": 0.8739071956960323}, "tl-PH": {"f1": 0.41221233107481825, "accuracy": 0.46234028244788167}, "bn-BD": {"f1": 0.8222195031892248, "accuracy": 0.8547410894418291}, "az-AZ": {"f1": 0.6379146919077101, "accuracy": 0.7064559515803631}, "zh-TW": {"f1": 0.8219476802602138, "accuracy": 0.85137861466039}, "cy-GB": {"f1": 0.15360224910725706, "accuracy": 0.23806321452589105}, "sq-AL": {"f1": 0.5203060184356132, "accuracy": 0.562542030934768}, "ru-RU": {"f1": 0.8360932583374457, "accuracy": 0.879287155346335}, "af-ZA": {"f1": 0.580146501173271, "accuracy": 0.6492938802958977}, "fr-FR": {"f1": 0.8321333798384362, "accuracy": 0.878950907868191}, "ka-GE": {"f1": 0.783785888663372, "accuracy": 0.8221250840618696}, "is-IS": {"f1": 0.8202059436401666, "accuracy": 0.8651647612642905}, 
"sw-KE": {"f1": 0.38820124799934286, "accuracy": 0.4589778076664425}, "hi-IN": {"f1": 0.8122286015860652, "accuracy": 0.8655010087424344}, "km-KH": {"f1": 0.557996205283102, "accuracy": 0.6304640215198386}, "lv-LV": {"f1": 0.8434795040666275, "accuracy": 0.8708809683927371}, "sl-SL": {"f1": 0.546698528124086, "accuracy": 0.5998655010087425}, "am-ET": {"f1": 0.21524690087379741, "accuracy": 0.28681909885675855}, "sv-SE": {"f1": 0.7274340266407211, "accuracy": 0.8016139878950908}, "mn-MN": {"f1": 0.4589313676752772, "accuracy": 0.5336247478143914}, "my-MM": {"f1": 0.8163612617095788, "accuracy": 0.8604572965702757}, "ja-JP": {"f1": 0.8502108894924849, "accuracy": 0.8806321452589105}, "ur-PK": {"f1": 0.4160032989982115, "accuracy": 0.4761264290517821}, "it-IT": {"f1": 0.7211923887655927, "accuracy": 0.796906523201076}, "nb-NO": {"f1": 0.753965178683486, "accuracy": 0.8080026899798252}, "te-IN": {"f1": 0.5170677460146224, "accuracy": 0.6116341627437795}, "zh-CN": {"f1": 0.8207813034678526, "accuracy": 0.8644922663080027}, "fa-IR": {"f1": 0.6535974427037031, "accuracy": 0.734364492266308}, "tr-TR": {"f1": 0.8210590707934682, "accuracy": 0.8705447209145931}, "vi-VN": {"f1": 0.8224631444887968, "accuracy": 0.8695359784801614}, "nl-NL": {"f1": 0.7347664077644604, "accuracy": 0.8093476798924009}, "es-ES": {"f1": 0.8385495149569755, "accuracy": 0.8765971755211835}, "hy-AM": {"f1": 0.8048019679896729, "accuracy": 0.8597848016139878}, "el-GR": {"f1": 0.8213653641875498, "accuracy": 0.867182246133154}, "ml-IN": {"f1": 0.5759394269035711, "accuracy": 0.6872898453261601}, "all": {"f1": 0.7090860083539934, "accuracy": 0.7522114737985619}}
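
The per-locale scores above vary widely, from an accuracy of about 0.238 (cy-GB) to about 0.893 (en-US), with an aggregate entry under the key `all`. As a minimal sketch of how such a file might be inspected, the snippet below ranks locales by a chosen metric. It uses only four entries copied from the JSON above, and `rank_locales` is a hypothetical helper name.

```python
import json

# A few entries copied verbatim from eval_results_ml.json; the real file has many more.
raw = '''{
  "en-US": {"f1": 0.8506893531708493, "accuracy": 0.8930733019502354},
  "cy-GB": {"f1": 0.15360224910725706, "accuracy": 0.23806321452589105},
  "am-ET": {"f1": 0.21524690087379741, "accuracy": 0.28681909885675855},
  "all": {"f1": 0.7090860083539934, "accuracy": 0.7522114737985619}
}'''


def rank_locales(results, metric="accuracy"):
    """Sort locales from weakest to strongest, skipping the aggregate "all" entry."""
    per_locale = {k: v for k, v in results.items() if k != "all"}
    return sorted(per_locale, key=lambda loc: per_locale[loc][metric])


results = json.loads(raw)
ranking = rank_locales(results)
print(ranking)
```

Passing `metric="f1"` ranks by F1 instead; on this excerpt the order is the same.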

pytorch_model.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e221f49cbacf50a59444f5848e249da7b9959cd435f291fe75a68f1b0f615871
+size 946915690
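
The LFS pointer above records a checkpoint size of 946915690 bytes. Combined with `torch_dtype`, `vocab_size` and `hidden_size` from config.json, that allows a rough back-of-the-envelope parameter estimate. The sketch below makes two assumptions: that the file holds essentially only float32 weights, and that the word embeddings use `hidden_size` as their width (the config lists no separate embedding size). The constant names are illustrative.

```python
# Numbers copied from the diff; float32 comes from "torch_dtype" in config.json.
CHECKPOINT_BYTES = 946915690
VOCAB_SIZE = 251000
HIDDEN_SIZE = 768
BYTES_PER_PARAM = 4  # float32

# Upper bound: the file also holds some serialization overhead besides raw weights.
approx_params = CHECKPOINT_BYTES / BYTES_PER_PARAM

# Assumption: the word-embedding matrix has vocab_size x hidden_size entries.
embedding_params = VOCAB_SIZE * HIDDEN_SIZE
embedding_share = embedding_params / approx_params

print(f"~{approx_params / 1e6:.0f}M parameters, "
      f"{embedding_share:.0%} of them in the word embeddings")
```

If these assumptions hold, most of the checkpoint would be the multilingual vocabulary rather than the 6 transformer layers.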

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:54fbaee59da33615188679b7bc74c8f0a604f47193908380a5a628aab16bd662
+size 4600