---
library_name: transformers
license: apache-2.0
base_model: tasksource/deberta-small-long-nli
tags:
- generated_from_trainer
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: ms-deberta-v2-xlarge-mnli-finetuned-pt
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# ms-deberta-v2-xlarge-mnli-finetuned-pt

This model is a fine-tuned version of [tasksource/deberta-small-long-nli](https://huggingface.co/tasksource/deberta-small-long-nli) on an unspecified dataset (the Trainer recorded no dataset name).
It achieves the following results on the evaluation set:
- Loss: 0.2954
- Accuracy: 1.0
- Precision: 1.0
- Recall: 1.0
- F1: 1.0
- Ratio: 0.11
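
A final loss of 0.2954 alongside an accuracy of 1.0 is consistent with the `label_smoothing_factor: 0.1` setting listed under the training hyperparameters: a smoothed target has nonzero entropy, so the cross-entropy cannot reach zero even when every prediction is correct. The sketch below computes that lower bound. It assumes the common uniform-smoothing formulation and assumes the evaluation loss is computed with the same smoothing; the number of classes is not stated in this card, so `num_classes` is a hypothetical parameter and two candidate values are shown.

```python
import math


def label_smoothing_floor(epsilon: float, num_classes: int) -> float:
    """Lowest cross-entropy reachable under uniform label smoothing.

    Assumption: the true class receives 1 - epsilon + epsilon / K and
    every other class receives epsilon / K, where K is num_classes.
    """
    q_true = 1.0 - epsilon + epsilon / num_classes
    q_other = epsilon / num_classes
    # The loss is minimized when the prediction equals the smoothed target,
    # and the minimum value is the entropy of that target distribution.
    return -(
        q_true * math.log(q_true)
        + (num_classes - 1) * q_other * math.log(q_other)
    )


# The class count is an assumption; neither value is confirmed by the card.
floor_2 = label_smoothing_floor(0.1, 2)
floor_3 = label_smoothing_floor(0.1, 3)
print(f"2 classes: {floor_2:.4f}")
print(f"3 classes: {floor_3:.4f}")
```

Under these assumptions the bound is roughly 0.20 for two classes and 0.29 for three, so a plateau near 0.295 should not be read as a sign of underfitting by itself.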

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.06
- lr_scheduler_warmup_steps: 4
- num_epochs: 1
- label_smoothing_factor: 0.1
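
The batch-size and scheduler values above can be related with a little arithmetic. The following is a minimal sketch, not the Trainer's own code: it assumes a single training device, and it assumes about 422 optimizer steps in the epoch (inferred from step 420 falling at epoch 0.9953 in the results table). Both a warmup ratio and a warmup step count are listed; in Transformers a positive `warmup_steps` takes precedence over `warmup_ratio`, so the sketch uses 4 warmup steps.

```python
def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_warmup_lr(step: int, base_lr: float = 2e-5,
                     warmup_steps: int = 4, total_steps: int = 422) -> float:
    """Linear warmup to base_lr, then linear decay to zero.

    total_steps=422 is an assumption inferred from the results table.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# 16 samples per device * 2 accumulation steps * 1 device (assumed)
print(effective_batch_size(16, 2))
for step in (0, 2, 4, 211, 422):
    print(step, linear_warmup_lr(step))
```

With one device, 16 × 2 gives the listed `total_train_batch_size` of 32. The learning rate peaks at 2e-05 after the fourth step and decays linearly for the rest of the epoch.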

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy | Precision | Recall | F1     | Ratio  |
|:-------------:|:------:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|:------:|
| 1.4129        | 0.0237 | 10   | 0.5425          | 0.89     | 0.445     | 0.5    | 0.4709 | 0.0    |
| 0.5102        | 0.0474 | 20   | 0.4968          | 0.89     | 0.445     | 0.5    | 0.4709 | 0.0    |
| 0.4597        | 0.0711 | 30   | 0.4763          | 0.88     | 0.6225    | 0.5395 | 0.5471 | 0.0327 |
| 0.4975        | 0.0948 | 40   | 0.4605          | 0.87     | 0.6658    | 0.6614 | 0.6636 | 0.1067 |
| 0.4639        | 0.1185 | 50   | 0.4434          | 0.8947   | 0.7355    | 0.5850 | 0.6125 | 0.0367 |
| 0.4687        | 0.1422 | 60   | 0.4557          | 0.892    | 0.7177    | 0.6498 | 0.6747 | 0.0727 |
| 0.4489        | 0.1659 | 70   | 0.4353          | 0.9293   | 0.8174    | 0.8275 | 0.8224 | 0.114  |
| 0.4318        | 0.1896 | 80   | 0.4269          | 0.924    | 0.8010    | 0.8325 | 0.8156 | 0.1233 |
| 0.4723        | 0.2133 | 90   | 0.4202          | 0.9173   | 0.7832    | 0.8580 | 0.8140 | 0.1447 |
| 0.4052        | 0.2370 | 100  | 0.4016          | 0.9307   | 0.8207    | 0.8309 | 0.8257 | 0.114  |
| 0.4284        | 0.2607 | 110  | 0.4115          | 0.9187   | 0.7855    | 0.8906 | 0.8255 | 0.1593 |
| 0.3635        | 0.2844 | 120  | 0.3963          | 0.94     | 0.8308    | 0.9052 | 0.8625 | 0.1393 |
| 0.3894        | 0.3081 | 130  | 0.3910          | 0.944    | 0.8409    | 0.9075 | 0.8699 | 0.1353 |
| 0.3537        | 0.3318 | 140  | 0.3598          | 0.9693   | 0.8983    | 0.9642 | 0.9277 | 0.1313 |
| 0.3776        | 0.3555 | 150  | 0.3868          | 0.944    | 0.8313    | 0.9685 | 0.8823 | 0.166  |
| 0.3626        | 0.3791 | 160  | 0.3235          | 0.9887   | 0.9699    | 0.9724 | 0.9711 | 0.1107 |
| 0.3683        | 0.4028 | 170  | 0.3272          | 0.99     | 0.9583    | 0.9944 | 0.9754 | 0.12   |
| 0.3358        | 0.4265 | 180  | 0.3321          | 0.9873   | 0.9484    | 0.9929 | 0.9692 | 0.1227 |
| 0.3435        | 0.4502 | 190  | 0.3370          | 0.982    | 0.9297    | 0.9899 | 0.9571 | 0.128  |
| 0.3613        | 0.4739 | 200  | 0.3136          | 0.9893   | 0.9728    | 0.9728 | 0.9728 | 0.11   |
| 0.3323        | 0.4976 | 210  | 0.3193          | 0.9887   | 0.9533    | 0.9936 | 0.9723 | 0.1213 |
| 0.3181        | 0.5213 | 220  | 0.3078          | 0.9947   | 0.9970    | 0.9758 | 0.9861 | 0.1047 |
| 0.3043        | 0.5450 | 230  | 0.3047          | 0.9947   | 0.9970    | 0.9758 | 0.9861 | 0.1047 |
| 0.3139        | 0.5687 | 240  | 0.3101          | 0.996    | 0.9825    | 0.9978 | 0.9899 | 0.114  |
| 0.3247        | 0.5924 | 250  | 0.3048          | 0.9947   | 0.9970    | 0.9758 | 0.9861 | 0.1047 |
| 0.3217        | 0.6161 | 260  | 0.3126          | 0.9913   | 0.9635    | 0.9951 | 0.9786 | 0.1187 |
| 0.3071        | 0.6398 | 270  | 0.3021          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.3048        | 0.6635 | 280  | 0.3048          | 0.9973   | 0.9882    | 0.9985 | 0.9933 | 0.1127 |
| 0.3054        | 0.6872 | 290  | 0.2996          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.3182        | 0.7109 | 300  | 0.2979          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.3059        | 0.7346 | 310  | 0.3103          | 0.9927   | 0.9688    | 0.9959 | 0.9818 | 0.1173 |
| 0.3044        | 0.7583 | 320  | 0.2991          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.3002        | 0.7820 | 330  | 0.2967          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.2957        | 0.8057 | 340  | 0.2967          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.2971        | 0.8294 | 350  | 0.2968          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.2964        | 0.8531 | 360  | 0.2970          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.297         | 0.8768 | 370  | 0.2969          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.3039        | 0.9005 | 380  | 0.2968          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.3002        | 0.9242 | 390  | 0.2960          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.2968        | 0.9479 | 400  | 0.2956          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.2956        | 0.9716 | 410  | 0.2955          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
| 0.2959        | 0.9953 | 420  | 0.2954          | 1.0      | 1.0       | 1.0    | 1.0    | 0.11   |
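
The first two rows (accuracy 0.89, precision 0.445, recall 0.5, F1 0.4709, ratio 0.0) match what macro-averaged binary metrics yield when a model predicts the majority class for every example. The sketch below reproduces those numbers in plain Python. It rests on assumptions the card does not confirm: that the metrics are macro-averaged over two classes, that "Ratio" is the share of predictions assigned to the positive class, and that about 11% of evaluation examples are positive. The 89/11 split is hypothetical data chosen to match the reported accuracy.

```python
def macro_metrics(y_true, y_pred, labels=(0, 1)):
    """Accuracy, macro precision/recall/F1, and positive-prediction ratio."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Undefined ratios are treated as 0.0 for the affected class.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
        # Assumed meaning of "Ratio": fraction of predictions equal to 1.
        "ratio": sum(1 for p in y_pred if p == 1) / len(y_pred),
    }


# Hypothetical evaluation set: 89 negatives, 11 positives, all predicted negative.
y_true = [0] * 89 + [1] * 11
y_pred = [0] * 100
metrics = macro_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
```

Read this way, the early rows reflect a majority-class baseline rather than learned behavior, and the ratio settling at 0.11 in the later rows would correspond to the assumed positive-class share.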


### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu121
- Datasets 2.21.0
- Tokenizers 0.19.1