ShaunThayil committed
Commit 570cefa · 1 Parent(s): d6d9e10

ShaunThayil/roberta_1

Files changed (4):
  1. README.md +20 -28
  2. config.json +20 -17
  3. pytorch_model.bin +2 -2
  4. training_args.bin +1 -1
README.md CHANGED
@@ -1,6 +1,6 @@
 ---
-license: apache-2.0
-base_model: distilbert-base-uncased
+license: mit
+base_model: roberta-base
 tags:
 - generated_from_trainer
 metrics:
@@ -18,13 +18,13 @@ should probably proofread and complete it, then remove this comment. -->
 
 # training-1
 
-This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
+This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0292
-- Accuracy: 0.9940
-- Precision: 0.9982
-- Recall: 0.9893
-- F1: 0.9937
+- Loss: 0.0448
+- Accuracy: 0.9937
+- Precision: 0.9912
+- Recall: 0.9859
+- F1: 0.9885
 
 ## Model description
 
@@ -43,7 +43,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
+- learning_rate: 1e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
@@ -55,29 +55,21 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
-| No log | 0.25 | 85 | 0.0345 | 0.9931 | 0.9982 | 0.9875 | 0.9928 |
-| No log | 0.5 | 170 | 0.0428 | 0.9905 | 1.0 | 0.9804 | 0.9901 |
-| No log | 0.75 | 255 | 0.0295 | 0.9940 | 0.9982 | 0.9893 | 0.9937 |
-| 0.0811 | 1.0 | 340 | 0.0237 | 0.9957 | 1.0 | 0.9911 | 0.9955 |
-| 0.0811 | 1.25 | 425 | 0.0618 | 0.9897 | 1.0 | 0.9786 | 0.9892 |
-| 0.0811 | 1.5 | 510 | 0.0338 | 0.9940 | 1.0 | 0.9875 | 0.9937 |
-| 0.0811 | 1.76 | 595 | 0.0373 | 0.9931 | 1.0 | 0.9857 | 0.9928 |
-| 0.0267 | 2.01 | 680 | 0.0382 | 0.9923 | 0.9982 | 0.9857 | 0.9919 |
-| 0.0267 | 2.26 | 765 | 0.0271 | 0.9948 | 1.0 | 0.9893 | 0.9946 |
-| 0.0267 | 2.51 | 850 | 0.0355 | 0.9940 | 1.0 | 0.9875 | 0.9937 |
-| 0.0267 | 2.76 | 935 | 0.0397 | 0.9940 | 1.0 | 0.9875 | 0.9937 |
-| 0.0187 | 3.01 | 1020 | 0.0270 | 0.9940 | 0.9982 | 0.9893 | 0.9937 |
-| 0.0187 | 3.26 | 1105 | 0.0246 | 0.9948 | 0.9982 | 0.9911 | 0.9946 |
-| 0.0187 | 3.51 | 1190 | 0.0340 | 0.9940 | 1.0 | 0.9875 | 0.9937 |
-| 0.0187 | 3.76 | 1275 | 0.0242 | 0.9957 | 1.0 | 0.9911 | 0.9955 |
-| 0.0093 | 4.01 | 1360 | 0.0224 | 0.9948 | 0.9982 | 0.9911 | 0.9946 |
-| 0.0093 | 4.26 | 1445 | 0.0275 | 0.9940 | 0.9982 | 0.9893 | 0.9937 |
-| 0.0093 | 4.51 | 1530 | 0.0285 | 0.9940 | 0.9982 | 0.9893 | 0.9937 |
-| 0.0093 | 4.76 | 1615 | 0.0292 | 0.9940 | 0.9982 | 0.9893 | 0.9937 |
+| No log | 0.5 | 302 | 0.0546 | 0.9870 | 0.9737 | 0.9789 | 0.9763 |
+| No log | 1.0 | 604 | 0.0511 | 0.9913 | 0.9911 | 0.9771 | 0.9840 |
+| 0.1032 | 1.5 | 906 | 0.0558 | 0.9899 | 0.9807 | 0.9824 | 0.9815 |
+| 0.1032 | 2.0 | 1208 | 0.0467 | 0.9928 | 0.9982 | 0.9754 | 0.9866 |
+| 0.0353 | 2.5 | 1510 | 0.0411 | 0.9937 | 0.9929 | 0.9842 | 0.9885 |
+| 0.0353 | 3.0 | 1812 | 0.0460 | 0.9932 | 0.9911 | 0.9842 | 0.9876 |
+| 0.0183 | 3.49 | 2114 | 0.0423 | 0.9937 | 0.9947 | 0.9824 | 0.9885 |
+| 0.0183 | 3.99 | 2416 | 0.0476 | 0.9932 | 0.9911 | 0.9842 | 0.9876 |
+| 0.013 | 4.49 | 2718 | 0.0463 | 0.9932 | 0.9911 | 0.9842 | 0.9876 |
+| 0.013 | 4.99 | 3020 | 0.0448 | 0.9937 | 0.9912 | 0.9859 | 0.9885 |
 
 
 ### Framework versions
 
 - Transformers 4.33.1
 - Pytorch 2.2.0.dev20230913+cu121
+- Datasets 2.14.5
 - Tokenizers 0.13.3
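As a quick consistency check on the updated evaluation metrics, F1 should be the harmonic mean of the reported precision and recall; a minimal sketch using the values from the new model card:

```python
# Sanity-check the new eval metrics: F1 is the harmonic mean of
# precision and recall, so the three reported values should agree.
precision = 0.9912  # reported eval precision
recall = 0.9859     # reported eval recall

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # matches the reported F1 of 0.9885
```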
config.json CHANGED
@@ -1,25 +1,28 @@
 {
-  "_name_or_path": "distilbert-base-uncased",
-  "activation": "gelu",
+  "_name_or_path": "roberta-base",
   "architectures": [
-    "DistilBertForSequenceClassification"
+    "RobertaForSequenceClassification"
   ],
-  "attention_dropout": 0.1,
-  "dim": 768,
-  "dropout": 0.1,
-  "hidden_dim": 3072,
+  "attention_probs_dropout_prob": 0.1,
+  "bos_token_id": 0,
+  "classifier_dropout": null,
+  "eos_token_id": 2,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 768,
   "initializer_range": 0.02,
-  "max_position_embeddings": 512,
-  "model_type": "distilbert",
-  "n_heads": 12,
-  "n_layers": 6,
-  "pad_token_id": 0,
+  "intermediate_size": 3072,
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 514,
+  "model_type": "roberta",
+  "num_attention_heads": 12,
+  "num_hidden_layers": 12,
+  "pad_token_id": 1,
+  "position_embedding_type": "absolute",
   "problem_type": "single_label_classification",
-  "qa_dropout": 0.1,
-  "seq_classif_dropout": 0.2,
-  "sinusoidal_pos_embds": false,
-  "tie_weights_": true,
   "torch_dtype": "float32",
   "transformers_version": "4.33.1",
-  "vocab_size": 30522
+  "type_vocab_size": 1,
+  "use_cache": true,
+  "vocab_size": 50265
 }
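For readability, the post-commit config.json can be reassembled from the added and unchanged lines of the hunk above; this sketch parses it to confirm the result is valid JSON (key order and array formatting are assumptions):

```python
import json

# The config.json after this commit, assembled from the "+" and context
# lines of the diff above (key order and indentation assumed).
new_config = json.loads("""
{
  "_name_or_path": "roberta-base",
  "architectures": ["RobertaForSequenceClassification"],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "problem_type": "single_label_classification",
  "torch_dtype": "float32",
  "transformers_version": "4.33.1",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}
""")

# RoBERTa reserves extra positions for padding offsets, hence 514 vs. BERT's 512.
print(new_config["model_type"], len(new_config))
```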
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7acd063f486d2307b8db8225437c8fba5d71f6c8b7dc261b5ffad01be0373b61
-size 267855978
+oid sha256:ecdb9897e7900218afded54579bf06bd86ee0a57e867f6c8b3378d768f4fe0d2
+size 498658094
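Both .bin entries are Git LFS pointer files: the repository itself stores only a version line, an oid, and a size, while the actual weights live in LFS object storage. A minimal sketch of parsing the new pointer (the jump to ~499 MB is consistent with roberta-base being a larger model than distilbert-base-uncased; the "4 bytes per float32 parameter" note is an inference, not stated in the diff):

```python
# Parse the new Git LFS pointer for pytorch_model.bin. Git stores only
# these three lines; the weight file lives in LFS object storage.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ecdb9897e7900218afded54579bf06bd86ee0a57e867f6c8b3378d768f4fe0d2
size 498658094
"""

fields = dict(line.split(" ", 1) for line in pointer.splitlines())
algo, digest = fields["oid"].split(":", 1)

print(algo, len(digest))          # sha256 digest, 64 hex chars
print(int(fields["size"]) / 1e6)  # ~498.7 MB (roughly 4 bytes per float32 param)
```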
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6805874f93e682a9cec9dc0ea8f1e836bb07d9630cff35857cf4f9d1163e2f2d
+oid sha256:715c5e974e827a8a9120a03d7ad08bfa526ed63cbd16636e26b5ebb3ea3582fb
 size 4472