reddgr committed on
Commit d7517c7 · verified · 1 Parent(s): 4011356

Upload TFDistilBertForSequenceClassification

Files changed (3)
  1. README.md +57 -7
  2. config.json +23 -23
  3. tf_model.h5 +1 -1
README.md CHANGED
@@ -1,12 +1,62 @@
  ---
+ library_name: transformers
  license: apache-2.0
- base_model:
- - distilbert/distilbert-base-uncased
+ base_model: distilbert-base-uncased
  tags:
- - text-classification
- - chatbot-prompts
+ - generated_from_keras_callback
+ model-index:
+ - name: tl-test-learn-prompt-classifier
+   results: []
  ---

- This is a fine-tuning of [distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) trained on a small set of manually labeled sentences classified as "instruction" or "problem".
- The main purpose is to calculate metrics used by the SCBN-RQTL chatbot response evaluation benchmark. The acronym TL stands for Test vs. Learn and is based on the assumption that prompts labeled as 'instruction' typically reflect the user's intent to 'learn,' while prompts labeled as 'problem' are generally aimed at 'testing' the chatbot by challenging it.
- More information in the GitHub repository [here](https://github.com/reddgr/chatbot-response-scoring-scbn-rqtl)
+ <!-- This model card has been generated automatically according to the information Keras had access to. You should
+ probably proofread and complete it, then remove this comment. -->
+
+ # tl-test-learn-prompt-classifier
+
+ This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
+ It achieves the following results on the evaluation set:
+ - Train Loss: 0.0794
+ - Train Accuracy: 1.0
+ - Validation Loss: 0.2381
+ - Validation Accuracy: 0.9444
+ - Epoch: 5
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 1e-05, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
+ - training_precision: float32
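The serialized optimizer dict above is a standard (non-legacy) Keras Adam. As a minimal sketch of what those numbers mean — an illustration of the update rule only, not the training code used for this model — one bias-corrected Adam step with learning_rate=1e-05, beta_1=0.9, beta_2=0.999, epsilon=1e-07 looks like:

```python
import math

def adam_step(param, grad, m, v, t, lr=1e-5, beta1=0.9, beta2=0.999, eps=1e-7):
    """One bias-corrected Adam update at 1-based timestep t.

    Returns the updated (param, m, v) triple.
    """
    m = beta1 * m + (1 - beta1) * grad       # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2  # second-moment (uncentered variance) estimate
    m_hat = m / (1 - beta1 ** t)             # bias correction for zero-initialized moments
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# With zero-initialized moments, the first step moves the parameter by roughly -lr.
p, m, v = adam_step(0.5, 1.0, 0.0, 0.0, t=1)
print(p, m, v)
```

On the first step the bias correction makes `m_hat` and `v_hat` equal to the raw gradient statistics, so the parameter moves by approximately the full learning rate regardless of the gradient's scale.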
+
+ ### Training results
+
+ | Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
+ |:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
+ | 0.6829 | 0.5808 | 0.6541 | 0.7639 | 0 |
+ | 0.6315 | 0.7784 | 0.5824 | 0.8472 | 1 |
+ | 0.4975 | 0.9222 | 0.4382 | 0.8889 | 2 |
+ | 0.3094 | 0.9521 | 0.3303 | 0.9028 | 3 |
+ | 0.1684 | 0.9820 | 0.2741 | 0.9028 | 4 |
+ | 0.0794 | 1.0 | 0.2381 | 0.9444 | 5 |
+
+ ### Framework versions
+
+ - Transformers 4.46.2
+ - TensorFlow 2.17.1
+ - Datasets 3.1.0
+ - Tokenizers 0.20.3
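The uploaded model is a binary "instruction"/"problem" classifier (per the earlier card text). As a hedged sketch of mapping its raw logits to a label — the id→label order below is a hypothetical assumption to verify against the model's `id2label` config, and the repo id in the comment is inferred from the model-index name:

```python
import math

# Hypothetical mapping for illustration; check the model's id2label config.
ID2LABEL = {0: "instruction", 1: "problem"}

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    mx = max(logits)
    exps = [math.exp(x - mx) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict_label(logits):
    """Map raw classifier logits to (label, confidence)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# In practice the logits come from the uploaded TFDistilBertForSequenceClassification,
# e.g. (requires network; repo id assumed from the model-index name above):
#
#   from transformers import pipeline
#   clf = pipeline("text-classification", model="reddgr/tl-test-learn-prompt-classifier")
#   print(clf("Explain how gradient descent works."))

print(predict_label([-1.2, 2.3]))
```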
config.json CHANGED
@@ -1,23 +1,23 @@
- {
-   "_name_or_path": "distilbert-base-uncased",
-   "activation": "gelu",
-   "architectures": [
-     "DistilBertForSequenceClassification"
-   ],
-   "attention_dropout": 0.1,
-   "dim": 768,
-   "dropout": 0.1,
-   "hidden_dim": 3072,
-   "initializer_range": 0.02,
-   "max_position_embeddings": 512,
-   "model_type": "distilbert",
-   "n_heads": 12,
-   "n_layers": 6,
-   "pad_token_id": 0,
-   "qa_dropout": 0.1,
-   "seq_classif_dropout": 0.2,
-   "sinusoidal_pos_embds": false,
-   "tie_weights_": true,
-   "transformers_version": "4.43.3",
-   "vocab_size": 30522
- }
+ {
+   "_name_or_path": "distilbert-base-uncased",
+   "activation": "gelu",
+   "architectures": [
+     "DistilBertForSequenceClassification"
+   ],
+   "attention_dropout": 0.1,
+   "dim": 768,
+   "dropout": 0.1,
+   "hidden_dim": 3072,
+   "initializer_range": 0.02,
+   "max_position_embeddings": 512,
+   "model_type": "distilbert",
+   "n_heads": 12,
+   "n_layers": 6,
+   "pad_token_id": 0,
+   "qa_dropout": 0.1,
+   "seq_classif_dropout": 0.2,
+   "sinusoidal_pos_embds": false,
+   "tie_weights_": true,
+   "transformers_version": "4.46.2",
+   "vocab_size": 30522
+ }
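The only value that changes in config.json is the `transformers_version` bump (4.43.3 → 4.46.2); every architecture field is identical. A quick stdlib sanity check over a subset of those fields (values copied from the config above) confirms the standard DistilBERT geometry, e.g. a per-head dimension of 768 / 12 = 64:

```python
import json

# Subset of the config.json shown above, embedded for a self-contained check.
config_text = """
{
  "model_type": "distilbert",
  "dim": 768,
  "n_heads": 12,
  "n_layers": 6,
  "hidden_dim": 3072,
  "max_position_embeddings": 512,
  "vocab_size": 30522
}
"""

cfg = json.loads(config_text)
head_dim = cfg["dim"] // cfg["n_heads"]  # hidden size split evenly across attention heads
print(cfg["model_type"], head_dim)  # -> distilbert 64
```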
tf_model.h5 CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:e959635f88788b251f8199fd474d4e4be2ebcc990b35d940fbd9ff86a8ddfd6b
3
  size 267955144
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:33e00f7f05008e32a7560fa27e954419556dca7d9e562e8ab999922e4d6b9df0
3
  size 267955144
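tf_model.h5 is tracked with Git LFS, so this diff only swaps the pointer's `oid sha256:` line — the SHA-256 digest of the actual weight blob (the file size is unchanged at 267,955,144 bytes). A downloaded blob can be verified against the pointer with the stdlib; the file path in the commented usage is illustrative:

```python
import hashlib

def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file in 1 MiB chunks and return its hex SHA-256 digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Usage (after downloading the weights; digest taken from the new LFS pointer):
# expected = "33e00f7f05008e32a7560fa27e954419556dca7d9e562e8ab999922e4d6b9df0"
# assert sha256_of_file("tf_model.h5") == expected
```

Streaming in chunks keeps memory flat, which matters for a ~268 MB weight file.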