reddgr/tl-test-learn-prompt-classifier

Tags: Text Classification, Transformers, TensorFlow, distilbert, generated_from_keras_callback


reddgr committed on Nov 21, 2024

Commit d7517c7 (verified) · 1 parent: 4011356

Upload TFDistilBertForSequenceClassification


Files changed (3)

README.md +57 -7
config.json +23 -23
tf_model.h5 +1 -1

README.md

@@ -1,12 +1,62 @@
 ---
 license: apache-2.0
-base_model:
-- distilbert/distilbert-base-uncased
 tags:
-- text-classification
-- chatbot-prompts
 ---
-This is a fine-tuning of [distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) trained on a small set of manually labeled sentences classified as "instruction" or "problem".
-The main purpose is to calculate metrics used by the SCBN-RQTL chatbot response evaluation benchmark. The acronym TL stands for Test vs. Learn and is based on the assumption that prompts labeled as 'instruction' typically reflect the user's intent to 'learn,' while prompts labeled as 'problem' are generally aimed at 'testing' the chatbot by challenging it.
-More information in the GitHub repository [here](https://github.com/reddgr/chatbot-response-scoring-scbn-rqtl)

 ---
+library_name: transformers
 license: apache-2.0
+base_model: distilbert-base-uncased
 tags:
+- generated_from_keras_callback
+model-index:
+- name: tl-test-learn-prompt-classifier
+  results: []
 ---
+<!-- This model card has been generated automatically according to the information Keras had access to. You should
+probably proofread and complete it, then remove this comment. -->
+# tl-test-learn-prompt-classifier
+This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Train Loss: 0.0794
+- Train Accuracy: 1.0
+- Validation Loss: 0.2381
+- Validation Accuracy: 0.9444
+- Epoch: 5
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- optimizer: {'name': 'Adam', 'weight_decay': None, 'clipnorm': None, 'global_clipnorm': None, 'clipvalue': None, 'use_ema': False, 'ema_momentum': 0.99, 'ema_overwrite_frequency': None, 'jit_compile': True, 'is_legacy_optimizer': False, 'learning_rate': 1e-05, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
+- training_precision: float32
+### Training results
+| Train Loss | Train Accuracy | Validation Loss | Validation Accuracy | Epoch |
+|:----------:|:--------------:|:---------------:|:-------------------:|:-----:|
+| 0.6829     | 0.5808         | 0.6541          | 0.7639              | 0     |
+| 0.6315     | 0.7784         | 0.5824          | 0.8472              | 1     |
+| 0.4975     | 0.9222         | 0.4382          | 0.8889              | 2     |
+| 0.3094     | 0.9521         | 0.3303          | 0.9028              | 3     |
+| 0.1684     | 0.9820         | 0.2741          | 0.9028              | 4     |
+| 0.0794     | 1.0            | 0.2381          | 0.9444              | 5     |
+### Framework versions
+- Transformers 4.46.2
+- TensorFlow 2.17.1
+- Datasets 3.1.0
+- Tokenizers 0.20.3
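
The per-epoch figures in the added model card can be sanity-checked with a few lines of code. The following is a minimal sketch, assuming the rows of the "Training results" table are copied by hand into a list; the names `HISTORY`, `best_epoch`, and `accuracy_gap` are illustrative and do not come from the repository.

```python
# Per-epoch metrics copied from the "Training results" table in the README diff.
# Tuple layout: (train loss, train accuracy, val loss, val accuracy, epoch).
HISTORY = [
    (0.6829, 0.5808, 0.6541, 0.7639, 0),
    (0.6315, 0.7784, 0.5824, 0.8472, 1),
    (0.4975, 0.9222, 0.4382, 0.8889, 2),
    (0.3094, 0.9521, 0.3303, 0.9028, 3),
    (0.1684, 0.9820, 0.2741, 0.9028, 4),
    (0.0794, 1.0, 0.2381, 0.9444, 5),
]


def best_epoch(history):
    """Return the row with the lowest validation loss."""
    return min(history, key=lambda row: row[2])


def accuracy_gap(row):
    """Train accuracy minus validation accuracy for one epoch."""
    return row[1] - row[3]


best = best_epoch(HISTORY)
print(f"best epoch by validation loss: {best[4]}")
for row in HISTORY:
    print(f"epoch {row[4]}: train - val accuracy = {accuracy_gap(row):+.4f}")
```

On these numbers the last epoch (5) has the lowest validation loss, which matches the summary at the top of the card. The gap between train accuracy (1.0) and validation accuracy (0.9444) at that epoch is the kind of signal worth reviewing when completing the "Training and evaluation data" section, which the generated card still leaves empty.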

config.json

@@ -1,23 +1,23 @@
-{
-  "_name_or_path": "distilbert-base-uncased",
-  "activation": "gelu",
-  "architectures": [
-    "DistilBertForSequenceClassification"
-  ],
-  "attention_dropout": 0.1,
-  "dim": 768,
-  "dropout": 0.1,
-  "hidden_dim": 3072,
-  "initializer_range": 0.02,
-  "max_position_embeddings": 512,
-  "model_type": "distilbert",
-  "n_heads": 12,
-  "n_layers": 6,
-  "pad_token_id": 0,
-  "qa_dropout": 0.1,
-  "seq_classif_dropout": 0.2,
-  "sinusoidal_pos_embds": false,
-  "tie_weights_": true,
-  "transformers_version": "4.43.3",
-  "vocab_size": 30522
-}

+{
+  "_name_or_path": "distilbert-base-uncased",
+  "activation": "gelu",
+  "architectures": [
+    "DistilBertForSequenceClassification"
+  ],
+  "attention_dropout": 0.1,
+  "dim": 768,
+  "dropout": 0.1,
+  "hidden_dim": 3072,
+  "initializer_range": 0.02,
+  "max_position_embeddings": 512,
+  "model_type": "distilbert",
+  "n_heads": 12,
+  "n_layers": 6,
+  "pad_token_id": 0,
+  "qa_dropout": 0.1,
+  "seq_classif_dropout": 0.2,
+  "sinusoidal_pos_embds": false,
+  "tie_weights_": true,
+  "transformers_version": "4.46.2",
+  "vocab_size": 30522
+}
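
The hunk above is reported as +23 -23, yet read key by key the two sides look almost identical. A small sketch of a key-level comparison makes the effective change explicit. It assumes both sides of the hunk are pasted in with the `-`/`+` markers stripped; `changed_keys` is a hypothetical helper, not part of any library. Why every line is marked as changed is not stated in the source; a whitespace or line-ending difference is one possible explanation, but that is a guess.

```python
import json

# Removed ("-") side of the config.json hunk, diff markers stripped.
OLD_CONFIG = json.loads("""
{
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": ["DistilBertForSequenceClassification"],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.43.3",
  "vocab_size": 30522
}
""")

# Added ("+") side of the same hunk, diff markers stripped.
NEW_CONFIG = json.loads("""
{
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": ["DistilBertForSequenceClassification"],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.46.2",
  "vocab_size": 30522
}
""")


def changed_keys(old, new):
    """Map each key whose value differs between the two dicts to (old, new).

    A key present on only one side is reported with None for the other side.
    """
    diff = {}
    for key in sorted(set(old) | set(new)):
        if old.get(key) != new.get(key):
            diff[key] = (old.get(key), new.get(key))
    return diff


print(changed_keys(OLD_CONFIG, NEW_CONFIG))
```

Run on the two sides shown in this commit, the only key reported is `transformers_version`, moving from 4.43.3 to 4.46.2. The architecture fields are untouched.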

tf_model.h5

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e959635f88788b251f8199fd474d4e4be2ebcc990b35d940fbd9ff86a8ddfd6b
 size 267955144

 version https://git-lfs.github.com/spec/v1
+oid sha256:33e00f7f05008e32a7560fa27e954419556dca7d9e562e8ab999922e4d6b9df0
 size 267955144
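
The `tf_model.h5` entry is a Git LFS pointer, so the diff shows only the object hash and byte size rather than the weights themselves. As a rough illustration, the sketch below parses the two pointers from this commit and compares them. `parse_lfs_pointer` is a hypothetical helper written for this example and handles only the simple `key value` lines seen here.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


# Removed and added pointers from the tf_model.h5 hunk, diff markers stripped.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:e959635f88788b251f8199fd474d4e4be2ebcc990b35d940fbd9ff86a8ddfd6b
size 267955144
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:33e00f7f05008e32a7560fa27e954419556dca7d9e562e8ab999922e4d6b9df0
size 267955144
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

same_size = old["size"] == new["size"]    # byte size of the stored file
same_content = old["oid"] == new["oid"]   # SHA-256 of the stored file
size_mib = int(new["size"]) / 2**20       # bytes converted to MiB

print(f"same size: {same_size}, same content: {same_content}")
print(f"file size: {size_mib:.1f} MiB")
```

The size is identical on both sides while the hash differs, which is consistent with re-uploaded weights for an unchanged architecture: the same tensor shapes with different values.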