ctrltokyo committed
Commit b817fdf · Parent: 85b8a5f

Update README.md

Files changed (1): README.md (+9 -5)
@@ -6,6 +6,10 @@ tags:
 model-index:
 - name: ctrltokyo/llm_prompt_mask_fill_model
   results: []
+datasets:
+- sahil2801/code_instructions_120k
+metrics:
+- accuracy
 ---
 
 <!-- This model card has been generated automatically according to the information Keras had access to. You should
@@ -13,7 +17,7 @@ probably proofread and complete it, then remove this comment. -->
 
 # ctrltokyo/llm_prompt_mask_fill_model
 
-This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on an unknown dataset.
+This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [code_instructions_120k](https://huggingface.co/datasets/sahil2801/code_instructions_120k) dataset.
 It achieves the following results on the evaluation set:
 - Train Loss: 2.1215
 - Validation Loss: 1.5672
@@ -21,15 +25,15 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+It's just distilbert-base-uncased with some fine tuning.
 
 ## Intended uses & limitations
 
-More information needed
+This model could be used for live autocompletion in a coding-specific chatbot.
 
 ## Training and evaluation data
 
-More information needed
+Evaluated on 5% of training data. No further evaluation performed at this point. Trained on NVIDIA V100.
 
 ## Training procedure
 
@@ -51,4 +55,4 @@ The following hyperparameters were used during training:
 - Transformers 4.31.0
 - TensorFlow 2.12.0
 - Datasets 2.14.1
-- Tokenizers 0.13.3
+- Tokenizers 0.13.3
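
Since the updated card describes a fill-mask model intended for live autocompletion, here is a minimal sketch of how it could be queried with the transformers fill-mask pipeline; the example prompt is an assumption for illustration, not something stated in the card.

```python
# A minimal sketch of querying the model with the transformers fill-mask
# pipeline. The example prompt is illustrative, not taken from the card.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="ctrltokyo/llm_prompt_mask_fill_model",
)

# distilbert-base-uncased uses [MASK] as its mask token.
for prediction in fill_mask("Write a python function to [MASK] a list."):
    print(f"{prediction['token_str']!r}: {prediction['score']:.3f}")
```

Each prediction is a dict with the filled-in token and its score, which is the shape a chatbot autocompletion frontend would consume.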
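The "Training and evaluation data" section says the model was evaluated on 5% of the training data. A sketch of how such a split could be produced with the datasets library follows; the seed and the use of train_test_split are assumptions, since the card does not state how the split was made.

```python
# A sketch of reproducing the 5% evaluation split described in the card.
# The seed and the use of train_test_split are assumptions; the card only
# states that 5% of the training data was held out for evaluation.
from datasets import load_dataset

dataset = load_dataset("sahil2801/code_instructions_120k", split="train")
splits = dataset.train_test_split(test_size=0.05, seed=42)
train_data, eval_data = splits["train"], splits["test"]
print(f"train: {len(train_data)} examples, eval: {len(eval_data)} examples")
```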