codegood committed on
Commit 1ea7313
1 parent: 08eae84

codegood/Mistral_instruct_QA/

README.md CHANGED
@@ -15,7 +15,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [bn22/Mistral-7B-Instruct-v0.1-sharded](https://huggingface.co/bn22/Mistral-7B-Instruct-v0.1-sharded) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.9447
+- eval_loss: 0.5257
+- eval_runtime: 448.2658
+- eval_samples_per_second: 2.231
+- eval_steps_per_second: 0.558
+- epoch: 2.32
+- step: 3000
 
 ## Model description
 
@@ -41,19 +46,7 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.05
-- training_steps: 600
-
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 2.4418        | 0.07  | 100  | 2.2922          |
-| 2.1884        | 0.15  | 200  | 2.1105          |
-| 2.0871        | 0.22  | 300  | 2.0371          |
-| 2.011         | 0.29  | 400  | 1.9835          |
-| 1.9346        | 0.37  | 500  | 1.9517          |
-| 1.9721        | 0.44  | 600  | 1.9447          |
-
+- num_epochs: 3.0
 
 ### Framework versions
 
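The new evaluation metrics in the README hunk are internally consistent; a quick sketch (values copied from the diff) recovers the approximate eval set size and per-device eval batch size, which the README itself does not state:

```python
# Cross-check the updated eval metrics from the README diff:
# runtime * samples_per_second ~ eval set size,
# and samples_per_second / steps_per_second ~ per-device eval batch size.
eval_runtime = 448.2658            # seconds, from the diff
eval_samples_per_second = 2.231    # from the diff
eval_steps_per_second = 0.558      # from the diff

n_samples = eval_runtime * eval_samples_per_second
n_steps = eval_runtime * eval_steps_per_second
batch_size = eval_samples_per_second / eval_steps_per_second

print(round(n_samples))   # 1000 -> roughly 1000 evaluation samples
print(round(n_steps))     # 250  -> roughly 250 evaluation steps
print(round(batch_size))  # 4    -> roughly 4 samples per step
```

These are back-of-the-envelope inferences from the logged metrics, not values recorded anywhere in the commit.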
 
adapter_config.json CHANGED
@@ -19,11 +19,11 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "k_proj",
+    "gate_proj",
     "q_proj",
-    "o_proj",
+    "k_proj",
     "v_proj",
-    "gate_proj"
+    "o_proj"
   ],
   "task_type": "CAUSAL_LM"
 }
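The hunk above only reorders `target_modules` (the same five LoRA target projections, now in the order PEFT serialized them for this run). For reference, the visible fragment of the updated config as a Python dict; fields not shown in this diff (e.g. the LoRA rank and alpha) are omitted rather than guessed:

```python
# The adapter_config.json fields visible in this hunk, post-commit.
# Only what the diff shows is included; other LoraConfig fields are omitted.
adapter_config = {
    "rank_pattern": {},
    "revision": None,
    "target_modules": ["gate_proj", "q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}

# The set of targeted modules is unchanged by this commit; only the
# serialization order differs from the previous revision.
print(sorted(adapter_config["target_modules"]))
```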
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:61965e4bd4a997696781f46af81b9860bd3b6f9c9dbc2d790a3a26efdf4cc4ba
+oid sha256:d2e427316d0fff899f219be68978450b7eb2aaddbea4e631acf123d4c0efb22e
 size 369142184
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1c78c6d371e675608a6248617f3f15cef6616718c149eb7f6252e5188c29580a
+oid sha256:34bfb87bcd63286e412b8c4a791e932f8a87efc6d0850f0af666080c87ba8468
 size 4091
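Both binary files change only the `oid` of their git-lfs pointer; the sizes are identical, so the weights and training args were rewritten with new content of the same byte length. A git-lfs pointer is just three text lines (spec version, sha256 oid, byte size). A minimal sketch of reading one, using the new `adapter_model.safetensors` pointer from this diff (`parse_lfs_pointer` is an illustrative helper, not part of any library):

```python
# Parse a git-lfs pointer file into its three fields.
def parse_lfs_pointer(text: str) -> dict:
    # Each line is "<key> <value>"; split on the first space only,
    # since the version value itself contains no spaces but the
    # general pointer format allows arbitrary values.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "sha256": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

# The post-commit pointer for adapter_model.safetensors, from the diff above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d2e427316d0fff899f219be68978450b7eb2aaddbea4e631acf123d4c0efb22e
size 369142184"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # 369142184
```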