jermyn/deepseek-code-1.3b-inst-NLQ2Cypher

@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->
 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 <details><summary>See axolotl config</summary>
-axolotl version: `0.4.0`
 ```yaml
 base_model: deepseek-ai/deepseek-coder-1.3b-instruct
 # base_model: Qwen/CodeQwen1.5-7B-Chat
@@ -51,7 +51,7 @@ sequence_len: 896
 sample_packing: false
 pad_to_sequence_len: true
-lora_r: 32
 lora_alpha: 16
 lora_dropout: 0.05
 lora_target_linear: true
@@ -129,7 +129,7 @@ save_safetensors: true
 This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.3714
 ## Model description
@@ -162,33 +162,33 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 1.8723        | 0.1429 | 1    | 1.6354          |
-| 1.9222        | 0.2857 | 2    | 1.6236          |
-| 1.7038        | 0.5714 | 4    | 1.4355          |
-| 1.2833        | 0.8571 | 6    | 0.9288          |
-| 0.6389        | 1.1429 | 8    | 0.7050          |
-| 0.401         | 1.4286 | 10   | 0.5980          |
-| 0.2982        | 1.7143 | 12   | 0.5694          |
-| 0.3225        | 2.0    | 14   | 0.5651          |
-| 0.2214        | 2.2857 | 16   | 0.5221          |
-| 0.1375        | 2.5714 | 18   | 0.4537          |
-| 0.1058        | 2.8571 | 20   | 0.3971          |
-| 0.0945        | 3.1429 | 22   | 0.3698          |
-| 0.1352        | 3.4286 | 24   | 0.3518          |
-| 0.0688        | 3.7143 | 26   | 0.3420          |
-| 0.0677        | 4.0    | 28   | 0.3508          |
-| 0.0506        | 4.2857 | 30   | 0.3577          |
-| 0.1056        | 4.5714 | 32   | 0.3714          |
-| 0.0839        | 4.8571 | 34   | 0.3710          |
-| 0.0562        | 5.1429 | 36   | 0.3717          |
-| 0.0715        | 5.4286 | 38   | 0.3749          |
-| 0.0708        | 5.7143 | 40   | 0.3709          |
-| 0.07          | 6.0    | 42   | 0.3714          |
 ### Framework versions
-- PEFT 0.10.0
-- Transformers 4.40.2
 - Pytorch 2.1.2+cu118
 - Datasets 2.19.1
 - Tokenizers 0.19.1

 [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
 <details><summary>See axolotl config</summary>
+axolotl version: `0.4.1`
 ```yaml
 base_model: deepseek-ai/deepseek-coder-1.3b-instruct
 # base_model: Qwen/CodeQwen1.5-7B-Chat
 sample_packing: false
 pad_to_sequence_len: true
+lora_r: 16
 lora_alpha: 16
 lora_dropout: 0.05
 lora_target_linear: true
 This model is a fine-tuned version of [deepseek-ai/deepseek-coder-1.3b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.3839
 ## Model description
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
 | 1.8723        | 0.1429 | 1    | 1.6354          |
+| 1.9222        | 0.2857 | 2    | 1.6215          |
+| 1.6971        | 0.5714 | 4    | 1.4205          |
+| 1.2458        | 0.8571 | 6    | 0.9204          |
+| 0.6179        | 1.1429 | 8    | 0.6923          |
+| 0.366         | 1.4286 | 10   | 0.5647          |
+| 0.2752        | 1.7143 | 12   | 0.5225          |
+| 0.2931        | 2.0    | 14   | 0.5167          |
+| 0.1812        | 2.2857 | 16   | 0.4564          |
+| 0.1258        | 2.5714 | 18   | 0.4038          |
+| 0.0885        | 2.8571 | 20   | 0.3689          |
+| 0.0886        | 3.1429 | 22   | 0.3647          |
+| 0.1281        | 3.4286 | 24   | 0.3503          |
+| 0.0606        | 3.7143 | 26   | 0.3458          |
+| 0.0603        | 4.0    | 28   | 0.3635          |
+| 0.0479        | 4.2857 | 30   | 0.3724          |
+| 0.0963        | 4.5714 | 32   | 0.3827          |
+| 0.0725        | 4.8571 | 34   | 0.3868          |
+| 0.049         | 5.1429 | 36   | 0.3873          |
+| 0.0572        | 5.4286 | 38   | 0.3860          |
+| 0.061         | 5.7143 | 40   | 0.3890          |
+| 0.0702        | 6.0    | 42   | 0.3839          |
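The two validation-loss columns in this diff can be compared directly. The sketch below copies the values from the removed and added table rows and reports the step with the lowest validation loss for each run, plus the gap to the final reported loss. The names `STEPS`, `OLD_VAL`, `NEW_VAL`, and `best_step` are illustrative and do not come from the training code.

```python
# Validation loss per logged step, copied from the two tables in the diff.
# Steps logged: 1, then every 2 steps up to 42 (22 rows in total).
STEPS = [1] + list(range(2, 43, 2))

# Removed rows (previous run, lora_r: 32).
OLD_VAL = [1.6354, 1.6236, 1.4355, 0.9288, 0.7050, 0.5980, 0.5694, 0.5651,
           0.5221, 0.4537, 0.3971, 0.3698, 0.3518, 0.3420, 0.3508, 0.3577,
           0.3714, 0.3710, 0.3717, 0.3749, 0.3709, 0.3714]

# Added rows (new run, lora_r: 16).
NEW_VAL = [1.6354, 1.6215, 1.4205, 0.9204, 0.6923, 0.5647, 0.5225, 0.5167,
           0.4564, 0.4038, 0.3689, 0.3647, 0.3503, 0.3458, 0.3635, 0.3724,
           0.3827, 0.3868, 0.3873, 0.3860, 0.3890, 0.3839]


def best_step(steps, losses):
    """Return (step, loss) for the lowest validation loss."""
    i = min(range(len(losses)), key=losses.__getitem__)
    return steps[i], losses[i]


for name, losses in (("old", OLD_VAL), ("new", NEW_VAL)):
    step, loss = best_step(STEPS, losses)
    gap = round(losses[-1] - loss, 4)  # final loss minus best loss
    print(f"{name}: best={loss:.4f} at step {step}, final={losses[-1]:.4f}, gap={gap}")
```

In both runs the minimum falls at step 26 (epoch 3.7143), and the final loss at step 42 is somewhat higher. That pattern may indicate mild overfitting in the later epochs, though the tables alone do not establish it.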
 ### Framework versions
+- PEFT 0.11.1
+- Transformers 4.41.1
 - Pytorch 2.1.2+cu118
 - Datasets 2.19.1
 - Tokenizers 0.19.1
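
The config change in this diff halves `lora_r` (32 to 16) while leaving `lora_alpha` at 16. A minimal sketch of what that implies, assuming the standard LoRA scaling factor `lora_alpha / lora_r` (not the rank-stabilized variant) and a hypothetical 2048x2048 linear layer chosen only for illustration; `lora_scale` and `lora_params` are made-up helper names, not part of axolotl or PEFT:

```python
def lora_scale(lora_alpha: int, lora_r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update (alpha / r)."""
    return lora_alpha / lora_r


def lora_params(d_in: int, d_out: int, lora_r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer: A is r x d_in, B is d_out x r."""
    return lora_r * (d_in + d_out)


# Values from the diff: lora_alpha stays 16, lora_r goes 32 -> 16.
old_scale = lora_scale(16, 32)
new_scale = lora_scale(16, 16)

# Hypothetical layer shape, for illustration only.
old_count = lora_params(2048, 2048, 32)
new_count = lora_params(2048, 2048, 16)

print(f"scale: {old_scale} -> {new_scale}")
print(f"adapter params per layer: {old_count} -> {new_count}")
```

Under these assumptions the new run trains half as many adapter parameters per targeted layer, with the update scaled by 1.0 instead of 0.5.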