Nondzu committed
Commit 2651e8e · 1 Parent(s): e0c2f8a

Update README.md

Files changed (1): README.md (+58 −3)
---
license: cc-by-nc-nd-4.0
---
# Mistral-7B-Instruct-v0.2-code-ft

I'm thrilled to introduce the latest iteration of our model, Mistral-7B-Instruct-v0.2-code-ft. This updated version is designed to further enhance coding assistance and co-pilot functionality. We're eager for developers and enthusiasts to try it out and share feedback!

## Additional Information

This version builds on the previous Mistral-7B models, incorporating new datasets and features for a more refined coding experience.

## Prompt template: ChatML

```
<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
```
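As a quick illustration, the template above can be filled in with a small helper. This is a sketch; the function name `build_chatml_prompt` is illustrative and not part of the model repo:

```python
# Minimal sketch: assemble a ChatML prompt for this model.
# The special tokens mirror the template shown above.

def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Format a system message and user prompt using the ChatML template."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

text = build_chatml_prompt(
    "You are a helpful coding assistant.",
    "Write a function that reverses a string.",
)
print(text)
```

The model then generates the assistant turn after the trailing `<|im_start|>assistant` line.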

## Eval Plus Performance

For detailed performance metrics, visit the Eval Plus page: [Mistral-7B-Instruct-v0.2-code-ft Eval Plus](https://github.com/evalplus/evalplus)

Score: 0.421

![image/png](https://cdn-uploads.huggingface.co/production/uploads/63729f35acef705233c87909/VLtuWPh8m07bgU8BElpNv.png)

## Dataset

The model has been trained on a new dataset to improve its performance and versatility:

- path: ajibawa-2023/Code-74k-ShareGPT
- type: sharegpt
- conversation: chatml

Find out more about the dataset here: [Code-74k-ShareGPT Dataset](https://huggingface.co/datasets/ajibawa-2023/Code-74k-ShareGPT)
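Since the dataset is in ShareGPT format and training uses a ChatML conversation template, each record is converted along these lines. This is a sketch: the `conversations`/`from`/`value` field names follow the common ShareGPT convention, and the role mapping is an assumption, not taken from the repo:

```python
# Sketch: turn one ShareGPT-style record into ChatML training text.
# ShareGPT records conventionally look like:
#   {"conversations": [{"from": "human", "value": "..."},
#                      {"from": "gpt", "value": "..."}]}

ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}

def sharegpt_to_chatml(record: dict) -> str:
    """Render a ShareGPT conversation as ChatML turns."""
    turns = []
    for msg in record["conversations"]:
        role = ROLE_MAP[msg["from"]]
        turns.append(f"<|im_start|>{role}\n{msg['value']}<|im_end|>")
    return "\n".join(turns)

example = {
    "conversations": [
        {"from": "human", "value": "Reverse a list in Python."},
        {"from": "gpt", "value": "Use my_list[::-1]."},
    ]
}
print(sharegpt_to_chatml(example))
```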

## Model Architecture

- Base Model: mistralai/Mistral-7B-Instruct-v0.2
- Tokenizer Type: LlamaTokenizer
- Model Type: MistralForCausalLM
- Is Mistral Derived Model: true
- Sequence Length: 16384 with sample packing

## Enhanced Features

- Adapter: qlora
- Learning Rate: 0.0002 with a cosine LR scheduler
- Optimizer: paged_adamw_32bit
- Training Enhancements: bf16 training, gradient checkpointing, and flash attention
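Taken together, the settings listed above correspond to an axolotl-style config along these lines. This is a reconstruction from the bullet points for illustration, not the actual training file shipped with this repo:

```yaml
# Reconstructed axolotl-style config from the settings above;
# not the repo's actual training file.
base_model: mistralai/Mistral-7B-Instruct-v0.2
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true

sequence_len: 16384
sample_packing: true

adapter: qlora
learning_rate: 0.0002
lr_scheduler: cosine
optimizer: paged_adamw_32bit

bf16: true
gradient_checkpointing: true
flash_attention: true

datasets:
  - path: ajibawa-2023/Code-74k-ShareGPT
    type: sharegpt
    conversation: chatml
```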

## Download Information

You can download and explore this model on Hugging Face.

## Contributions and Feedback

We welcome contributions and feedback from the community. Please feel free to open issues or pull requests on the repository.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
+