agentlans committed (verified)
Commit 90e3bb7 · Parent: a9a397d

Update README.md

Files changed (1): README.md (+9 −3)
README.md CHANGED
@@ -1,5 +1,11 @@
 ---
 license: mit
+datasets:
+- O1-OPEN/OpenO1-SFT
+language:
+- en
+base_model:
+- microsoft/Phi-3.5-mini-instruct
 ---
 # Phi-3.5-mini-instruct-o1
 
@@ -12,7 +18,7 @@ Phi-3.5-mini-instruct-o1 is built upon the Phi-3.5-mini model, which is a lightw
 ## Features
 
 - **Enhanced Reasoning Process:** The model excels at providing clear and traceable reasoning paths, making it easier to follow its thought process and identify any potential mistakes.
-- **Improved Multistep Reasoning:** Fine-tuned with O1 data, the model demonstrates enhanced capabilities in multistep reasoning and overall accuracy.
+- **Improved Multistep Reasoning:** Fine-tuned with O1 data, the model should have enhanced capabilities in multistep reasoning and overall accuracy.
 - **Specialized Capabilities:** Particularly well-suited for tasks involving math, coding, and logic, aligning with the strengths of the Phi-3.5 model family.
 - **Robust Performance:** Fine-tuned with high dropout rates to increase resilience and generalization capabilities.
 
@@ -27,7 +33,7 @@ Phi-3.5-mini-instruct-o1 is built upon the Phi-3.5-mini model, which is a lightw
 The fine-tuning process for Phi-3.5-mini-instruct-o1 employed the following techniques and parameters:
 
 - **Method:** Low-Rank Adaptation (LoRA) with 4-bit quantization via BitsAndBytes
-- **Dataset:** OpenO1-SFT
+- **Dataset:** [O1-OPEN/OpenO1-SFT](https://huggingface.co/datasets/O1-OPEN/OpenO1-SFT)
 - **Batch Size:** 1 with 8 gradient accumulation steps
 - **Learning Rate:** 5e-5
 - **Training Duration:** Single epoch, limited to 10,000 samples
@@ -52,4 +58,4 @@ Phi-3.5-mini-instruct-o1 is suitable for commercial and research applications th
 
 ## Ethical Considerations
 
-Users should be aware of potential biases in the model's outputs and exercise caution when deploying it in sensitive applications. Always verify the model's results, especially for critical decision-making processes.
+Users should be aware of potential biases in the model's outputs and exercise caution when deploying it in sensitive applications. Always verify the model's results, especially for critical decision-making processes.
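The fine-tuning recipe in the card above can be sketched numerically. This is a minimal illustration, not the author's training script: the adapter rank (16) and the 3072-wide projection are assumptions chosen for the example, not stated in the card; only the method (LoRA with 4-bit quantization), the batch size of 1 with 8 gradient accumulation steps, the 5e-5 learning rate, and the 10,000-sample cap come from the text.

```python
# Minimal sketch (assumed rank and layer width) of the LoRA bookkeeping
# behind the fine-tuning recipe described in the card.

def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA freezes the d_in x d_out weight W and learns a low-rank update
    # B @ A, with A of shape (rank, d_in) and B of shape (d_out, rank).
    return rank * (d_in + d_out)

full_update = 3072 * 3072                            # dense update: 9,437,184 params
lora_update = lora_trainable_params(3072, 3072, 16)  # low-rank update: 98,304 params

# Batch size 1 with 8 gradient accumulation steps gives an effective
# optimizer batch of 8 samples per update.
effective_batch = 1 * 8

# One epoch over the 10,000-sample cap is then 10,000 / 8 = 1,250 updates.
optimizer_steps = 10_000 // effective_batch
```

Under these assumed shapes the adapter trains roughly 1% of the parameters a full-rank update to the same projection would, which is why LoRA (plus 4-bit quantization of the frozen base weights) keeps fine-tuning a 3.8B-parameter model tractable on modest hardware.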