Update README.md
README.md CHANGED

--- a/README.md
+++ b/README.md
@@ -1,5 +1,11 @@
 ---
 license: mit
+datasets:
+- O1-OPEN/OpenO1-SFT
+language:
+- en
+base_model:
+- microsoft/Phi-3.5-mini-instruct
 ---
 # Phi-3.5-mini-instruct-o1
 
@@ -12,7 +18,7 @@ Phi-3.5-mini-instruct-o1 is built upon the Phi-3.5-mini model, which is a lightw
 ## Features
 
 - **Enhanced Reasoning Process:** The model excels at providing clear and traceable reasoning paths, making it easier to follow its thought process and identify any potential mistakes.
-- **Improved Multistep Reasoning:** Fine-tuned with O1 data, the model
+- **Improved Multistep Reasoning:** Fine-tuned with O1 data, the model should have enhanced capabilities in multistep reasoning and overall accuracy.
 - **Specialized Capabilities:** Particularly well-suited for tasks involving math, coding, and logic, aligning with the strengths of the Phi-3.5 model family.
 - **Robust Performance:** Fine-tuned with high dropout rates to increase resilience and generalization capabilities.
 
@@ -27,7 +33,7 @@ Phi-3.5-mini-instruct-o1 is built upon the Phi-3.5-mini model, which is a lightw
 The fine-tuning process for Phi-3.5-mini-instruct-o1 employed the following techniques and parameters:
 
 - **Method:** Low-Rank Adaptation (LoRA) with 4-bit quantization via BitsAndBytes
-- **Dataset:** OpenO1-SFT
+- **Dataset:** [O1-OPEN/OpenO1-SFT](https://huggingface.co/datasets/O1-OPEN/OpenO1-SFT)
 - **Batch Size:** 1 with 8 gradient accumulation steps
 - **Learning Rate:** 5e-5
 - **Training Duration:** Single epoch, limited to 10,000 samples
@@ -52,4 +58,4 @@ Phi-3.5-mini-instruct-o1 is suitable for commercial and research applications th
 
 ## Ethical Considerations
 
-Users should be aware of potential biases in the model's outputs and exercise caution when deploying it in sensitive applications. Always verify the model's results, especially for critical decision-making processes.
+Users should be aware of potential biases in the model's outputs and exercise caution when deploying it in sensitive applications. Always verify the model's results, especially for critical decision-making processes.
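The fine-tuning hyperparameters listed in the diff imply a small training budget, which can be sanity-checked with a short stdlib-only sketch. All input values below are taken directly from the card; `effective_batch_size` and `optimizer_steps` are derived quantities, and LoRA-specific settings (rank, alpha, the "high dropout" rate) are not published on the card, so they are deliberately left out.

```python
# Sanity check of the training budget implied by the card's hyperparameters.
# Input values are quoted from the "fine-tuning" section of the README.

micro_batch_size = 1     # "Batch Size: 1"
grad_accum_steps = 8     # "... with 8 gradient accumulation steps"
learning_rate = 5e-5     # "Learning Rate: 5e-5"
num_epochs = 1           # "Single epoch"
num_samples = 10_000     # "limited to 10,000 samples"

# One optimizer step consumes micro_batch_size * grad_accum_steps samples.
effective_batch_size = micro_batch_size * grad_accum_steps
optimizer_steps = num_epochs * num_samples // effective_batch_size

print(f"effective batch size: {effective_batch_size}")
print(f"optimizer steps: {optimizer_steps}")
```

So despite the micro-batch size of 1, gradient accumulation yields an effective batch of 8 sequences, for roughly 1,250 optimizer steps over the 10,000-sample subset.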