Update README.md
README.md CHANGED

--- a/README.md
+++ b/README.md
@@ -1,5 +1,11 @@
 ---
 license: mit
+datasets:
+- O1-OPEN/OpenO1-SFT
+language:
+- en
+base_model:
+- microsoft/Phi-3.5-mini-instruct
 ---
 # Phi-3.5-mini-instruct-o1
 
@@ -12,7 +18,7 @@ Phi-3.5-mini-instruct-o1 is built upon the Phi-3.5-mini model, which is a lightw
 ## Features
 
 - **Enhanced Reasoning Process:** The model excels at providing clear and traceable reasoning paths, making it easier to follow its thought process and identify any potential mistakes.
-- **Improved Multistep Reasoning:** Fine-tuned with O1 data, the model
+- **Improved Multistep Reasoning:** Fine-tuned with O1 data, the model should have enhanced capabilities in multistep reasoning and overall accuracy.
 - **Specialized Capabilities:** Particularly well-suited for tasks involving math, coding, and logic, aligning with the strengths of the Phi-3.5 model family.
 - **Robust Performance:** Fine-tuned with high dropout rates to increase resilience and generalization capabilities.
 
@@ -27,7 +33,7 @@ Phi-3.5-mini-instruct-o1 is built upon the Phi-3.5-mini model, which is a lightw
 The fine-tuning process for Phi-3.5-mini-instruct-o1 employed the following techniques and parameters:
 
 - **Method:** Low-Rank Adaptation (LoRA) with 4-bit quantization via BitsAndBytes
-- **Dataset:** OpenO1-SFT
+- **Dataset:** [O1-OPEN/OpenO1-SFT](https://huggingface.co/datasets/O1-OPEN/OpenO1-SFT)
 - **Batch Size:** 1 with 8 gradient accumulation steps
 - **Learning Rate:** 5e-5
 - **Training Duration:** Single epoch, limited to 10,000 samples
@@ -52,4 +58,4 @@ Phi-3.5-mini-instruct-o1 is suitable for commercial and research applications th
 
 ## Ethical Considerations
 
-Users should be aware of potential biases in the model's outputs and exercise caution when deploying it in sensitive applications. Always verify the model's results, especially for critical decision-making processes.
+Users should be aware of potential biases in the model's outputs and exercise caution when deploying it in sensitive applications. Always verify the model's results, especially for critical decision-making processes.
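The fine-tuning hyperparameters listed in the diff imply a small training budget, which can be sanity-checked with a short stdlib-only sketch. All input values below are taken directly from the card; `effective_batch_size` and `optimizer_steps` are derived quantities, and LoRA-specific settings (rank, alpha, the "high dropout" rate) are not published on the card, so they are deliberately left out.

```python
# Sanity check of the training budget implied by the card's hyperparameters.
# Input values are quoted from the "fine-tuning" section of the README.

micro_batch_size = 1     # "Batch Size: 1"
grad_accum_steps = 8     # "... with 8 gradient accumulation steps"
learning_rate = 5e-5     # "Learning Rate: 5e-5"
num_epochs = 1           # "Single epoch"
num_samples = 10_000     # "limited to 10,000 samples"

# One optimizer step consumes micro_batch_size * grad_accum_steps samples.
effective_batch_size = micro_batch_size * grad_accum_steps
optimizer_steps = num_epochs * num_samples // effective_batch_size

print(f"effective batch size: {effective_batch_size}")
print(f"optimizer steps: {optimizer_steps}")
```

So despite the micro-batch size of 1, gradient accumulation yields an effective batch of 8 sequences, for roughly 1,250 optimizer steps over the 10,000-sample subset.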