Update README.md
README.md
CHANGED
@@ -12,13 +12,16 @@ library_name: transformers
 tags:
 - mergekit
 - merge
-
 ---
-#
+# **Phi-4-Super-o1**
+
+[Phi-4-Super-o1 finetuned] from Microsoft's Phi-4 is a state-of-the-art open model developed with a focus on responsible problem solving and advanced reasoning capabilities. Built upon a diverse blend of synthetic datasets, carefully filtered public domain websites, and high-quality academic books and Q&A datasets, Phi-4-Super-o1 ensures that small, capable models are trained with datasets of exceptional depth and precision.
+
+Phi-4-Super-o1 adopts a robust safety post-training approach using open-source and in-house synthetic datasets. This involves a combination of SFT (Supervised Fine-Tuning) and iterative DPO (Direct Preference Optimization) techniques, ensuring helpful and harmless outputs across various safety categories.
 
+Let me know if you’d like further edits or refinements!
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
-## Merge Details
 ### Merge Method
 
 This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [unsloth/phi-4](https://huggingface.co/unsloth/phi-4) as a base.
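
The README names Model Stock as the merge method but does not show how it combines weights. As a rough illustration only, the sketch below applies the interpolation ratio from the Model Stock paper, t = k·cosθ / (1 + (k − 1)·cosθ), to one layer's flattened weights. The function names, the toy inputs, and the choice to average pairwise cosines between task vectors are assumptions made for this example; this is not mergekit's code and not the configuration used for this model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def model_stock_ratio(cos_theta, k):
    """Interpolation ratio t = k*cos / (1 + (k - 1)*cos) for k fine-tuned models."""
    return k * cos_theta / (1 + (k - 1) * cos_theta)


def model_stock_merge(base, finetuned):
    """Merge one layer's flattened weights (hypothetical helper).

    base:      list of floats, the base model's weights for this layer
    finetuned: list of k >= 2 weight lists, one per fine-tuned model
    """
    k = len(finetuned)
    # Task vectors: how far each fine-tuned model moved from the base.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]
    # Assumption: estimate cos(theta) as the mean pairwise cosine.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    cos_theta = sum(cosine(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)
    t = model_stock_ratio(cos_theta, k)
    # Average the fine-tuned weights, then interpolate toward the base.
    avg = [sum(ws) / k for ws in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]
```

When the task vectors are orthogonal the ratio is 0 and the result falls back to the base weights; when they point the same way the ratio is 1 and the result is the plain average of the fine-tuned weights.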