metadata
license: apache-2.0
datasets:
- PrimeIntellect/SYNTHETIC-1-SFT-Data
base_model:
- Qwen/Qwen2.5-7B-Instruct
SYNTHETIC-1-7B-SFT
SYNTHETIC-1-7B-SFT is an initial model trained on the SFT subset of SYNTHETIC-1, a collaboratively generated reasoning dataset from Deepseek-R1. The model largely outperforms other models based on Qwen-2.5-Instruct-7B that were trained with smaller reasoning datasets.
All SYNTHETIC-1 datasets can be found in our 🤗 SYNTHETIC-1 Collection.
Citation
Feel free to cite SYNTHETIC-1 if you have found it useful for your work
@misc{2025synthetic1,
title={SYNTHETIC-1: Two Million Collaboratively Generated Reasoning Traces from Deepseek-R1},
author={Justus Mattern and Sami Jaghouar and Manveer Basra and Jannik Straube and Matthew Di Ferrante and Felix Gabriel and Jack Min Ong and Vincent Weisser and Johannes Hagemann},
year={2025},
url={https://www.primeintellect.ai/blog/synthetic-1-release},
}