metadata

license: apache-2.0
datasets:
  - PrimeIntellect/SYNTHETIC-1-SFT-Data
base_model:
  - Qwen/Qwen2.5-7B-Instruct

SYNTHETIC-1-7B-SFT

SYNTHETIC-1-7B-SFT is an initial model trained on the SFT subset of SYNTHETIC-1, a collaboratively generated reasoning dataset from Deepseek-R1. The model largely outperforms other models based on Qwen-2.5-Instruct-7B that were trained with smaller reasoning datasets.

All SYNTHETIC-1 datasets can be found in our 🤗 SYNTHETIC-1 Collection.

Citation

Feel free to cite SYNTHETIC-1 if you have found it useful for your work

@misc{2025synthetic1,
      title={SYNTHETIC-1: Two Million Collaboratively Generated Reasoning Traces from Deepseek-R1}, 
      author={Justus Mattern and Sami Jaghouar and Manveer Basra and Jannik Straube and Matthew Di Ferrante and Felix Gabriel and Jack Min Ong and Vincent Weisser and Johannes Hagemann},
      year={2025},
      url={https://www.primeintellect.ai/blog/synthetic-1-release}, 
}