samsja committed · verified · commit 47f9f0f · parent: 1c8bf88

Update README.md

Files changed (1): README.md (+1 -1)
README.md CHANGED
```diff
@@ -82,7 +82,7 @@ The post-training has been handle by [arcee](https://huggingface.co/arcee-ai)
 
 We applied several post-training techniques to enhance INTELLECT-1's capabilities and task-specific performance. Our post-training methodology consisted of three main phases.
 
-First, we conducted an extensive series of 16 Supervised Fine-Tuning (SFT) trainings, with individual runs ranging from 1 to 3.3 billion tokens each. The most successful configuration used 2.4 billion training tokens over 3 epochs. We used MergeKit, EvolKit, and DistillKit from Arcee AI to combine the models, generate the data sets, and distill the logits, respectively. For training data, we used a diverse set of high-quality datasets:
+First, we conducted an extensive series of 16 Supervised Fine-Tuning (SFT) trainings, with individual runs ranging from 1 to 3.3 billion tokens each. The most successful configuration used 2.4 billion training tokens over 3 epochs. We used [MergeKit](https://github.com/arcee-ai/mergekit), [EvolKit](https://github.com/arcee-ai/EvolKit), and [DistillKit](https://github.com/arcee-ai/DistillKit) from Arcee AI to combine the models, generate the data sets, and distill the logits, respectively. For training data, we used a diverse set of high-quality datasets:
 
 ## Post-training
```
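
For readers unfamiliar with the newly linked tools: below is a minimal sketch of how a MergeKit merge can be driven from Python, assuming `pip install mergekit`. The checkpoint paths, weights, and merge method are illustrative placeholders, not the actual INTELLECT-1 post-training recipe.

```python
# Minimal sketch of a MergeKit linear (weight-averaging) merge via its
# Python API. Checkpoint paths and weights below are hypothetical
# placeholders, NOT the actual INTELLECT-1 post-training configuration.
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Two hypothetical SFT checkpoints, averaged with equal weight.
CONFIG_YAML = """
models:
  - model: ./sft-run-a   # hypothetical local SFT checkpoint
    parameters:
      weight: 0.5
  - model: ./sft-run-b   # hypothetical local SFT checkpoint
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
"""

merge_config = MergeConfiguration.model_validate(yaml.safe_load(CONFIG_YAML))

run_merge(
    merge_config,
    out_path="./merged-model",           # merged weights are written here
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU when one is present
        copy_tokenizer=True,             # carry the tokenizer into the output
    ),
)
```

The same YAML config can also be run from the command line with `mergekit-yaml config.yml ./merged-model`; the Python form is shown only to make the configuration and invocation explicit in one place.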