dariog committed on
Commit 622ad42 (verified)
Parent: fc73a0b

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -225,7 +225,7 @@ print(tokenizer.decode(response, skip_special_tokens=True))
  ## Training Details

  ### Supervised fine-tuning
- SFT on top of Qwen2.5-7B using axolotl (https://github.com/axolotl-ai-cloud/axolotl).
+ SFT on top of Qwen2.5-72B using axolotl (https://github.com/axolotl-ai-cloud/axolotl).

  We used Deepspeed's Zero-3 distributed training using the following hardware:

@@ -276,7 +276,7 @@ The training set consists of around 1.8B tokens, having 3 different types of data
  - Gradient accumulation steps: 4

  ### Model Merging
- The model trained was merged with the Qwen2.5-7B-Instruct model using the DARE_TIES technique. [Mergekit](https://github.com/arcee-ai/mergekit) was used to conduct the merging.
+ The model trained was merged with the Qwen2.5-72B-Instruct model using the DARE_TIES technique. [Mergekit](https://github.com/arcee-ai/mergekit) was used to conduct the merging.

  ### Model Alignment
  The model is aligned using the Direct Preference Optimization (DPO) technique through a two-step process:
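For context, the SFT step referenced above is driven by an axolotl YAML config launched with accelerate. The sketch below is illustrative only: it assumes axolotl's standard config schema and the ZeRO-3 DeepSpeed preset shipped with the repo, and every dataset path and hyperparameter is a placeholder except the gradient accumulation of 4 mentioned in the README.

```python
# Illustrative sketch of SFT on top of Qwen2.5-72B with axolotl and DeepSpeed ZeRO-3.
# Dataset path and most hyperparameters are placeholders, not the values actually used.
import pathlib
import subprocess
import textwrap

config = textwrap.dedent("""\
    base_model: Qwen/Qwen2.5-72B
    datasets:
      - path: ./data/sft.jsonl           # placeholder dataset
        type: alpaca                     # placeholder format
    sequence_len: 4096                   # placeholder
    micro_batch_size: 1                  # placeholder
    gradient_accumulation_steps: 4       # from the README
    num_epochs: 1                        # placeholder
    learning_rate: 1e-5                  # placeholder
    bf16: true
    deepspeed: deepspeed_configs/zero3.json   # ZeRO-3 preset shipped with axolotl
    output_dir: ./qwen2.5-72b-sft
    """)

pathlib.Path("sft.yaml").write_text(config)
subprocess.run(["accelerate", "launch", "-m", "axolotl.cli.train", "sft.yaml"], check=True)
```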
 
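The merging step can likewise be sketched with Mergekit's YAML schema and its `mergekit-yaml` CLI. The density and weight values and the checkpoint paths below are placeholders; the README does not state the actual merge parameters.

```python
# Illustrative sketch of a DARE_TIES merge of the SFT checkpoint with
# Qwen2.5-72B-Instruct using mergekit. Paths, densities and weights are placeholders.
import pathlib
import subprocess
import textwrap

merge_config = textwrap.dedent("""\
    merge_method: dare_ties
    base_model: Qwen/Qwen2.5-72B
    models:
      - model: ./qwen2.5-72b-sft             # SFT checkpoint (placeholder path)
        parameters:
          density: 0.5                       # placeholder
          weight: 0.5                        # placeholder
      - model: Qwen/Qwen2.5-72B-Instruct
        parameters:
          density: 0.5                       # placeholder
          weight: 0.5                        # placeholder
    dtype: bfloat16
    """)

pathlib.Path("dare_ties.yaml").write_text(merge_config)
subprocess.run(["mergekit-yaml", "dare_ties.yaml", "./qwen2.5-72b-merged"], check=True)
```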
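The README states only that alignment uses DPO in a two-step process; it names neither the library nor the preference data. A minimal sketch of a single DPO step, assuming a recent version of TRL's DPOTrainer and using a public preference dataset as a stand-in:

```python
# Illustrative sketch of one DPO step on the merged model using TRL's DPOTrainer.
# The library, dataset and hyperparameters are assumptions, not the authors' setup;
# trl-lib/ultrafeedback_binarized is only a stand-in for preference data.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_path = "./qwen2.5-72b-merged"  # placeholder: the merged checkpoint
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype="bfloat16")
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Preference data with "prompt", "chosen" and "rejected" columns.
train_dataset = load_dataset("trl-lib/ultrafeedback_binarized", split="train")

args = DPOConfig(
    output_dir="./qwen2.5-72b-dpo",
    beta=0.1,                          # placeholder
    per_device_train_batch_size=1,     # placeholder
    gradient_accumulation_steps=4,     # placeholder
)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,
)
trainer.train()
```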