---
datasets:
- jondurbin/gutenberg-dpo-v0.1
- HuggingFaceH4/ultrafeedback_binarized
base_model:
- Qwen/Qwen2.5-14B-Instruct
- v000000/Qwen2.5-14B-Gutenberg-1e-Delta
- tanliboy/lambda-qwen2.5-14b-dpo-test
library_name: transformers
tags:
- qwen
- qwen2.5
- finetune
- dpo
- orpo
- qwen2
- chat
- conversational
- instruct
- storywriting
- roleplay
license: apache-2.0
language:
- en
pipeline_tag: text-generation
---
# Qwen2.5-Lumen-14B
*A Qwen2.5 direct-preference-optimization finetune, trained for ~3 epochs.*
![image/png](https://cdn-uploads.huggingface.co/production/uploads/64f74b6e6389380c77562762/wCcJkdrVDUH6m0AN9Lv3B.png)
<b>A Qwen2.5 preference finetune targeting prompt adherence, storywriting, and roleplay.</b>
-------------------------------------------------------------------------------
## Training Notes
Trained [Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) for 2 epochs on an NVIDIA A100, using the [jondurbin/gutenberg-dpo-v0.1](https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1) dataset and saving several checkpoints along the way.
[Tanliboy](https://huggingface.co/tanliboy) trained [Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct) for 1 epoch on [HuggingFaceH4/ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized). (Credit to Tanliboy! *Check out his model [here](https://huggingface.co/tanliboy/lambda-qwen2.5-14b-dpo-test).*)
*The checkpoints were then mass-merged, with Qwen2.5-14B-Instruct as the base model.*
## Merge
* Merged *"Ultrafeedback-Binarized DPO"* and *"Gutenberg DPO"* with sophosympatheia's <b>SLERP</b> gradient.
* Merged *"Qwen2.5-14B-Instruct"* and *"Gutenberg DPO"* with sophosympatheia's <b>SLERP</b> gradient.
* Merged all <b>DPO checkpoints</b> and <b>SLERP</b> variations with <b>MODEL_STOCK</b>, which uses the geometric properties of the checkpoints to capture the most performant aspects of all runs/merges. Model Stock was chosen because of the similarity between the merged models.
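The SLERP gradient steps above can be sketched as a mergekit configuration. The fragment below is illustrative only: the `t` gradient values and `layer_range` shown here are assumptions in the style of sophosympatheia's gradients, not the values actually used for this model.

```yaml
# Hypothetical SLERP gradient merge of the two DPO finetunes
# (illustrative t values, not the actual recipe)
slices:
  - sources:
      - model: tanliboy/lambda-qwen2.5-14b-dpo-test    # "Ultrafeedback-Binarized DPO"
        layer_range: [0, 48]
      - model: v000000/Qwen2.5-14B-Gutenberg-1e-Delta  # "Gutenberg DPO"
        layer_range: [0, 48]
merge_method: slerp
base_model: tanliboy/lambda-qwen2.5-14b-dpo-test
parameters:
  t:
    # gradient: interpolation factor varies across layer depth
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```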
## Recipe
```yaml
models:
- model: v000000/Qwen2.5-14B-Gutenberg-1e-Delta
- model: v000000/Qwen2.5-14B-Gutenberg-0.6e-Sequential
- model: v000000/Qwen2.5-14B-Gutenberg-0.25e-Early
- model: v000000/Qwen2.5-14B-Gutenberg-2e-Sequential
- model: v000000/Qwen2.5-14B-Gutenberg-0.37e-Early
- model: v000000/Qwen2.5-14B-Gutenberg-2e-Zeta
- model: v000000/Qwen2.5-14B-Gutenberg-1e-Theta
- model: tanliboy/lambda-qwen2.5-14b-dpo-test
- model: v000000/Qwen2.5-14B-Gutenberg-1e-Delta
- model: tanliboy/lambda-qwen2.5-14b-dpo-test
- model: v000000/Qwen2.5-14B-Gutenberg-UltraLambda-Slerpeno
- model: v000000/Qwen2.5-14B-Gutenberg-Instruct-Slerpeno
base_model: v000000/Qwen2.5-14B-Gutenberg-1e-Delta
merge_method: model_stock
dtype: bfloat16
```
### Finetune and merge
This model is the result of both finetuning and merging pre-trained language models.
### Models Merged
Merging used the Model Stock method ([arXiv:2403.19522](https://arxiv.org/abs/2403.19522)).
The following models were included in the merge:
* v000000/Qwen2.5-14B-Gutenberg-1e-Delta
* v000000/Qwen2.5-14B-Gutenberg-0.6e-Sequential
* v000000/Qwen2.5-14B-Gutenberg-0.25e-Early
* v000000/Qwen2.5-14B-Gutenberg-2e-Sequential
* v000000/Qwen2.5-14B-Gutenberg-0.37e-Early
* v000000/Qwen2.5-14B-Gutenberg-2e-Zeta
* v000000/Qwen2.5-14B-Gutenberg-1e-Theta
* v000000/Qwen2.5-14B-Gutenberg-UltraLambda-Slerpeno
* v000000/Qwen2.5-14B-Gutenberg-Instruct-Slerpeno
* tanliboy/lambda-qwen2.5-14b-dpo-test
* Context length: full 131,072 tokens; generation up to 8,192 tokens.
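As with the upstream Qwen2.5-14B-Instruct card, contexts beyond 32,768 tokens typically require enabling YaRN rope scaling. A sketch of the `config.json` addition, following the upstream Qwen2.5 instructions:

```json
{
  "rope_scaling": {
    "factor": 4.0,
    "original_max_position_embeddings": 32768,
    "type": "yarn"
  }
}
```

Static YaRN scaling applies to all inputs, so it may slightly degrade quality on short texts; enable it only when long contexts are actually needed.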