---
license: cc-by-nc-4.0
base_model: PhigRange-2.7B-slerp
tags:
- generated_from_trainer
- DPO
- instruct
- finetune
- chatml
- gpt4
- synthetic data
- distillation
model-index:
- name: PhigRange-DPO
  results: []
datasets:
- mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha
language:
- en
library_name: transformers
pipeline_tag: text-generation
---
# PhigRange-DPO
![image/png](https://cdn-uploads.huggingface.co/production/uploads/660cfe98280a82e38fe4ef49/1aDHvNk5pebHacGnzaHv9.png)
PhigRange-DPO is a DPO fine-tune of [johnsnowlabs/PhigRange-2.7B-Slerp](https://huggingface.co/mlabonne/NeuralMonarch-7B/) using the [mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha](https://huggingface.co/datasets/mlabonne/chatml-OpenHermes2.5-dpo-binarized-alpha) preference dataset. The model was trained for 1080 steps.
## 🏆 Evaluation results
### Coming Soon
## 💻 Usage
```python
# Install dependencies first (in a notebook cell: !pip install -qU transformers accelerate)
from transformers import AutoTokenizer
import transformers
import torch

model = "johnsnowlabs/PhigRange-DPO"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the conversation with the model's chat template
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Build a text-generation pipeline in half precision; device_map="auto" places the weights
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a response and print the prompt plus the generated continuation
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
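The `chatml` tag suggests that `apply_chat_template` renders the conversation in a ChatML-style layout. The following is a minimal sketch of that layout, assuming the standard `<|im_start|>` and `<|im_end|>` markers; the helper `to_chatml` is hypothetical and is not part of `transformers`, and the tokenizer's own template remains the source of truth.

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render messages in the standard ChatML layout.

    Assumption: the markers below follow the common ChatML convention;
    they are not read from this model's tokenizer config.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
        for m in messages
    ]
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [{"role": "user", "content": "What is a large language model?"}]
print(to_chatml(messages))
```

Comparing this string with the output of `tokenizer.apply_chat_template(...)` is a quick way to check whether the model's template matches plain ChatML.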
## Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-04
- train_batch_size: 1
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: AdamOptimizer32bit
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 100
- training_steps: 1080
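
To make the numbers above concrete, here is a rough sketch of how the effective batch size and the learning rate at a given step could be derived. It assumes a single device and a cosine schedule with linear warmup that decays to zero over one half-cycle, which is a common default; the exact scheduler used for this run is not stated in the card, and the function names are illustrative.

```python
import math

# Values taken from the hyperparameter list above
BASE_LR = 2e-4
WARMUP_STEPS = 100
TRAINING_STEPS = 1080
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8
NUM_DEVICES = 1  # assumption: single-device training


def effective_batch_size():
    """Examples contributing to each optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def lr_at(step):
    """Learning rate at an optimizer step (assumed warmup + half-cosine decay)."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to BASE_LR
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    # Half-cosine decay from BASE_LR down to 0
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


print("effective batch size:", effective_batch_size())
print("preference pairs seen:", effective_batch_size() * TRAINING_STEPS)
for step in (0, 50, 100, 590, 1080):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at step 100 and falls to half its peak midway through the decay phase (step 590).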
## Framework versions
- Transformers 4.38.0.dev0
- Pytorch 2.1.2+cu118
- Datasets 2.17.0
- Tokenizers 0.15.0