metadata
license: apache-2.0
datasets:
- Intel/orca_dpo_pairs
Model Summary
Neuralphi-2 is an experiment in fine-tuning with DPO (Direct Preference Optimization). It was made following Maxime Labonne's excellent article about fine-tuning Mistral-7b with DPO. The base model is phi-2-sft, which was fine-tuned with DPO on the Intel/orca_dpo_pairs dataset.
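For readers unfamiliar with DPO, here is a minimal sketch of the per-example loss from the DPO formulation, written in plain Python. It is not the training code used for this model: the function name `dpo_loss`, its arguments, and the default `beta=0.1` are assumptions chosen for illustration. The inputs are summed log-probabilities of the chosen and rejected answers under the model being trained (policy) and under a frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    beta=0.1 is an illustrative default, not a value taken from this model card.
    """
    # How much more the policy prefers each answer than the reference does.
    chosen_gain = policy_chosen_logp - ref_chosen_logp
    rejected_gain = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_gain - rejected_gain)
    # Numerically stable -log(sigmoid(z)) = log(1 + exp(-z)).
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Policy identical to the reference: the loss equals log(2).
print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
# Policy favors the chosen answer more than the reference does: lower loss.
print(dpo_loss(-8.0, -14.0, -10.0, -12.0))
```

The loss falls as the policy raises the likelihood of the chosen answer relative to the rejected one, measured against the reference model.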
Prompt Format
"""### Human: {instruction}
### Assistant:"""
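As a small sketch of how the template above might be filled in, the helper below substitutes an instruction into the prompt string. The names `PROMPT_TEMPLATE` and `build_prompt` are hypothetical and not part of the model's repository. Stripping surrounding whitespace from the instruction is also an assumption.

```python
# The prompt format from this model card, kept verbatim.
PROMPT_TEMPLATE = """### Human: {instruction}
### Assistant:"""


def build_prompt(instruction: str) -> str:
    """Fill the template with a single user instruction (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


print(build_prompt("Explain DPO in one sentence."))
```

The resulting string ends with `### Assistant:`, so the model's generated text is the assistant's reply.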