neuralphi-2 / README.md

xz56

license: apache-2.0
datasets:
  - Intel/orca_dpo_pairs

Model Summary

Neuralphi-2 is an experiment in fine-tuning with Direct Preference Optimization (DPO). It follows Maxime Labonne's excellent article about fine-tuning Mistral-7B. The model is phi-2-sft fine-tuned with DPO on the Intel/orca_dpo_pairs dataset.
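For readers new to DPO, here is a minimal sketch of the per-example loss it optimizes, written in plain Python. It is not the training code used for this model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions made for illustration. The inputs are the summed log-probabilities that the policy being trained and a frozen reference model assign to the chosen and rejected responses for the same prompt.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * (chosen_ratio - rejected_ratio))).

    All names and the default beta are illustrative assumptions.
    """
    # How much more (in log space) the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Positive margin means the policy prefers the chosen response more than the reference.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow for large |m|.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree exactly, the margin is zero and the loss equals `log(2)`. The loss falls as the policy shifts probability toward the chosen response relative to the reference.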

Prompt Format

"""### Human: {instruction}

### Assistant:"""
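The template above can be filled in with a small helper. This is a sketch only: the name `build_prompt` is hypothetical, and it assumes the surrounding triple quotes are Python string delimiters rather than part of the prompt itself.

```python
# Template copied from the prompt format above; the triple quotes are
# treated as Python string delimiters, not as prompt text (an assumption).
PROMPT_TEMPLATE = """### Human: {instruction}

### Assistant:"""


def build_prompt(instruction: str) -> str:
    """Insert a user instruction into the Neuralphi-2 prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


if __name__ == "__main__":
    print(build_prompt("Explain what DPO is in one sentence."))
```

The resulting string ends at `### Assistant:`, so the model's generated text is the assistant's reply.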