princeton-nlp/Llama-3-Instruct-8B-ORPO-v0.2


Update README.md

#3

by CombinHorizon - opened Oct 8, 2024

base: refs/heads/main ← from: refs/pr/3


README.md +13 -0


@@ -1 +1,14 @@
+---
+license: llama3
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- ORPO
+datasets:
+- princeton-nlp/llama3-ultrafeedback-armorm
+language:
+- en
+base_model:
+- meta-llama/Meta-Llama-3-8B-Instruct
+---
 This is a model released from the preprint: [SimPO: Simple Preference Optimization with a Reference-Free Reward](https://arxiv.org/abs/2405.14734). Please refer to our [repository](https://github.com/princeton-nlp/SimPO) for more details.
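
The block added by this PR is YAML front matter: metadata delimited by `---` lines at the top of README.md, followed by the unchanged description. As a rough illustration of how such a header can be separated from the body and read, here is a minimal standard-library sketch. The helper names `split_front_matter` and `parse_flat_yaml` are hypothetical, the body text in the sample is a placeholder, and the parser only handles the flat subset used in this diff (scalar values and `- item` lists); a full YAML parser would be the safer choice for anything more complex.

```python
# Sketch only: split a model-card README into front matter and body.
# Helper names are hypothetical; the parser covers just the flat YAML
# subset seen in this PR (scalars and "- item" lists).

README = """\
---
license: llama3
library_name: transformers
pipeline_tag: text-generation
tags:
- ORPO
datasets:
- princeton-nlp/llama3-ultrafeedback-armorm
language:
- en
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
---
Model description goes here (placeholder body).
"""


def split_front_matter(text):
    """Return (metadata_lines, body); metadata is empty if none is found."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return [], text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return [], text  # unterminated block: treat everything as body


def parse_flat_yaml(meta_lines):
    """Parse 'key: value' scalars and 'key:' followed by '- item' lists."""
    meta = {}
    current_key = None  # key of the list currently being filled, if any
    for raw in meta_lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- ") and current_key is not None:
            meta[current_key].append(line[2:].strip())
        else:
            key, _, value = line.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []
                current_key = key
    return meta


meta_lines, body = split_front_matter(README)
metadata = parse_flat_yaml(meta_lines)
print(metadata["pipeline_tag"])
print(metadata["base_model"])
```

Fields such as `pipeline_tag` and `base_model` parsed this way correspond one-to-one with the `+` lines in the diff above; the description line stays in `body` untouched.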