allenai/tulu-v2.5-ppo-13b-uf-mean

hamishivi committed on Jun 12

Commit 1723acf • 1 Parent(s): 2b31b4e

Update README.md

Files changed (1)

README.md +1 -1

@@ -55,7 +55,7 @@ We have included a [chat template](https://huggingface.co/docs/transformers/main
 ## Intended uses & limitations
 The model was initially fine-tuned on a filtered and preprocessed of the [Tulu V2 mix dataset](https://huggingface.co/datasets/allenai/tulu-v2-sft-mixture), which contains a diverse range of human created instructions and synthetic dialogues generated primarily by other LLMs.
-We then further aligned the model with a [Jax DPO trainer](https://github.com/hamishivi/EasyLM/blob/main/EasyLM/models/llama/llama_train_dpo.py) built on [EasyLM](https://github.com/young-geng/EasyLM) on the dataset mentioned above.
+We then further aligned the model with a [Jax PPO trainer](https://github.com/hamishivi/EasyLM/blob/main/EasyLM/models/llama/llama_train_ppo.py) built on [EasyLM](https://github.com/young-geng/EasyLM) on the dataset mentioned above.
 ## Bias, Risks, and Limitations