---
language:
- en
- de
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- orpo
base_model: cstr/phi-3-orpo-v8_16
---
# Uploaded model
- Developed by: cstr
- License: apache-2.0
- Finetuned from model: cstr/phi-3-orpo-v8_16
This is a quick experiment: only 1,000 ORPO training steps on a German dataset.

This Llama-architecture model was trained 2x faster with Unsloth and Hugging Face's TRL library.
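A minimal inference sketch with `transformers` (assumptions: the repo id below is taken from the card's `base_model` field and may differ from this upload's actual id; the prompt and generation parameters are illustrative only):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id, copied from the card metadata; replace with this
# model's actual Hub id if it was uploaded under a different name.
model_id = "cstr/phi-3-orpo-v8_16"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# German prompt, since the model was fine-tuned on a German dataset.
messages = [{"role": "user", "content": "Erkläre kurz, was ORPO-Training ist."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```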