metadata
language:
- en
- de
license: apache-2.0
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
- orpo
base_model: vonjack/Phi-3-mini-4k-instruct-LLaMAfied
- Developed by: cstr
- License: apache-2.0
- Finetuned from model: vonjack/Phi-3-mini-4k-instruct-LLaMAfied
This is a quick experiment: the model was fine-tuned for only 150 ORPO steps on a German dataset.
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
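
For readers unfamiliar with ORPO, the following is a minimal pure-Python sketch of the per-example objective as it is commonly described: the usual negative log-likelihood on the chosen answer plus a weighted odds-ratio penalty that favors the chosen answer over the rejected one. It is an illustration only, not the training code used for this model. The function names `log_odds` and `orpo_loss` and the weight `lam=0.1` are assumptions made for this example; this card does not state the hyperparameters that were used.

```python
import math


def log_odds(logp: float) -> float:
    """Return log(p / (1 - p)) given logp = log(p), with logp < 0."""
    return logp - math.log1p(-math.exp(logp))


def orpo_loss(nll_chosen: float, logp_chosen: float, logp_rejected: float,
              lam: float = 0.1) -> float:
    """Per-example ORPO-style loss (illustrative sketch).

    nll_chosen    -- negative log-likelihood of the chosen answer (SFT term)
    logp_chosen   -- length-averaged log-probability of the chosen answer
    logp_rejected -- length-averaged log-probability of the rejected answer
    lam           -- weight of the odds-ratio term (hypothetical default)
    """
    log_odds_ratio = log_odds(logp_chosen) - log_odds(logp_rejected)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * odds_ratio_term


# The penalty shrinks when the chosen answer is more likely than the rejected one.
preferred = orpo_loss(nll_chosen=0.0, logp_chosen=-0.2, logp_rejected=-1.0)
reversed_pair = orpo_loss(nll_chosen=0.0, logp_chosen=-1.0, logp_rejected=-0.2)
print(preferred < reversed_pair)
```

In practice TRL computes this over batches of token log-probabilities on tensors; the scalar version here only shows the shape of the objective.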