This model is a fine-tuned version of LLaMA, trained on the en<>yo MENYO-20k dataset. The default Llama-2 tokenizer is used. The wandb logs can be found here: . Training included 1 epoch on bidirectional data.
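
Below is a minimal usage sketch with the `transformers` library. The model id and the prompt format are assumptions (the card does not specify them); substitute this repository's actual checkpoint name and whatever prompt template was used during fine-tuning.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical model id -- replace with this repo's actual checkpoint.
model_id = "your-org/llama2-menyo20k-en-yo"

# The card states the default Llama-2 tokenizer is used, so loading it
# from the same checkpoint should resolve to that tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Assumed prompt format; adjust to match the template used in training.
prompt = "Translate English to Yoruba: Good morning."
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding kept short for a translation-length output.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```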