Overview

This model is rinna's [rinna/llama-3-youko-8b] fine-tuned with LoRA on a small number of English-Japanese parallel sentences for English-to-Japanese translation. On the flores200 devtest set it scores 0.9126 COMET (Unbabel/wmt22-comet-da) and 35.2 BLEU ("tok": "ja-mecab-0.996-IPA").
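The scores above could be recomputed along the following lines. This is a sketch, not the author's evaluation script. It assumes the `sacrebleu` package is installed with its Japanese extras (`pip install "sacrebleu[ja]"`, which provides the `ja-mecab` tokenizer) and that the `unbabel-comet` package is installed. The helper names `build_comet_inputs`, `score_bleu`, and `score_comet` are hypothetical.

```python
def build_comet_inputs(sources, hypotheses, references):
    """Pair up source, system output, and reference in the record format COMET expects."""
    if not (len(sources) == len(hypotheses) == len(references)):
        raise ValueError("sources, hypotheses, and references must have equal length")
    return [
        {"src": src, "mt": hyp, "ref": ref}
        for src, hyp, ref in zip(sources, hypotheses, references)
    ]


def score_bleu(hypotheses, references):
    """Corpus-level BLEU with MeCab tokenization (assumes sacrebleu[ja] is installed)."""
    import sacrebleu  # imported lazily so the helpers above work without it

    return sacrebleu.corpus_bleu(hypotheses, [references], tokenize="ja-mecab").score


def score_comet(sources, hypotheses, references, gpus=0):
    """System-level COMET score (assumes unbabel-comet is installed; downloads the checkpoint)."""
    from comet import download_model, load_from_checkpoint

    checkpoint_path = download_model("Unbabel/wmt22-comet-da")
    comet_model = load_from_checkpoint(checkpoint_path)
    records = build_comet_inputs(sources, hypotheses, references)
    return comet_model.predict(records, batch_size=8, gpus=gpus).system_score
```

Here `sources` would be the English side of flores200 devtest, `references` the Japanese side, and `hypotheses` the model's translations, all as lists of strings in the same order.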

  • Model architecture

    A 32-layer, 4096-hidden-size transformer-based language model. Refer to the Llama 3 Model Card for architecture details.
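As a rough sanity check, the two figures above are enough to estimate the parameter count once the remaining dimensions are filled in. In the sketch below, the layer count and hidden size come from the description above; the vocabulary size, feed-forward width, and attention head counts are assumptions taken from the published Llama 3 8B configuration, not from this card.

```python
# From the description above.
NUM_LAYERS = 32
HIDDEN_SIZE = 4096

# Assumed from the published Llama 3 8B configuration (not stated in this card).
VOCAB_SIZE = 128_256
INTERMEDIATE_SIZE = 14_336
NUM_ATTENTION_HEADS = 32
NUM_KV_HEADS = 8  # grouped-query attention: keys/values use fewer heads


def count_parameters() -> int:
    """Count weights in a Llama-style decoder (no biases, untied output head)."""
    head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS
    kv_dim = NUM_KV_HEADS * head_dim
    # Query and output projections are hidden x hidden; key and value are hidden x kv_dim.
    attention = 2 * HIDDEN_SIZE * HIDDEN_SIZE + 2 * HIDDEN_SIZE * kv_dim
    # Gate, up, and down projections of the feed-forward block.
    mlp = 3 * HIDDEN_SIZE * INTERMEDIATE_SIZE
    # Two RMSNorm weight vectors per layer.
    norms = 2 * HIDDEN_SIZE
    per_layer = attention + mlp + norms
    # Input embedding matrix plus a separate output projection.
    embeddings = 2 * VOCAB_SIZE * HIDDEN_SIZE
    final_norm = HIDDEN_SIZE
    return NUM_LAYERS * per_layer + embeddings + final_norm


total = count_parameters()
print(f"{total:,} parameters (~{total / 1e9:.2f}B)")
# prints: 8,030,261,248 parameters (~8.03B)
```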


How to use the model

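import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Prompt template used with this model; the Japanese strings are kept verbatim.
# English glosses:
#   prefix:            "### Please translate the following English text into Japanese:\nEnglish:\n"
#   response_template: "\n### Japanese:\n"
response_template = "\n###  日本語:\n"
prefix = "###  次の英語のテキストを日本語に翻訳してください:\n英語:\n"


def create_input(text, tokenizer):
    """Wrap the English source text in the translation prompt and tokenize it."""
    text = f"{prefix}{text}{response_template}"
    input_ids = tokenizer.encode(text, return_tensors="pt")
    return input_ids


model_id = "lyu-boxuan/llama-3-youko-8b-En-Ja-MT-LoRA"
# attn_implementation="flash_attention_2" requires the flash-attn package and a
# GPU that supports it.
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, attn_implementation="flash_attention_2"
).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)

en = "LLMs Are Here but Not Quite There Yet"
input_ids = create_input(en, tokenizer).to(model.device)
# Deterministic beam search with 5 beams.
outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    num_beams=5,
    do_sample=False,
    early_stopping=True,
)
# Drop the prompt tokens and decode only the generated translation.
response = outputs[0][input_ids.shape[-1] :]
print(tokenizer.decode(response, skip_special_tokens=True))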
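To translate several sentences at once, the same prompt can be applied to a batch. The following is a minimal sketch under stated assumptions rather than a method documented by the author: `build_prompts` and `translate_batch` are hypothetical helpers, the generation settings are copied from the example above, and reusing the end-of-sequence token for padding is an assumption made because the tokenizer may not define a padding token.

```python
RESPONSE_TEMPLATE = "\n###  日本語:\n"
PREFIX = "###  次の英語のテキストを日本語に翻訳してください:\n英語:\n"


def build_prompts(texts):
    """Wrap each English sentence in the translation prompt."""
    return [f"{PREFIX}{text}{RESPONSE_TEMPLATE}" for text in texts]


def translate_batch(texts, model, tokenizer, max_new_tokens=256):
    """Translate a list of English sentences in a single generate() call."""
    # Decoder-only models continue from the last position, so pad on the left.
    # Note that this changes the tokenizer's settings in place.
    tokenizer.padding_side = "left"
    if tokenizer.pad_token is None:
        # Assumption: reuse the end-of-sequence token for padding.
        tokenizer.pad_token = tokenizer.eos_token

    batch = tokenizer(build_prompts(texts), return_tensors="pt", padding=True)
    batch = batch.to(model.device)
    outputs = model.generate(
        **batch,
        max_new_tokens=max_new_tokens,
        num_beams=5,
        do_sample=False,
        early_stopping=True,
    )
    # Every row shares the same padded prompt length, so one slice removes all prompts.
    generated = outputs[:, batch["input_ids"].shape[-1] :]
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

With the `model` and `tokenizer` loaded as in the example above, a call would look like `translate_batch(["Good morning.", "See you tomorrow."], model, tokenizer)`.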

Tokenization

The model uses the original meta-llama/Meta-Llama-3-8B tokenizer.
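One way to check this claim locally is to compare token IDs from both tokenizers on a few sample strings. This is a sketch only: `same_tokenization` and `load_and_compare` are hypothetical helpers, and downloading `meta-llama/Meta-Llama-3-8B` requires accepting its license and authenticating with the Hugging Face Hub.

```python
def same_tokenization(tokenizer_a, tokenizer_b, samples):
    """Return True if both tokenizers produce identical token IDs for every sample."""
    return all(
        list(tokenizer_a.encode(text)) == list(tokenizer_b.encode(text))
        for text in samples
    )


def load_and_compare(samples):
    """Load both tokenizers from the Hub and compare them (needs network access)."""
    from transformers import AutoTokenizer  # imported lazily; only needed here

    ours = AutoTokenizer.from_pretrained("lyu-boxuan/llama-3-youko-8b-En-Ja-MT-LoRA")
    base = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
    return same_tokenization(ours, base, samples)
```

Matching IDs on a handful of samples only suggests the tokenizers are the same; it does not prove it for every input.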

References

@article{llama3modelcard,
    title={Llama 3 Model Card},
    author={AI@Meta},
    year={2024},
    url = {https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md}
}
@software{gpt-neox-library,
    title = {{GPT-NeoX: Large Scale Autoregressive Language Modeling in PyTorch}},
    author = {Andonian, Alex and Anthony, Quentin and Biderman, Stella and Black, Sid and Gali, Preetham and Gao, Leo and Hallahan, Eric and Levy-Kramer, Josh and Leahy, Connor and Nestler, Lucas and Parker, Kip and Pieler, Michael and Purohit, Shivanshu and Songz, Tri and Phil, Wang and Weinbach, Samuel},
    doi = {10.5281/zenodo.5879544},
    month = {8},
    year = {2021},
    version = {0.0.1},
    url = {https://www.github.com/eleutherai/gpt-neox},
}
@misc{rinna-llama-3-youko-8b, 
    title = {rinna/llama-3-youko-8b}, 
    author = {Mitsuda, Koh and Sawada, Kei},
    url = {https://huggingface.co/rinna/llama-3-youko-8b}, 
}
@inproceedings{sawada2024release,
    title = {Release of Pre-Trained Models for the {J}apanese Language},
    author = {Sawada, Kei and Zhao, Tianyu and Shing, Makoto and Mitsui, Kentaro and Kaga, Akio and Hono, Yukiya and Wakatsuki, Toshiaki and Mitsuda, Koh},
    booktitle = {Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)},
    month = {5},
    year = {2024},
    url = {https://arxiv.org/abs/2404.01657},
}

License
