Llama 3-8B Turkish Model

This repo contains the experimental-educational fine-tuned model of Meta's new Llama 3.2-1B that can be used for different purposes.

Trained with NVIDIA RTX 3070 Ti, took around 6 hours.

Example Usages

You can use it from Transformers:

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("myzens/llama3-8b-tr-finetuned")
model = AutoModelForCausalLM.from_pretrained("myzens/llama3-8b-tr-finetuned")

alpaca_prompt = """
Instruction:
{}

Input:
{}

Response:
{}"""

inputs = tokenizer([
    alpaca_prompt.format(
        "",
        "Ankara'da gezilebilecek 3 yeri söyle ve ne olduklarını kısaca açıkla.",
        "",
)], return_tensors = "pt").to("cuda")


outputs = model.generate(**inputs, max_new_tokens=192)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Transformers Pipeline:

from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

tokenizer = AutoTokenizer.from_pretrained("myzens/llama3-8b-tr-finetuned")
model = AutoModelForCausalLM.from_pretrained("myzens/llama3-8b-tr-finetuned")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

alpaca_prompt = """
Instruction:
{}

Input:
{}

Response:
{}"""

input = alpaca_prompt.format(
        "",
        "Ankara'da gezilebilecek 3 yeri söyle ve ne olduklarını kısaca açıkla.",
        "",
)

pipe(input)

Output:

Instruction:


Input:
Ankara'da gezilebilecek 3 yeri söyle ve ne olduklarını kısaca açıkla.

Response:
1. Anıtkabir - Mustafa Kemal Atatürk'ün mezarı
2. Gençlik ve Spor Sarayı - spor etkinliklerinin yapıldığı yer
3. Kızılay Meydanı - Ankara'nın merkezinde bulunan bir meydan

Important Notes

We recommend you to use an Alpaca Prompt Template or another template, otherwise you can see generations with no meanings or repeating the same sentence constantly.
Use the model with a CUDA supported GPU.

Fine-tuned by emre570.

emre570
/

llama3.2-1b-tr-qlora

Llama 3-8B Turkish Model

Example Usages

Important Notes

Model tree for emre570/llama3.2-1b-tr-qlora

Dataset used to train emre570/llama3.2-1b-tr-qlora