File size: 3,731 Bytes
5906627 a1f3e11 1e68fe9 a1f3e11 8eec5b4 ac03e55 8eec5b4 bb67c21 8eec5b4 ac03e55 8eec5b4 ac03e55 8eec5b4 bb67c21 8eec5b4 e6462f9 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 |
---
license: llama3
language:
- tr
model-index:
- name: cere-llama-3-8b-tr
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: AI2 Reasoning Challenge TR
type: ai2_arc
config: ARC-Challenge
split: test
args:
num_few_shot: 25
metrics:
- type: acc
value: 44.03
name: accuracy
- task:
type: text-generation
name: Text Generation
dataset:
name: HellaSwag TR
type: hellaswag
split: validation
args:
num_few_shot: 10
metrics:
- type: acc
value: 46.73
name: accuracy
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU TR
type: cais/mmlu
config: all
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 49.11
name: accuracy
- task:
type: text-generation
name: Text Generation
dataset:
name: TruthfulQA TR
type: truthful_qa
config: multiple_choice
split: validation
args:
num_few_shot: 0
metrics:
- type: acc
name: accuracy
value: 48.21
- task:
type: text-generation
name: Text Generation
dataset:
name: Winogrande TR
type: winogrande
config: winogrande_xl
split: validation
args:
num_few_shot: 10
metrics:
- type: acc
value: 54.98
name: accuracy
- task:
type: text-generation
name: Text Generation
dataset:
name: GSM8k TR
type: gsm8k
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 51.78
name: accuracy
---
# CERE-LLMA-3-8b-TR
This model is an fine-tuned version of a Llama3 8b Large Language Model (LLM) for Turkish. It was trained on a high quality Turkish instruction sets created from various open-source and internal resources. Turkish Instruction dataset carefully annotated to carry out Turkish instructions in an accurate and organized manner.
## Model Details
- **Base Model**: LLMA 3 7B based LLM
- **Tokenizer Extension**: Specifically extended for Turkish
- **Training Dataset**: Cleaned Turkish raw data with 5 billion tokens, custom Turkish instruction sets
- **Training Method**: Initially with DORA, followed by fine-tuning with LORA
[Open LLM Turkish Leaderboard v0.2 Evaluation Results]
Metric Value
Avg.
AI2 Reasoning Challenge_tr
HellaSwag_tr
MMLU_tr
TruthfulQA_tr
Winogrande _tr
GSM8k_tr
## Usage Examples
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
device = "cuda" # the device to load the model onto
model = AutoModelForCausalLM.from_pretrained(
"Cerebrum/cere-llama-3-8b-tr",
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Cerebrum/cere-llama-3-8b-tr")
prompt = "Python'da ekrana 'Merhaba Dünya' nasıl yazılır?"
messages = [
{"role": "system", "content": "Sen, Cerebrum Tech tarafından üretilen ve verilen talimatları takip ederek en iyi cevabı üretmeye çalışan yardımcı bir yapay zekasın."},
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(device)
generated_ids = model.generate(
model_inputs.input_ids,
temperature=0.3,
top_k=50,
top_p=0.9,
max_new_tokens=512,
repetition_penalty=1,
)
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
``` |