oguzhandoganoglu committed
Commit b075f1f • 1 Parent(s): 443524a

Update README.md

README.md CHANGED
@@ -4,17 +4,17 @@ language:
 - tr

 ---
-<img src="https://
+<img src="https://huggingface.co/CerebrumTech/cere-llama-3-8b-tr/resolve/main/cere2.png"
 alt="CEREBRUM LLM" width="420"/>


-# CERE
+# CERE-LLMA-3-8b-TR

-This model is an fine-tuned version of a Llama3
+This model is an fine-tuned version of a Llama3 8b Large Language Model (LLM) for Turkish. It was trained on a high quality Turkish instruction sets created from various open-source and internal resources. Turkish Instruction dataset carefully annotated to carry out Turkish instructions in an accurate and organized manner.

 ## Model Details

-- **Base Model**: LLMA 3
+- **Base Model**: LLMA 3 7B based LLM
 - **Tokenizer Extension**: Specifically extended for Turkish
 - **Training Dataset**: Cleaned Turkish raw data with 5 billion tokens, custom Turkish instruction sets
 - **Training Method**: Initially with DORA, followed by fine-tuning with LORA
@@ -37,11 +37,11 @@ from transformers import AutoModelForCausalLM, AutoTokenizer
 device = "cuda" # the device to load the model onto

 model = AutoModelForCausalLM.from_pretrained(
-    "Cerebrum/cere-llama-3
+    "Cerebrum/cere-llama-3-8b-tr",
     torch_dtype="auto",
     device_map="auto"
 )
-tokenizer = AutoTokenizer.from_pretrained("Cerebrum/cere-llama-3
+tokenizer = AutoTokenizer.from_pretrained("Cerebrum/cere-llama-3-8b-tr")

 prompt = "Python'da ekrana 'Merhaba Dünya' nasıl yazılır?"
 messages = [
@@ -68,4 +68,4 @@ generated_ids = [
 ]

 response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
-```
+```
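The final hunk shows only the closing `]` of a `generated_ids = [` expression, followed by `batch_decode`; the lines in between are not part of this diff. One pattern often seen in `transformers` usage snippets at this point is to slice the prompt tokens off each generated sequence before decoding. The sketch below illustrates that idea with plain Python lists. It is an assumption about what the elided lines might do, not the README's actual code; the helper name `strip_prompt_tokens` and the token IDs are hypothetical.

```python
def strip_prompt_tokens(input_ids_batch, output_ids_batch):
    """Return only the newly generated tokens for each sequence.

    Assumes each output sequence starts with its full prompt,
    as is typical for decoder-only generation.
    """
    return [
        output_ids[len(input_ids):]
        for input_ids, output_ids in zip(input_ids_batch, output_ids_batch)
    ]


# Hypothetical token IDs, made up for illustration only.
prompt_batch = [[101, 7, 8, 9]]
generated_batch = [[101, 7, 8, 9, 42, 43, 2]]

new_tokens = strip_prompt_tokens(prompt_batch, generated_batch)
print(new_tokens)
```

With a real model, the inner lists would be token ID tensors and the result would be passed to `tokenizer.batch_decode`, as in the last context line of the hunk.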