munish0838 committed on
Commit
082ab9e
1 Parent(s): 34e32be

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +78 -0
README.md ADDED
@@ -0,0 +1,78 @@
+
+ ---
+
+ license: apache-2.0
+ language:
+ - tr
+
+ ---
+
+ [![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)
+
+
+ # QuantFactory/Turkcell-LLM-7b-v1-GGUF
+ This is a quantized version of [TURKCELL/Turkcell-LLM-7b-v1](https://huggingface.co/TURKCELL/Turkcell-LLM-7b-v1) created using llama.cpp.
+
+ # Original Model Card
+
+
+ <img src="https://huggingface.co/TURKCELL/Turkcell-LLM-7b-v1/resolve/main/icon.jpeg"
+ alt="Turkcell LLM" width="300"/>
+
+ # Turkcell-LLM-7b-v1
+
+ This model is an extended version of a Mistral-based Large Language Model (LLM) for Turkish. It was first trained with the DoRA method on a cleaned Turkish raw dataset containing 5 billion tokens. It was then fine-tuned with the LoRA method on Turkish instruction sets created from various open-source and internal resources.
+
+ ## Model Details
+
+ - **Base Model**: Mistral 7B-based LLM
+ - **Tokenizer Extension**: Specifically extended for Turkish
+ - **Training Dataset**: Cleaned Turkish raw data with 5 billion tokens, plus custom Turkish instruction sets
+ - **Training Method**: Initial training with DoRA, followed by fine-tuning with LoRA
+
+ ### DoRA Configuration
+
+ - `lora_alpha`: 128
+ - `lora_dropout`: 0.05
+ - `r`: 64
+ - `target_modules`: "all-linear"
+
+
+ ### LoRA Fine-Tuning Configuration
+
+ - `lora_alpha`: 128
+ - `lora_dropout`: 0.05
+ - `r`: 256
+ - `target_modules`: "all-linear"
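
The two configurations above differ only in rank `r`, which changes both the adapter size and the effective scaling of the low-rank update. The sketch below works through that arithmetic under the standard LoRA convention that the update is scaled by `lora_alpha / r` and that a rank-`r` adapter on a `d_in x d_out` linear layer adds `r * (d_in + d_out)` weights. The `AdapterConfig` class is a hypothetical container, and the layer width of 4096 is an assumed example value; neither comes from the model card, which also does not say which training library was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterConfig:
    """Hypothetical container for the hyperparameters listed in the card."""
    name: str
    r: int
    lora_alpha: int
    lora_dropout: float

    @property
    def scaling(self) -> float:
        # Standard LoRA scaling factor applied to the low-rank update B @ A.
        return self.lora_alpha / self.r

    def added_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r).
        # DoRA also learns a per-layer magnitude vector, ignored here.
        return self.r * (d_in + d_out)


dora = AdapterConfig("DoRA stage", r=64, lora_alpha=128, lora_dropout=0.05)
lora = AdapterConfig("LoRA stage", r=256, lora_alpha=128, lora_dropout=0.05)

# Assumed width of a square projection layer (illustrative only).
D = 4096

for cfg in (dora, lora):
    print(cfg.name, cfg.scaling, cfg.added_params(D, D))
```

Under these assumptions, raising `r` from 64 to 256 at a fixed `lora_alpha` of 128 quadruples the low-rank weights per layer while lowering the scaling factor from 2.0 to 0.5.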
+
+ ## Usage Examples
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ device = "cuda"  # the device to load the model onto
+
+ model = AutoModelForCausalLM.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")
+ tokenizer = AutoTokenizer.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")
+
+ messages = [
+     {"role": "user", "content": "Türkiye'nin başkenti neresidir?"},
+ ]
+
+ # Render the conversation with the model's chat template and tokenize it.
+ encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
+
+ # Token ID for "<|im_end|>", passed to generate() as the stop token.
+ eos_token = tokenizer("<|im_end|>", add_special_tokens=False)["input_ids"][0]
+
+ model_inputs = encodeds.to(device)
+ model.to(device)
+
+ generated_ids = model.generate(
+     model_inputs,
+     max_new_tokens=1024,
+     do_sample=True,
+     eos_token_id=eos_token,
+ )
+
+ decoded = tokenizer.batch_decode(generated_ids)
+ print(decoded[0])
+ ```