QuantFactory/Turkcell-LLM-7b-v1-GGUF

munish0838 committed 18 days ago

Commit 082ab9e • 1 Parent(s): 34e32be

Upload README.md with huggingface_hub

Files changed (1)

README.md +78 -0

README.md ADDED

	@@ -0,0 +1,78 @@

+---
+license: apache-2.0
+language:
+- tr
+---
+[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)
+# QuantFactory/Turkcell-LLM-7b-v1-GGUF
+This is a quantized version of [TURKCELL/Turkcell-LLM-7b-v1](https://huggingface.co/TURKCELL/Turkcell-LLM-7b-v1), created using llama.cpp.
+# Original Model Card
+<img src="https://huggingface.co/TURKCELL/Turkcell-LLM-7b-v1/resolve/main/icon.jpeg"
+alt="Turkcell LLM" width="300"/>
+# Turkcell-LLM-7b-v1
+This model is an extended version of a Mistral-based Large Language Model (LLM) for Turkish. It was first trained with the DoRA method on a cleaned Turkish raw dataset containing 5 billion tokens. It was then fine-tuned with the LoRA method on Turkish instruction sets created from various open-source and internal resources.
+## Model Details
+- **Base Model**: Mistral 7B-based LLM
+- **Tokenizer Extension**: Specifically extended for Turkish
+- **Training Dataset**: Cleaned Turkish raw data with 5 billion tokens, plus custom Turkish instruction sets
+- **Training Method**: Initial training with DoRA, followed by fine-tuning with LoRA
+### DoRA Configuration
+- `lora_alpha`: 128
+- `lora_dropout`: 0.05
+- `r`: 64
+- `target_modules`: "all-linear"
+### LoRA Fine-Tuning Configuration
+- `lora_alpha`: 128
+- `lora_dropout`: 0.05
+- `r`: 256
+- `target_modules`: "all-linear"
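
The two configurations above differ only in the rank `r`, which changes both the adapter size and the effective scaling of the low-rank update. The following is a minimal sketch of that arithmetic, assuming the standard LoRA convention that the update is scaled by `lora_alpha / r`. The `use_dora` key, the helper names, and the 4096 x 4096 layer shape are illustrative assumptions and are not taken from the original model card, which does not say which training library was used.

```python
# Hyperparameters copied from the two configuration lists above.
# ASSUMPTION: the "use_dora" key is only an illustrative flag for the stage type;
# the model card does not name the training library or its exact options.
DORA_STAGE = {
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": "all-linear",
    "use_dora": True,
}
LORA_STAGE = {
    "r": 256,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": "all-linear",
    "use_dora": False,
}


def lora_scaling(cfg):
    """Scaling applied to the low-rank update under the usual alpha / r convention."""
    return cfg["lora_alpha"] / cfg["r"]


def low_rank_params(cfg, d_in, d_out):
    """Parameter count of the two low-rank factors A (r x d_in) and B (d_out x r).

    This counts only the low-rank factors for one linear layer; any extra
    per-layer parameters of a specific method are not included.
    """
    return cfg["r"] * (d_in + d_out)


if __name__ == "__main__":
    # ASSUMPTION: a 4096 x 4096 linear layer is used purely as an example shape.
    d_in, d_out = 4096, 4096
    for name, cfg in (("DoRA stage", DORA_STAGE), ("LoRA stage", LORA_STAGE)):
        print(name, "scaling:", lora_scaling(cfg),
              "low-rank params per layer:", low_rank_params(cfg, d_in, d_out))
```

Under this convention, keeping `lora_alpha` at 128 while raising `r` from 64 to 256 lowers the scaling factor from 2.0 to 0.5 and quadruples the size of the low-rank factors per layer.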
+## Usage Examples
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+device = "cuda"  # the device to load the model onto
+
+# Load the original (unquantized) model and its tokenizer from the Hub.
+model = AutoModelForCausalLM.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")
+tokenizer = AutoTokenizer.from_pretrained("TURKCELL/Turkcell-LLM-7b-v1")
+model.to(device)
+
+messages = [
+    {"role": "user", "content": "Türkiye'nin başkenti neresidir?"},
+]
+
+# Render the conversation with the tokenizer's chat template and tokenize it.
+encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
+model_inputs = encodeds.to(device)
+
+# Use the id of the "<|im_end|>" token as the end-of-sequence marker for generation.
+eos_token = tokenizer("<|im_end|>", add_special_tokens=False)["input_ids"][0]
+
+generated_ids = model.generate(model_inputs,
+                               max_new_tokens=1024,
+                               do_sample=True,
+                               eos_token_id=eos_token)
+
+# Decode the prompt plus the generated continuation back to text.
+decoded = tokenizer.batch_decode(generated_ids)
+print(decoded[0])