---
datasets:
- TFLai/Turkish-Alpaca
language:
- tr
---

# Model Card for malhajar/Mixtral-8x7B-v0.1-turkish

malhajar/Mixtral-8x7B-v0.1-turkish is a fine-tuned version of Mixtral-8x7B-v0.1, trained with supervised fine-tuning (SFT).

The model can answer questions in Turkish because it was fine-tuned on a Turkish dataset, specifically [`Turkish-Alpaca`](https://huggingface.co/datasets/TFLai/Turkish-Alpaca).

### Model Description

- **Developed by:** [`Mohamad Alhajar`](https://www.linkedin.com/in/muhammet-alhajar/)
- **Language(s) (NLP):** Turkish
- **Finetuned from model:** [`mistralai/Mixtral-8x7B-v0.1`](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)

### Prompt Template

```
### Instruction:

<prompt> (without the <>)

### Response:
```
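
The template above can be filled in programmatically. The sketch below is one possible way to do that; `build_prompt` is a hypothetical helper that is not part of this model's repository, and the exact blank-line layout between the markers is an assumption read off the template as displayed.

```python
def build_prompt(instruction: str) -> str:
    """Fill the prompt template with a single instruction (hypothetical helper)."""
    # Assumption: blank lines separate the markers from the instruction text,
    # mirroring the layout of the template shown above.
    return f"### Instruction:\n\n{instruction.strip()}\n\n### Response:\n"


if __name__ == "__main__":
    # "What is the largest city in Türkiye?"
    print(build_prompt("Türkiyenin en büyük şehir nedir?"))
```

If generations look off, it may be worth trying a different whitespace layout between the markers, since the layout used during fine-tuning is not documented here.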

## How to Get Started with the Model

Use the code sample below to interact with the model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "malhajar/Mixtral-8x7B-v0.1-turkish"

# Load the weights in half precision; device_map="auto" spreads them over the available devices
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.float16,
    revision="main",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# "What is the largest city in Türkiye?"
question = "Türkiyenin en büyük şehir nedir?"

# For generating a response
prompt = f'''
### Instruction: {question} ### Response:
'''

# Tokenize the prompt and move the input IDs to the model's device
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

output = model.generate(
    inputs=input_ids,
    max_new_tokens=512,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.3,
)
response = tokenizer.decode(output[0])

print(response)
```
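
The decoded string contains the prompt as well as the generated continuation, so the answer usually needs to be cut out of it. The following is a minimal sketch of one way to do that with plain string handling. `extract_response` is a hypothetical helper, the `</s>` end-of-sequence literal is an assumption about the tokenizer, and the sample string is made up for illustration rather than real model output.

```python
def extract_response(decoded: str, marker: str = "### Response:", eos: str = "</s>") -> str:
    """Return the text after the first response marker (hypothetical helper)."""
    # Split at the first marker, which is the one that belongs to the prompt.
    head, sep, tail = decoded.partition(marker)
    # If the marker is missing, fall back to the whole decoded string.
    answer = tail if sep else head
    # Assumption: the tokenizer renders end-of-sequence as the literal "</s>".
    return answer.replace(eos, "").strip()


if __name__ == "__main__":
    # Made-up decoded text, used only to show the string handling.
    sample = "<s> \n### Instruction: Soru ### Response:\nİstanbul.</s>"
    print(extract_response(sample))
```

Alternatively, passing `skip_special_tokens=True` to `tokenizer.decode` drops special tokens at decode time, which leaves only the prompt to be stripped.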