---
license: apache-2.0
datasets:
- malhajar/alpaca-gpt4-ar
language:
- ar
- en
---

# Model Card for malhajar/Mistral-7B-v0.1-arabic

malhajar/Mistral-7B-v0.1-arabic is a fine-tuned version of [`Mistral-7B-v0.1`](https://huggingface.co/mistralai/Mistral-7B-v0.1), trained with supervised fine-tuning (SFT) and the freeze method.

The model answers questions in a chat format because it is fine-tuned specifically on the instructions in [`alpaca-gpt4-ar`](https://huggingface.co/datasets/malhajar/alpaca-gpt4-ar).

### Model Description

- **Developed by:** [`Mohamad Alhajar`](https://www.linkedin.com/in/muhammet-alhajar/)
- **Language(s) (NLP):** Arabic
- **Finetuned from model:** [`mistralai/Mistral-7B-v0.1`](https://huggingface.co/mistralai/Mistral-7B-v0.1)

### Prompt Template

```
### Instruction:

<prompt> (without the <>)

### Response:
```

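The template can be filled in programmatically. The following is a minimal sketch, not part of the original card: `build_prompt` is a hypothetical helper name, and the exact whitespace between the headers and the text (one blank line each) is an assumption read off the template rather than a documented requirement.

```python
def build_prompt(instruction: str) -> str:
    """Fill the instruction/response template with a user instruction."""
    # Assumption: one blank line separates each header from the text,
    # mirroring the layout of the prompt template.
    return f"### Instruction:\n\n{instruction.strip()}\n\n### Response:\n"


# Example with an Arabic question ("What is life?")
prompt = build_prompt("ما هي الحياة؟")
print(prompt)
```

The resulting string can be passed to the tokenizer in place of a hand-written prompt.
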
## How to Get Started with the Model

Use the code sample below to interact with the model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "malhajar/Mistral-7B-v0.1-arabic"

# Load the model in half precision and place it on the available device(s)
model = AutoModelForCausalLM.from_pretrained(model_id,
                                             device_map="auto",
                                             torch_dtype=torch.float16,
                                             revision="main")

tokenizer = AutoTokenizer.from_pretrained(model_id)

# "What is life?"
question = "ما هي الحياة؟"

# For generating a response
prompt = f'''
### Instruction: {question} ### Response:
'''

# Move the input ids to the same device as the model
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
output = model.generate(inputs=input_ids, max_new_tokens=512, pad_token_id=tokenizer.eos_token_id, top_k=50, do_sample=True, repetition_penalty=1.3,
                        top_p=0.95)
response = tokenizer.decode(output[0], skip_special_tokens=True)

print(response)
```
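
The decoded output contains the prompt as well as the generated text, because `generate` returns the input tokens followed by the new ones. One way to keep only the answer is to split on the response marker. This is a rough sketch under stated assumptions: `extract_response` is a hypothetical helper, not part of the original card, and it assumes the model output follows the `### Instruction:` / `### Response:` layout of the prompt template.

```python
INSTRUCTION_MARKER = "### Instruction:"
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str) -> str:
    """Return only the answer part of a decoded generation."""
    # Keep what follows the first response marker (the one that ends the prompt).
    _, found, tail = decoded.partition(RESPONSE_MARKER)
    if not found:
        # No marker present: return the text unchanged apart from whitespace.
        return decoded.strip()
    # Drop any further instruction block the model may have generated on its own.
    answer, _, _ = tail.partition(INSTRUCTION_MARKER)
    return answer.strip()


# Hypothetical decoded string, used only to illustrate the helper
example = "### Instruction: What is life? ### Response:\nA hypothetical answer."
print(extract_response(example))
```

An alternative that avoids string matching is to slice the generated tokens before decoding, for example `output[0][input_ids.shape[-1]:]`.
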