metadata
license: llama3
language:
  - fa
  - en
library_name: transformers
tags:
  - LLM
  - llama-3
  - PartAI
  - conversational
base_model:
  - meta-llama/Meta-Llama-3-8B-Instruct
  - PartAI/Dorna-Llama3-8B-Instruct

Model Details

The Dorna models are a family of decoder-only models, specifically trained/fine-tuned on Persian data, developed by Part AI. The initial release from this family is an 8B instruct model, Dorna-Llama3-8B-Instruct, which is built on the Meta Llama 3 Instruct model.

In this repo, we provide the bf16 model and quantized models in the GGUF format, including Q2_K, Q3_K, Q3_K_L, Q3_K_M, Q3_K_S, Q4_0, Q4_1, Q4_K_M, Q4_K_S, Q5_0, Q5_1, Q5_K_M, Q5_K_S, Q6_K and Q8_0.

An in-depth report that includes several performance charts is available here. Check it out.

| Name | Quant Method | Bits | Memory |
|---|---|---|---|
| dorna-llama3-8b-instruct.Q2_K.gguf | Q2_K | 2 | 3.2 GB |
| dorna-llama3-8b-instruct.Q3_K_L.gguf | Q3_K_L | 3 | 4.3 GB |
| dorna-llama3-8b-instruct.Q3_K_M.gguf | Q3_K_M | 3 | 4.1 GB |
| dorna-llama3-8b-instruct.Q3_K_S.gguf | Q3_K_S | 3 | 3.7 GB |
| dorna-llama3-8b-instruct.Q4_0.gguf | Q4_0 | 4 | 4.7 GB |
| dorna-llama3-8b-instruct.Q4_1.gguf | Q4_1 | 4 | 5.2 GB |
| dorna-llama3-8b-instruct.Q4_K_M.gguf | Q4_K_M | 4 | 4.9 GB |
| dorna-llama3-8b-instruct.Q4_K_S.gguf | Q4_K_S | 4 | 4.7 GB |
| dorna-llama3-8b-instruct.Q5_0.gguf | Q5_0 | 5 | 5.6 GB |
| dorna-llama3-8b-instruct.Q5_1.gguf | Q5_1 | 5 | 6.1 GB |
| dorna-llama3-8b-instruct.Q5_K_M.gguf | Q5_K_M | 5 | 5.73 GB |
| dorna-llama3-8b-instruct.Q5_K_S.gguf | Q5_K_S | 5 | 5.6 GB |
| dorna-llama3-8b-instruct.Q6_K.gguf | Q6_K | 6 | 6.6 GB |
| dorna-llama3-8b-instruct.Q8_0.gguf (recommended) | Q8_0 | 8 | 8.5 GB |
| dorna-llama3-8b-instruct.bf16.gguf | None | 16 | 16.2 GB |
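
As a rough aid for choosing a file, the sketch below copies the Memory column from the table and picks the largest variant that fits a given budget. The helper name `pick_quant` and the use of the listed figure as a hard limit are assumptions for illustration; real memory use also depends on context length and runtime overhead.

```python
# Memory figures in GB, copied from the table; bf16 is the unquantized file.
MEMORY_GB = {
    "Q2_K": 3.2, "Q3_K_S": 3.7, "Q3_K_M": 4.1, "Q3_K_L": 4.3,
    "Q4_0": 4.7, "Q4_K_S": 4.7, "Q4_K_M": 4.9, "Q4_1": 5.2,
    "Q5_0": 5.6, "Q5_K_S": 5.6, "Q5_K_M": 5.73, "Q5_1": 6.1,
    "Q6_K": 6.6, "Q8_0": 8.5, "bf16": 16.2,
}


def pick_quant(budget_gb):
    """Return the file name of the largest variant that fits, or None if none does."""
    fitting = [(size, name) for name, size in MEMORY_GB.items() if size <= budget_gb]
    if not fitting:
        return None
    _, name = max(fitting)  # largest listed memory; ties broken by name
    return f"dorna-llama3-8b-instruct.{name}.gguf"


print(pick_quant(6.0))
```

Larger files generally preserve more of the original model's quality, which is why this hypothetical helper prefers the biggest one that fits.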

Requirements

We recommend using llama-cpp-python, the Python bindings for llama.cpp. The following command installs a prebuilt v0.2.78 wheel for CPython 3.10 on Linux x86_64:

!pip install https://github.com/abetlen/llama-cpp-python/releases/download/v0.2.78/llama_cpp_python-0.2.78-cp310-cp310-linux_x86_64.whl
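
The `cp310` and `linux_x86_64` tags in the wheel file name tie it to one interpreter version and platform, so a quick check before installing can save a failed install. This is a minimal sketch; the helper name `wheel_is_compatible` is hypothetical, and it does not check the Python implementation (the `cp` tag assumes CPython).

```python
import platform
import sys


def wheel_is_compatible(py_version=None, system=None, machine=None):
    """Compare an environment (default: the current one) with the wheel's tags."""
    py_version = tuple(sys.version_info[:2]) if py_version is None else tuple(py_version)
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    # cp310 -> Python 3.10; linux_x86_64 -> 64-bit x86 Linux
    return py_version == (3, 10) and system == "Linux" and machine == "x86_64"


if not wheel_is_compatible():
    print("This wheel does not match the current environment; another build is needed.")
```

On other setups, a different release wheel or a source install of llama-cpp-python would be the likely alternative.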

How to use

Rather than cloning the entire repository, which would pull every GGUF file, download only the file you need, either manually or with huggingface-cli (pip install huggingface_hub) as demonstrated below:

!huggingface-cli login --token $HUGGING_FACE_HUB_TOKEN
!huggingface-cli download PartAI/Dorna-Llama3-8B-Instruct-GGUF dorna-llama3-8b-instruct.Q8_0.gguf --local-dir . --local-dir-use-symlinks False
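
The same download can also be done from Python. The sketch below assumes the `huggingface_hub` package is installed and uses its `hf_hub_download` function; the wrapper name `download_gguf` is illustrative, and the import sits inside the function only so that defining it does not require the package.

```python
import os

REPO_ID = "PartAI/Dorna-Llama3-8B-Instruct-GGUF"
FILENAME = "dorna-llama3-8b-instruct.Q8_0.gguf"


def download_gguf(repo_id=REPO_ID, filename=FILENAME, local_dir="."):
    """Download a single GGUF file from the Hub and return its local path."""
    from huggingface_hub import hf_hub_download  # requires: pip install huggingface_hub

    return hf_hub_download(
        repo_id=repo_id,
        filename=filename,
        local_dir=local_dir,
        # Same environment variable as the CLI login above; None falls back to a cached login.
        token=os.environ.get("HUGGING_FACE_HUB_TOKEN"),
    )
```

Calling `download_gguf()` would fetch the Q8_0 file into the current directory; pass another `filename` from the table to fetch a smaller variant.
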
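from llama_cpp import Llama

# Load the downloaded GGUF file with the Llama 3 chat template.
llm = Llama(
    model_path="dorna-llama3-8b-instruct.Q8_0.gguf",
    chat_format="llama-3",
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=2048,       # context window in tokens
)

messages = [
    {"role": "system", "content": "You are a helpful Persian assistant. Please answer questions in the asked language."},
    # User question in Persian: "Is A4 paper bigger, or A5?"
    {"role": "user", "content": "کاغذ A4 بزرگ تر است یا A5؟"},
]

# A low temperature keeps the answer close to deterministic.
result = llm.create_chat_completion(
    messages=messages,
    top_p=0.85,
    temperature=0.1,
)

print(result)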
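
`create_chat_completion` returns an OpenAI-style completion dictionary, so `print(result)` shows the whole structure rather than just the answer. A minimal sketch for pulling out only the assistant text follows; `extract_reply` is a hypothetical helper, and `sample` is a hand-written stand-in with the same shape as a real result, not actual model output.

```python
def extract_reply(result):
    """Return the assistant message text from a chat completion dictionary."""
    return result["choices"][0]["message"]["content"]


# Hand-written placeholder shaped like a real result (not model output).
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "<assistant reply>"},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
}

print(extract_reply(sample))
```

With the model loaded as above, `print(extract_reply(result))` would print only the model's answer.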

Contact us

If you have any questions regarding this model, you can reach us via the community on Hugging Face.