Quantization made by Richard Erkhov.

Qwen2.5-0.5B-Instruct-MLX - GGUF

Model creator: https://huggingface.co/TheBlueObserver/
Original model: https://huggingface.co/TheBlueObserver/Qwen2.5-0.5B-Instruct-MLX/

Name	Quant method	Size
Qwen2.5-0.5B-Instruct-MLX.Q2_K.gguf	Q2_K	0.32GB
Qwen2.5-0.5B-Instruct-MLX.IQ3_XS.gguf	IQ3_XS	0.32GB
Qwen2.5-0.5B-Instruct-MLX.IQ3_S.gguf	IQ3_S	0.32GB
Qwen2.5-0.5B-Instruct-MLX.Q3_K_S.gguf	Q3_K_S	0.32GB
Qwen2.5-0.5B-Instruct-MLX.IQ3_M.gguf	IQ3_M	0.32GB
Qwen2.5-0.5B-Instruct-MLX.Q3_K.gguf	Q3_K	0.33GB
Qwen2.5-0.5B-Instruct-MLX.Q3_K_M.gguf	Q3_K_M	0.33GB
Qwen2.5-0.5B-Instruct-MLX.Q3_K_L.gguf	Q3_K_L	0.34GB
Qwen2.5-0.5B-Instruct-MLX.IQ4_XS.gguf	IQ4_XS	0.33GB
Qwen2.5-0.5B-Instruct-MLX.Q4_0.gguf	Q4_0	0.33GB
Qwen2.5-0.5B-Instruct-MLX.IQ4_NL.gguf	IQ4_NL	0.33GB
Qwen2.5-0.5B-Instruct-MLX.Q4_K_S.gguf	Q4_K_S	0.36GB
Qwen2.5-0.5B-Instruct-MLX.Q4_K.gguf	Q4_K	0.37GB
Qwen2.5-0.5B-Instruct-MLX.Q4_K_M.gguf	Q4_K_M	0.37GB
Qwen2.5-0.5B-Instruct-MLX.Q4_1.gguf	Q4_1	0.35GB
Qwen2.5-0.5B-Instruct-MLX.Q5_0.gguf	Q5_0	0.37GB
Qwen2.5-0.5B-Instruct-MLX.Q5_K_S.gguf	Q5_K_S	0.38GB
Qwen2.5-0.5B-Instruct-MLX.Q5_K.gguf	Q5_K	0.39GB
Qwen2.5-0.5B-Instruct-MLX.Q5_K_M.gguf	Q5_K_M	0.39GB
Qwen2.5-0.5B-Instruct-MLX.Q5_1.gguf	Q5_1	0.39GB
Qwen2.5-0.5B-Instruct-MLX.Q6_K.gguf	Q6_K	0.47GB
Qwen2.5-0.5B-Instruct-MLX.Q8_0.gguf	Q8_0	0.49GB
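
Choosing a file from the table above usually comes down to a size budget. The sketch below is one way to automate that choice; it is not part of the original card. The sizes are copied from the table, the helper names `largest_quant_within` and `gguf_filename` are hypothetical, and treating a larger file as the better choice is an assumption (larger quants generally keep more precision, but this card reports no quality measurements).

```python
# Sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 0.32, "IQ3_XS": 0.32, "IQ3_S": 0.32, "Q3_K_S": 0.32,
    "IQ3_M": 0.32, "Q3_K": 0.33, "Q3_K_M": 0.33, "Q3_K_L": 0.34,
    "IQ4_XS": 0.33, "Q4_0": 0.33, "IQ4_NL": 0.33, "Q4_K_S": 0.36,
    "Q4_K": 0.37, "Q4_K_M": 0.37, "Q4_1": 0.35, "Q5_0": 0.37,
    "Q5_K_S": 0.38, "Q5_K": 0.39, "Q5_K_M": 0.39, "Q5_1": 0.39,
    "Q6_K": 0.47, "Q8_0": 0.49,
}


def largest_quant_within(budget_gb, sizes=QUANT_SIZES_GB):
    """Return the largest listed quant that fits in budget_gb, or None.

    Assumption: file size is used as a rough stand-in for output quality.
    On a size tie, the quant listed first in the table wins.
    """
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def gguf_filename(quant, stem="Qwen2.5-0.5B-Instruct-MLX"):
    """Build the file name using the naming pattern seen in the table."""
    return f"{stem}.{quant}.gguf"


if __name__ == "__main__":
    choice = largest_quant_within(0.47)
    print(choice, gguf_filename(choice))
```

With a 0.47 GB budget this selects the Q6_K file; a budget below 0.32 GB returns None because no listed file fits.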

Original model description:

base_model: Qwen/Qwen2.5-0.5B-Instruct
language:
- en
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- chat
- mlx

TheBlueObserver/Qwen2.5-0.5B-Instruct-MLX

The model TheBlueObserver/Qwen2.5-0.5B-Instruct-MLX was converted to MLX format from Qwen/Qwen2.5-0.5B-Instruct using mlx-lm version 0.20.2.

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

# Download (if needed) and load the MLX weights and tokenizer.
model, tokenizer = load("TheBlueObserver/Qwen2.5-0.5B-Instruct-MLX")

prompt = "hello"

# Wrap the raw prompt in the model's chat template when one is defined.
if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )

# verbose=True prints the generated text as it is produced.
response = generate(model, tokenizer, prompt=prompt, verbose=True)
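
To make the chat-template step above more concrete, the following sketch approximates the kind of string that `apply_chat_template` returns. It assumes the ChatML-style layout used by Qwen models (`<|im_start|>` / `<|im_end|>` markers); the function `format_chatml` is hypothetical and is not part of mlx-lm. The template shipped with the tokenizer is authoritative and may do more, such as inserting a default system message, so this is for reading and debugging only.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render chat messages in a ChatML-style layout (approximation only).

    Assumption: each turn is written as
        <|im_start|>{role}\n{content}<|im_end|>\n
    and, when add_generation_prompt is True, an open assistant turn is
    appended so the model continues from there.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n" for m in messages
    ]
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(format_chatml([{"role": "user", "content": "hello"}]))
```

Comparing this output with the string returned by the tokenizer is a quick way to see what the real template adds beyond the bare turns.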