---
pipeline_tag: text-generation
inference: false
tags:
- facebook
- meta
- llama
- llama-2
- mlx
---

# **Llama 2**

Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This repository contains the 7B fine-tuned chat model in `npz` format, suitable for use with Apple's MLX framework.

Weights have been converted to `float16` from the original `bfloat16` type, because `numpy` is not compatible with `bfloat16` out of the box.
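
To illustrate what this conversion means for individual values, here is a minimal sketch using only the Python standard library. It is not the script used to produce these weights; the helper names (`bfloat16_bits_to_float`, `round_to_float16`, `convert_weight`) are hypothetical. It relies on two facts: a `bfloat16` value is the upper 16 bits of a `float32`, and the `struct` module's `e` format encodes IEEE half precision (`float16`).

```python
import struct


def bfloat16_bits_to_float(bits: int) -> float:
    """Decode a 16-bit bfloat16 pattern by padding it to a float32."""
    # bfloat16 keeps the sign, the 8 exponent bits and the top 7 mantissa
    # bits of a float32, so shifting left by 16 yields a valid float32.
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]


def round_to_float16(value: float) -> float:
    """Round a Python float to float16 precision and return it as a float."""
    # struct raises OverflowError if the value does not fit in float16.
    return struct.unpack("<e", struct.pack("<e", value))[0]


def convert_weight(bits: int) -> float:
    """Hypothetical helper: bfloat16 bit pattern -> value stored as float16."""
    return round_to_float16(bfloat16_bits_to_float(bits))


if __name__ == "__main__":
    for pattern in (0x3F80, 0x3FC0, 0xBF00):
        print(hex(pattern), convert_weight(pattern))
```

`float16` has more mantissa bits than `bfloat16` (10 versus 7) but fewer exponent bits (5 versus 8), so values of moderate magnitude convert exactly, while very large values can overflow and very small ones can lose precision or round to zero.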

To use the model with [MLX](https://github.com/ml-explore/mlx):

```bash
# Install mlx, mlx-examples, huggingface-cli
pip install mlx
pip install huggingface_hub hf_transfer
git clone https://github.com/ml-explore/mlx-examples.git

# Download model
export HF_HUB_ENABLE_HF_TRANSFER=1
huggingface-cli download --local-dir Llama-2-7b-chat-mlx mlx-llama/Llama-2-7b-chat-mlx

# Run example
python mlx-examples/llama/llama.py Llama-2-7b-chat-mlx/Llama-2-7b-chat.npz Llama-2-7b-chat-mlx/tokenizer.model "My name is "
```

Please refer to the [original model card](https://huggingface.co/meta-llama/Llama-2-7b-chat) for details on Llama 2.