---
base_model: meta-llama/Meta-Llama-3.1-70B-Instruct
language:
- en
- de
- fr
- it
- pt
- hi
- es
- th
library_name: transformers
license: llama3.1
tags:
- facebook
- meta
- pytorch
- llama
- llama-3
model-index:
- name: Meta-Llama-3.1-70B-Instruct-NF4
results: []
---
# Model Card for Meta-Llama-3.1-70B-Instruct-NF4
This is a **4-bit (NF4)** quantized version of `Llama 3.1 70B Instruct`, produced with `bitsandbytes` and `accelerate`; a sketch of the quantization configuration is shown below.
- **Developed by:** Farid Saud @ DSRS
- **License:** llama3.1
- **Base Model:** meta-llama/Meta-Llama-3.1-70B-Instruct
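As a minimal sketch, an NF4 quantization like this one is typically set up with a `BitsAndBytesConfig`. The exact parameters used for this checkpoint (compute dtype, double quantization, etc.) are assumptions, not confirmed settings:
```python
# Hypothetical sketch of the NF4 quantization setup (exact parameters assumed)
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4-bit
    bnb_4bit_quant_type="nf4",              # normal-float 4 (NF4) data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed compute dtype
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-70B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",  # requires `accelerate`
)
```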
## Use this model
Use a pipeline as a high-level helper:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="fsaudm/Meta-Llama-3.1-70B-Instruct-NF4",
    device_map="auto",  # spread the model across available GPUs (requires `accelerate`)
)

messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```
Or load the tokenizer and model directly:
```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("fsaudm/Meta-Llama-3.1-70B-Instruct-NF4")
model = AutoModelForCausalLM.from_pretrained(
    "fsaudm/Meta-Llama-3.1-70B-Instruct-NF4",
    device_map="auto",  # spread the model across available GPUs (requires `accelerate`)
)
```
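A minimal generation sketch using the model's chat template; the prompt and generation settings are illustrative assumptions:
```python
# Minimal generation example (prompt and settings are illustrative)
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```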
More information about the base model can be found in the original [meta-llama/Meta-Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct) model card.