---
license: cc-by-nc-4.0
language:
- en
pipeline_tag: conversational
tags:
- fair-use
- llama2
- ggml
---
# Llama2 Movie Character Finetuned 7F Quantized
Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp).

Use either with llama.cpp directly or with a Python wrapper such as ctransformers.

## ctransformers example

```sh
pip install ctransformers
```
```python
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "llama2-7f-fp16-ggml-q4.bin",
    model_type="llama",
    gpu_layers=100,      # use fewer (e.g. 20) if you have less GPU RAM
    max_new_tokens=50,
    stop=["###", "##"],
    threads=4,           # limit to 4 CPU threads
)

# The loaded model is callable; pass it a prompt string to generate text:
# print(llm("Your prompt here"))
```