File size: 1,447 Bytes
f1eb360
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
# Persistent Memory Bot
A chatbot that can remember all previous conversations.
Useful for any application that requires an LM studio chatbot and functions identically to a traditional python call of a local AI Application.
## TO INSTALL:
```
Pip install flask install 
Pip3 install huggingface-hub
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python

```
## Full docs:
```
# Base ctransformers with no GPU acceleration
pip install llama-cpp-python
# With NVidia CUDA acceleration
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
# Or with OpenBLAS acceleration
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
# Or with CLBLast acceleration
CMAKE_ARGS="-DLLAMA_CLBLAST=on" pip install llama-cpp-python
# Or with AMD ROCm GPU acceleration (Linux only)
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
# Or with Metal GPU acceleration for macOS systems only
CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python

# In windows, to set the variables CMAKE_ARGS in PowerShell, follow this format; eg for NVidia CUDA:
$env:CMAKE_ARGS = "-DLLAMA_OPENBLAS=on"
pip install llama-cpp-python

huggingface-cli download TheBloke/Silicon-Maid-7B-GGUF silicon-maid-7b.Q4_K_M.gguf --local-dir . --local-dir-use-symlinks False

huggingface-cli download lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF  Meta-Llama-3-8B-Instruct-Q8_0.gguf --local-dir . --local-dir-use-symlinks False


```