lukestanley committed
Commit 2831e2c · 1 Parent(s): a97efcc

Add more detailed setup notes with GPU, fork, and other pip dependencies

Files changed (1)
  1. README.md +26 -21
README.md CHANGED
@@ -38,31 +38,36 @@ Could Reddit, Twitter, Hacker News, or even YouTube comments be more calm and co
 
  ### Installation
 
- 1. Download a compatible and capable model like: [Mixtral-8x7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf?download=true)
- 2. Make sure it's named as expected by the next command.
- 3. Install dependencies:
- ```
- pip install requests pydantic llama-cpp-python llama-cpp-python[server] --upgrade
- ```
- 4. Start the LLM server:
- ```
- python3 -m llama_cpp.server --model mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf --port 5834 --n_ctx 4096 --use_mlock false
- ```
- These config options are not going to be optimal for a lot of setups, as it may not use GPU right away, but this can be configured with a different argument. Please check out https://llama-cpp-python.readthedocs.io/en/latest/ for more info.
-
- 5. Get the code up:
- ```
- git clone https://github.com/lukestanley/ChillTranslator.git
-
- cd ChillTranslator
- ```
 
  ### Usage
 
- ChillTranslator currently has an example spicy comment it works on fixing right away.
- This is how to see it in action:
  ```python
- python3 chill.py
  ```
 
  ## Contributing 🤝
 
 
  ### Installation
 
+ 1. Clone the project repository:
+ ```
+ git clone https://github.com/lukestanley/ChillTranslator.git
+ cd ChillTranslator
+ ```
+ 2. Download a compatible and capable model, like [Mixtral-8x7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf?download=true). For example:
+ ```
+ wget https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf?download=true -O mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf &
+ ```
+ 3. Install dependencies, including a special fork of `llama-cpp-python`, and Nvidia GPU support if needed:
+ ```
+ pip install requests pydantic uvicorn starlette fastapi sse_starlette starlette_context pydantic_settings
+
+ # If you have an Nvidia GPU, install the special fork of llama-cpp-python with CUBLAS support:
+ CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install git+https://github.com/lukestanley/llama-cpp-python.git@expose_json_grammar_convert_function
+ ```
+ If you don't have an Nvidia GPU, the `CMAKE_ARGS="-DLLAMA_CUBLAS=on"` prefix is not needed before the `pip install` command (see the CPU-only install example after this list).
+
+ 4. Start the LLM server with your chosen configuration. The example below is for Nvidia with `--n_gpu_layers` set to 20; different GPUs fit more or fewer layers. If you have no GPU, you don't need the `--n_gpu_layers` flag (see the CPU-only server example after this list):
+ ```
+ python3 -m llama_cpp.server --model mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf --port 5834 --n_ctx 4096 --use_mlock false --n_gpu_layers 20 &
+ ```
+ These config options may need tweaking. Please check out https://llama-cpp-python.readthedocs.io/en/latest/ for more info.
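+ For a CPU-only machine, the fork install from step 3 (a minimal sketch, simply dropping the CUBLAS flag as noted in that step) would look like this:
+ ```
+ # CPU-only build of the same llama-cpp-python fork, without CUBLAS:
+ pip install git+https://github.com/lukestanley/llama-cpp-python.git@expose_json_grammar_convert_function
+ ```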
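+ Likewise, a CPU-only server start (the step 4 command, minus `--n_gpu_layers`, as noted in that step) would be:
+ ```
+ python3 -m llama_cpp.server --model mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf --port 5834 --n_ctx 4096 --use_mlock false &
+ ```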
+
 
  ### Usage
 
+ ChillTranslator currently ships with an example spicy comment that it starts improving right away. This is how to see it in action:
  ```python
+ python3 chill.py
  ```
 
  ## Contributing 🤝