lukestanley committed
Commit 2831e2c · 1 Parent(s): a97efcc
Add more detailed setup notes with GPU, fork, and other pip dependencies

README.md CHANGED
@@ -38,31 +38,36 @@ Could Reddit, Twitter, Hacker News, or even YouTube comments be more calm and co
 
 ### Installation
 
-1.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+1. Clone the Project Repository:
+```
+git clone https://github.com/lukestanley/ChillTranslator.git
+cd ChillTranslator
+```
+2. Download a compatible and capable model like: [Mixtral-8x7B-Instruct-v0.1-GGUF](https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf?download=true). E.g:
+```
+wget https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF/resolve/main/mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf?download=true -O mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf &
+```
+3. Install dependencies, including a special fork of `llama-cpp-python`, and Nvidia GPU support if needed:
+```
+pip install requests pydantic uvicorn starlette fastapi sse_starlette starlette_context pydantic_settings
+
+# If you have an Nvidia GPU, install the special fork of llama-cpp-python with CUBLAS support:
+CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install git+https://github.com/lukestanley/llama-cpp-python.git@expose_json_grammar_convert_function
+```
+If you don't have an Nvidia GPU, the `CMAKE_ARGS="-DLLAMA_CUBLAS=on"` is not needed before the `pip install` command.
+
+4. Start the LLM server with your chosen configuration. Example for Nvidia with `--n_gpu_layers` set to 20; different GPUs fit more or less layers. If you have no GPU, you don't need the `--n_gpu_layers` flag:
+```
+python3 -m llama_cpp.server --model mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf --port 5834 --n_ctx 4096 --use_mlock false --n_gpu_layers 20 &
+```
+These config options may need tweaking. Please check out https://llama-cpp-python.readthedocs.io/en/latest/ for more info.
+
 
 ### Usage
 
-ChillTranslator currently has an example spicy comment it works on fixing right away.
-This is how to see it in action:
+ChillTranslator currently has an example spicy comment it works on fixing right away. This is how to see it in action:
 ```python
-
+python3 chill.py
 ```
 
 ## Contributing 🤝
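For CPU-only machines, the note after step 3 in the diff amounts to running the same install command with the CUDA flag dropped. This is shown here for convenience; it is derived from the commit's note rather than being a separate command in the commit:

```
# CPU-only install of the llama-cpp-python fork (CMAKE_ARGS prefix not needed without an Nvidia GPU)
pip install git+https://github.com/lukestanley/llama-cpp-python.git@expose_json_grammar_convert_function
```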
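Once the server from step 4 is running on port 5834, a quick sanity check is to query it over HTTP. The route below assumes the OpenAI-compatible API that `llama_cpp.server` exposes; it is not part of this commit:

```
# Assumed OpenAI-compatible route from llama-cpp-python's server; should list the loaded GGUF model
curl http://localhost:5834/v1/models
```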