writinwaters committed
Commit c2523f0 · 1 Parent(s): cd84e5d
Fixed a docusaurus display issue (#1431)
### What problem does this PR solve?
Fixes a Docusaurus display issue in `docs/guides/deploy_local_llm.md`: the raw HTML image tags used for the Ollama screenshots are replaced with standard Markdown image syntax so the screenshots render correctly on the docs site.
### Type of change
- [x] Documentation Update
docs/guides/deploy_local_llm.md
CHANGED
@@ -236,32 +236,28 @@ You may launch the Ollama service as below:
 ollama serve
 ```
 
-
+
 > Please set environment variable `OLLAMA_NUM_GPU` to `999` to make sure all layers of your model are running on Intel GPU, otherwise, some layers may run on CPU.
 
-
+
 > If your local LLM is running on Intel Arc™ A-Series Graphics with Linux OS (Kernel 6.2), it is recommended to additionally set the following environment variable for optimal performance before executing `ollama serve`:
 >
 > ```bash
 > export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 > ```
 
-
+
 > To allow the service to accept connections from all IP addresses, use `OLLAMA_HOST=0.0.0.0 ./ollama serve` instead of just `./ollama serve`.
 
 The console will display messages similar to the following:
 
-
-<img src="https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png" width=100%; />
-</a>
+![](https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png)
 
 ### 3. Pull and Run Ollama Model
 
 Keep the Ollama service on and open another terminal and run `./ollama pull <model_name>` in Linux (`ollama.exe pull <model_name>` in Windows) to automatically pull a model, e.g. `qwen2:latest`:
 
-
-<img src="https://llm-assets.readthedocs.io/en/latest/_images/ollama_pull.png" width=100%; />
-</a>
+![](https://llm-assets.readthedocs.io/en/latest/_images/ollama_pull.png)
 
 #### Run Ollama Model
 
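As a quick sanity check on the instructions quoted in this hunk, the launch sequence the notes describe boils down to the following sketch (assuming, as in the guide's `./ollama` examples, that the Ollama binary sits in the current working directory):

```bash
# Make sure every layer of the model runs on the Intel GPU rather than the CPU.
export OLLAMA_NUM_GPU=999

# Recommended on Intel Arc A-Series Graphics with Linux (kernel 6.2) before starting the service.
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1

# Accept connections from all IP addresses instead of localhost only.
OLLAMA_HOST=0.0.0.0 ./ollama serve
```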
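The pull step from the same hunk then runs in a second terminal while `ollama serve` stays up. The model name `qwen2:latest` is simply the example the guide uses, and the final `./ollama run` line is an assumed follow-up for the "Run Ollama Model" subsection rather than part of the quoted text:

```bash
# In a second terminal, with `ollama serve` still running:
./ollama pull qwen2:latest    # Linux; on Windows: ollama.exe pull qwen2:latest

# Assumed follow-up (not shown in the quoted hunk): open an interactive session
# with the pulled model to confirm it loads.
./ollama run qwen2:latest
```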