writinwaters committed
Commit c2523f0 · 1 Parent(s): cd84e5d

Fixed a docusaurus display issue (#1431)


### What problem does this PR solve?

GitHub-style alert markers (`> [!NOTE]`, `> [!TIP]`) and raw HTML `<a>`/`<img>` tags in `docs/guides/deploy_local_llm.md` do not display correctly on the Docusaurus site. This PR removes the alert markers and replaces the HTML image anchors with standard Markdown image syntax so the page renders properly.
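
For context: Docusaurus does not recognize GitHub's blockquote alert markers, so `> [!NOTE]` and `> [!TIP]` render as plain blockquote text rather than styled callouts. If the highlighted callout styling is wanted later, Docusaurus's own admonition directives could be used instead; a minimal sketch (not part of this diff), reusing the notes from the changed file:

```md
:::note
Set the environment variable `OLLAMA_NUM_GPU` to `999` to make sure all layers of your model run on the Intel GPU; otherwise, some layers may run on the CPU.
:::

:::tip
On Intel Arc™ A-Series Graphics with Linux OS (Kernel 6.2), export `SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1` before running `ollama serve` for optimal performance.
:::
```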

### Type of change


- [x] Documentation Update

Files changed (1)
  1. docs/guides/deploy_local_llm.md +5 -9
docs/guides/deploy_local_llm.md CHANGED
@@ -236,32 +236,28 @@ You may launch the Ollama service as below:
 ollama serve
 ```
 
-> [!NOTE]
+
 > Please set environment variable `OLLAMA_NUM_GPU` to `999` to make sure all layers of your model are running on Intel GPU, otherwise, some layers may run on CPU.
 
-> [!TIP]
+
 > If your local LLM is running on Intel Arc™ A-Series Graphics with Linux OS (Kernel 6.2), it is recommended to additionaly set the following environment variable for optimal performance before executing `ollama serve`:
 >
 > ```bash
 > export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
 > ```
 
-> [!NOTE]
+
 > To allow the service to accept connections from all IP addresses, use `OLLAMA_HOST=0.0.0.0 ./ollama serve` instead of just `./ollama serve`.
 
 The console will display messages similar to the following:
 
-<a href="https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png" target="_blank">
-<img src="https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png" width=100%; />
-</a>
+![](https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png)
 
 ### 3. Pull and Run Ollama Model
 
 Keep the Ollama service on and open another terminal and run `./ollama pull <model_name>` in Linux (`ollama.exe pull <model_name>` in Windows) to automatically pull a model. e.g. `qwen2:latest`:
 
-<a href="https://llm-assets.readthedocs.io/en/latest/_images/ollama_pull.png" target="_blank">
-<img src="https://llm-assets.readthedocs.io/en/latest/_images/ollama_pull.png" width=100%; />
-</a>
+![](https://llm-assets.readthedocs.io/en/latest/_images/ollama_pull.png)
 
 #### Run Ollama Model
 
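
Side note on the image change: Docusaurus compiles Markdown through MDX, which expects JSX-valid markup, so attributes like `width=100%;` in the removed `<img>` tags are a likely cause of the display issue. Plain Markdown image syntax avoids the problem entirely; if a sized, clickable image is needed again, a JSX-valid form along these lines should work (a sketch, assuming the same asset URL):

```md
<a href="https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png" target="_blank">
  <img src="https://llm-assets.readthedocs.io/en/latest/_images/ollama_serve.png" width="100%" alt="ollama serve console output" />
</a>
```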