|
---
sidebar_position: 5
slug: /deploy_local_llm
---
|
|
|
# Deploy a local LLM |
|
|
|
RAGFlow supports deploying LLMs locally using Ollama or Xinference. |
|
|
|
## Ollama |
|
|
|
[Ollama](https://github.com/ollama/ollama) enables one-click deployment of local LLMs.
|
|
|
### Install |
|
|
|
- [Ollama on Linux](https://github.com/ollama/ollama/blob/main/docs/linux.md) |
|
- [Ollama Windows Preview](https://github.com/ollama/ollama/blob/main/docs/windows.md) |
|
- [Docker](https://hub.docker.com/r/ollama/ollama) (a sample `docker run` command follows this list)
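
If you go the Docker route, the Docker Hub page above documents a CPU-only start command along these lines:

```bash
$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

This names the container `ollama` (the `docker exec` command below assumes that name) and exposes Ollama's default port, 11434.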
|
|
|
### Launch Ollama |
|
|
|
Decide which LLM you want to deploy ([here's a list of supported LLMs](https://ollama.com/library)), say, **mistral**:

```bash
$ ollama run mistral
```

Or, if Ollama is running in Docker:

```bash
$ docker exec -it ollama ollama run mistral
```
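
To verify that Ollama is up and serving the model, you can query its HTTP API directly (it listens on port 11434 by default):

```bash
# Ask the running mistral model for a single, non-streamed completion.
$ curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```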
|
|
|
### Use Ollama in RAGFlow |
|
|
|
- Go to 'Settings > Model Providers > Models to be added > Ollama'. |
|
|
|
 |
|
|
|
> Base URL: Enter the base URL where the Ollama service is accessible, e.g., `http://<your-ollama-endpoint-domain>:11434`.
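
To confirm that the base URL is reachable, send a plain GET request to the service root; a running Ollama instance answers with `Ollama is running`:

```bash
$ curl http://<your-ollama-endpoint-domain>:11434
Ollama is running
```

Note that if RAGFlow itself runs inside Docker, `localhost` refers to the RAGFlow container rather than the host, so use an address the container can actually reach (for example, `host.docker.internal` on Docker Desktop).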
|
|
|
- Use Ollama Models. |
|
|
|
 |
|
|
|
## Xinference |
|
|
|
Xorbits Inference ([Xinference](https://github.com/xorbitsai/inference)) is a versatile library for serving LLMs and other AI models locally.
|
|
|
### Install |
|
|
|
- [pip install "xinference[all]"](https://inference.readthedocs.io/en/latest/getting_started/installation.html) |
|
- [Docker](https://inference.readthedocs.io/en/latest/getting_started/using_docker_image.html) (a sample container start command appears below)
|
|
|
To start a local instance of Xinference, run the following command:

```bash
$ xinference-local --host 0.0.0.0 --port 9997
```
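
Or, if you installed Xinference via Docker, the same service can be started inside a container. A minimal sketch, assuming the `xprobe/xinference` image from the Docker guide above and an NVIDIA GPU (see that guide for CPU-only tags and pinned versions):

```bash
$ docker run -d --gpus all -p 9997:9997 xprobe/xinference:latest xinference-local -H 0.0.0.0
```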
|
### Launch Xinference |
|
|
|
Decide which LLM you want to deploy ([here's a list of supported LLMs](https://inference.readthedocs.io/en/latest/models/builtin/)), say, **mistral**.

Execute the following command to launch the model, replacing `${quantization}` with a quantization method the model supports (each model's page in the built-in list above shows its available options):

```bash
$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
```
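
Once the model is launched, Xinference exposes an OpenAI-compatible API, so you can sanity-check the deployment with a plain HTTP call; the model is addressed by the UID set with `-u` above:

```bash
$ curl http://localhost:9997/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [{"role": "user", "content": "Say hello in one sentence."}]
  }'
```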
|
|
|
### Use Xinference in RAGFlow |
|
|
|
- Go to 'Settings > Model Providers > Models to be added > Xinference'. |
|
|
|
 |
|
|
|
> Base URL: Enter the base URL where the Xinference service is accessible, e.g., `http://<your-xinference-endpoint-domain>:9997/v1`. Note the `/v1` suffix: RAGFlow talks to Xinference through its OpenAI-compatible route.
|
|
|
- Use Xinference Models. |
|
|
|
 |
|
 |