Improved ollama doc (#3787)

### What problem does this PR solve?

Improved ollama doc. Close #3723

### Type of change

- [x] Documentation Update

This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
:::

## Deploy local models using Ollama

[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.
- For a complete list of supported models and variants, see the [Ollama model library](https://ollama.com/library).
:::
### 1. Deploy Ollama using Docker

```bash
sudo docker run --name ollama -p 11434:11434 ollama/ollama
time=2024-12-02T02:20:21.360Z level=INFO source=routes.go:1248 msg="Listening on [::]:11434 (version 0.4.6)"
time=2024-12-02T02:20:21.360Z level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12]"
```
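The command above runs Ollama on CPU only. If your host has an NVIDIA GPU and the NVIDIA Container Toolkit installed, a GPU-enabled variant of the same command can be used instead; the sketch below also adds a named volume so that pulled models persist across container restarts:

```bash
# GPU-enabled variant: requires the NVIDIA Container Toolkit on the host.
# The named "ollama" volume keeps pulled models across container restarts.
sudo docker run --name ollama --gpus=all -v ollama:/root/.ollama -p 11434:11434 ollama/ollama
```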
Ensure that Ollama is listening on all IP addresses:
```bash
sudo ss -tunlp|grep 11434
tcp LISTEN 0 4096 0.0.0.0:11434 0.0.0.0:* users:(("docker-proxy",pid=794507,fd=4))
tcp LISTEN 0 4096 [::]:11434 [::]:* users:(("docker-proxy",pid=794513,fd=4))
```
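If Ollama was installed natively instead of through the Docker command above, it may listen on 127.0.0.1 only. In that case, setting the `OLLAMA_HOST` environment variable to `0.0.0.0` makes it reachable from other containers and machines; see the [Ollama FAQ](https://github.com/ollama/ollama/blob/main/docs/faq.md#how-do-i-configure-ollama-server). A sketch for a systemd-based install:

```bash
sudo systemctl edit ollama.service
# In the editor that opens, add the following lines, then save and exit:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl daemon-reload
sudo systemctl restart ollama
```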
Pull models as you need. It's recommended to start with `llama3.2` (a 3B chat model) and `bge-m3` (a 567M embedding model):
```bash
sudo docker exec ollama ollama pull llama3.2
pulling dde5aa3fc5ff... 100% ▕████████████████▏ 2.0 GB
success
```
```bash
sudo docker exec ollama ollama pull bge-m3
pulling daec91ffb5dd... 100% ▕████████████████▏ 1.2 GB
success
```
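To double-check which models are now available locally, and the exact names to enter in RAGFlow later, list them:

```bash
# Prints one line per local model: NAME, ID, SIZE, MODIFIED
sudo docker exec ollama ollama list
```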
### 2. Ensure Ollama is accessible
If RAGFlow runs in Docker and Ollama runs on the same host machine, check whether Ollama is accessible from inside the RAGFlow container:
```bash
sudo docker exec -it ragflow-server bash
root@8136b8c3e914:/ragflow# curl http://host.docker.internal:11434/
Ollama is running
```
If RAGFlow runs from source code and Ollama runs on the same host machine, check whether Ollama is accessible from the RAGFlow host machine:
```bash
curl http://localhost:11434/
Ollama is running
```
If RAGFlow and Ollama run on different machines, check whether Ollama is accessible from the RAGFlow host machine:
```bash
curl http://${IP_OF_OLLAMA_MACHINE}:11434/
Ollama is running
```
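Beyond the root endpoint, you can optionally confirm that a pulled model answers requests through Ollama's generate API. The sketch below assumes the `llama3.2` model from step 1 and a base URL of `http://localhost:11434`; substitute the URL that applied in your check above:

```bash
# Ask llama3.2 for a short, non-streamed completion; the JSON reply contains a "response" field
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.2", "prompt": "Say hello in one word.", "stream": false}'
```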
### 4. Add Ollama
In the popup window, complete basic settings for Ollama:

1. Ensure that the model name and type match the models pulled in step 1, for example (`llama3.2`, `chat`) or (`bge-m3`, `embedding`); the exact names can be confirmed as shown below.
2. Ensure that the base URL matches the one determined in step 2.
3. OPTIONAL: Switch on the toggle under **Does it support Vision?** if your model includes an image-to-text model.
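If you are unsure of the exact model names to enter, Ollama's tags endpoint lists every model available to the instance; again, substitute the base URL determined in step 2:

```bash
# Returns JSON such as {"models":[{"name":"llama3.2:latest", ...},{"name":"bge-m3:latest", ...}]}
curl http://localhost:11434/api/tags
```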
:::caution WARNING
Improper base URL settings will trigger the following error:
Click on your logo **>** **Model Providers** **>** **System Model Settings** to update your model:

*You should now be able to find **llama3.2** from the dropdown list under **Chat model**, and **bge-m3** from the dropdown list under **Embedding model**.*

> If your local model is an embedding model, you should find your local model under **Embedding model**.