Add instructions for deploying a local LLM with Jina to deploy_local_llm.mdx (#1872)

### What problem does this PR solve?

Add instructions for deploying a local LLM with Jina to deploy_local_llm.mdx.

### Type of change

- [x] Documentation Update

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
黄腾 2024-08-09 15:24:09 +08:00 committed by GitHub
parent 8779aa1986
commit 44184d12a8
GPG Key ID: B5690EEEBB952194

@@ -15,6 +15,40 @@ RAGFlow seamlessly integrates with Ollama and Xinference, without the need for f
This user guide does not intend to cover much of the installation or configuration details of Ollama or Xinference; its focus is on configurations inside RAGFlow. For the most current information, you may need to check out the official site of Ollama or Xinference.
:::
## Deploy a local model using Jina
[Jina](https://github.com/jina-ai/jina) lets you build AI services and pipelines that communicate via gRPC, HTTP and WebSockets, then scale them up and deploy to production.
To deploy a local model, e.g., **gpt2**, using Jina:
### 1. Check firewall settings
Ensure that your host machine's firewall allows inbound connections on port 12345.
```bash
sudo ufw allow 12345/tcp
```
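Once the server is running, you may want to confirm that port 12345 actually accepts connections. The following is a minimal sketch using only Python's standard library; the helper name `port_is_open` and the address `127.0.0.1` are assumptions for illustration, not part of RAGFlow or Jina.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 127.0.0.1 is an assumption; replace it with the address of the machine
    # that runs the Jina server when checking from another host.
    print(port_is_open("127.0.0.1", 12345))
```

A result of `False` from a remote machine while the check succeeds locally would suggest that the firewall is still blocking the port.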
### 2. Install the jina package
```bash
pip install jina
```
### 3. Deploy the local model
Step 1: Navigate to the rag/svr directory.
```bash
cd rag/svr
```
Step 2: Run the jina_server.py script with Python, passing in the model name or the local path of the model. The script only supports loading models downloaded from Hugging Face.
```bash
python jina_server.py --model_name gpt2
```
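Because the `--model_name` value can be either a model name or a local path, it can help to check which of the two a given value will be treated as before launching. The sketch below is not code from jina_server.py; the helper `model_source` is hypothetical and assumes that a value naming an existing directory is a local model and that anything else is a Hugging Face model name.

```python
from pathlib import Path


def model_source(model_name: str) -> str:
    """Classify a --model_name value as a local directory or a hub model name.

    Hypothetical helper: returns "local" if the value names an existing
    directory, otherwise "hub".
    """
    if Path(model_name).expanduser().is_dir():
        return "local"
    return "hub"


if __name__ == "__main__":
    # Prints "hub" unless a directory named gpt2 exists in the current directory.
    print(model_source("gpt2"))
```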
## Deploy a local model using Ollama
[Ollama](https://github.com/ollama/ollama) enables you to run open-source large language models that you deployed locally. It bundles model weights, configurations, and data into a single package, defined by a Modelfile, and optimizes setup and configurations, including GPU usage.