ragflow/docs/xinference.md

Xinference

Xorbits Inference (Xinference) is an open-source library for deploying and serving AI models, including large language models, on your own machine or cluster.

Install

Once Xinference is installed, start a local instance with the following command:

$ xinference-local --host 0.0.0.0 --port 9997

Launch Xinference

Decide which LLM you want to deploy from Xinference's list of supported LLMs, for example, mistral. Then run the following command to launch the model, replacing ${quantization} with a quantization method that Xinference supports for that model and format:

$ xinference launch -u mistral --model-name mistral-v0.1 --size-in-billions 7 --model-format pytorch --quantization ${quantization}
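The ${quantization} placeholder must be filled in before the command will run. As a minimal sketch, the Python helper below assembles the same command with a concrete value and prints it for review without executing anything. The function name `build_launch_command` is hypothetical, and the value "none" is only an assumed example; check which quantization options Xinference lists for your model before using it.

```python
from __future__ import annotations

import shlex


def build_launch_command(quantization: str) -> list[str]:
    """Return the launch command from this guide with ${quantization} filled in."""
    return [
        "xinference", "launch",
        "-u", "mistral",
        "--model-name", "mistral-v0.1",
        "--size-in-billions", "7",
        "--model-format", "pytorch",
        "--quantization", quantization,
    ]


if __name__ == "__main__":
    # "none" is an assumed example value, not a recommendation.
    # Print the command so it can be reviewed and pasted into a shell.
    print(shlex.join(build_launch_command("none")))
```

Printing the command rather than running it keeps the sketch safe to try on a machine where Xinference is not yet installed.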

Use Xinference in RAGFlow

  • Go to 'Settings > Model Providers > Models to be added > Xinference'.

Base URL: Enter the base URL at which the Xinference service is accessible, for example, http://<your-xinference-endpoint-domain>:9997/v1.

  • Use Xinference Models.
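Before entering the Base URL in RAGFlow, it can help to confirm that the endpoint actually responds. The sketch below builds the URL in the form shown above and asks the server for its models. It assumes that Xinference answers `GET <base URL>/models` with an OpenAI-style JSON body containing a `data` list of objects with an `id` field; the helper names `build_base_url` and `list_model_ids` are hypothetical, and `localhost` stands in for your own endpoint.

```python
from __future__ import annotations

import json
import urllib.request


def build_base_url(host: str, port: int = 9997) -> str:
    """Build the base URL in the form RAGFlow expects, ending in /v1."""
    return f"http://{host}:{port}/v1"


def list_model_ids(base_url: str, timeout: float = 5.0) -> list[str]:
    """Fetch model IDs from <base_url>/models.

    Assumes an OpenAI-style response: {"data": [{"id": "..."}, ...]}.
    """
    with urllib.request.urlopen(f"{base_url}/models", timeout=timeout) as resp:
        payload = json.load(resp)
    return [model["id"] for model in payload.get("data", [])]


if __name__ == "__main__":
    base_url = build_base_url("localhost")  # replace with your endpoint domain
    try:
        print(base_url, "->", list_model_ids(base_url))
    except (OSError, ValueError) as exc:
        # OSError covers connection failures; ValueError covers non-JSON replies.
        print(f"Could not query {base_url}: {exc}")
```

If the call fails, check that the Xinference instance is running, that the port matches the one passed to xinference-local, and that the host is reachable from the machine running RAGFlow.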