### What problem does this PR solve?
When using the online large model API knowledge base to extract
knowledge graphs, frequent Rate Limit Errors were triggered,
causing document parsing to fail. This commit fixes the issue by
optimizing API calls in the following way:
Added exponential backoff and jitter to the API call to reduce the
frequency of Rate Limit Errors.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix the error where the Ollama embeddings interface returns a “500
Internal Server Error” when using models such as xiaobu-embedding-v2 for
embedding.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add vision LLM PDF parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
vLLM provider with a reranking model does not work : as vLLM uses under
the hood the [CoHereRerank
provider](https://github.com/infiniflow/ragflow/blob/v0.17.0/rag/llm/__init__.py#L250)
with a `base_url`, if this URL [is not passed to the Cohere
client](https://github.com/infiniflow/ragflow/blob/v0.17.0/rag/llm/rerank_model.py#L379-L382)
any attempt will endup on the Cohere SaaS (sending your private api key
in the process) instead of your vLLM instance.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
…ment
### What problem does this PR solve?
Issue:https://github.com/infiniflow/ragflow/issues/5262
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
This pull request includes changes to the initialization logic of the
`ChatModel` and `EmbeddingModel` classes to enhance the handling of AWS
credentials.
Use cases:
- Use env variables for credentials instead of managing them on the DB
- Easy connection when deploying on an AWS machine
### Type of change
- [X] New Feature (non-breaking change which adds functionality)
This PR supports downloading models from ModelScope. The main
modifications are as follows:
-New Feature (non-breaking change which adds functionality)
-Documentation Update
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add a LLM provider: PPIO
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
ERROR: 'Stream' object has no attribute 'iter_lines' with reference to
Claude/Anthropic chat streams
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Kyle Olmstead <k.olmstead@offensive-security.com>
### What problem does this PR solve?
Add GPUStack as a new model provider.
[GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU
cluster manager for running LLMs. Currently, locally deployed models in
GPUStack cannot integrate well with RAGFlow. GPUStack provides both
OpenAI compatible APIs (Models / Chat Completions / Embeddings /
Speech2Text / TTS) and other APIs like Rerank. We would like to use
GPUStack as a model provider in ragflow.
[GPUStack Docs](https://docs.gpustack.ai/latest/quickstart/)
Related issue: https://github.com/infiniflow/ragflow/issues/4064.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Testing Instructions
1. Install GPUStack and deploy the `llama-3.2-1b-instruct` llm, `bge-m3`
text embedding model, `bge-reranker-v2-m3` rerank model,
`faster-whisper-medium` Speech-to-Text model, `cosyvoice-300m-sft` in
GPUStack.
2. Add provider in ragflow settings.
3. Testing in ragflow.
### What problem does this PR solve?
1. Change embedding model of knowledge base won't change the default
embedding model.
2. Retrieval test bug
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>