### What problem does this PR solve?
Add GPUStack as a new model provider.
[GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU
cluster manager for running LLMs. Currently, locally deployed models in
GPUStack cannot integrate well with RAGFlow. GPUStack provides both
OpenAI compatible APIs (Models / Chat Completions / Embeddings /
Speech2Text / TTS) and other APIs like Rerank. We would like to use
GPUStack as a model provider in ragflow.
[GPUStack Docs](https://docs.gpustack.ai/latest/quickstart/)
Related issue: https://github.com/infiniflow/ragflow/issues/4064.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### Testing Instructions
1. Install GPUStack and deploy the `llama-3.2-1b-instruct` llm, `bge-m3`
text embedding model, `bge-reranker-v2-m3` rerank model,
`faster-whisper-medium` Speech-to-Text model, `cosyvoice-300m-sft` in
GPUStack.
2. Add provider in ragflow settings.
3. Testing in ragflow.
### What problem does this PR solve?
add new models in jinna connector, to allow use models that support
multilingual models
### Type of change
- [X] Other (please describe): new connectors no breaking change
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
When calling the Qwen rerank model, if the model does not return
correctly, an exception should be raised to notify the user, rather than
simply returning a value of 0, as this would be confusing to the user.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Hi there!
LocalAI added support of rerank models
https://localai.io/features/reranker/
I've implemented LocalAIRerank class (typically copied it from
OpenAI_APIRerank class).
Also, LocalAI model response with 500 error code if len of "documents"
is less than 2 in similarity check.
So I've added the second "document" on RERANK model connection check in
`api/apps/llm_app.py`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
1. Module init won't connect database any more.
2. Config in settings need to be used with settings.CONFIG_NAME
### Type of change
- [x] Refactoring
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
Use consistent log file names, introduced initLogger
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
When model’s group name contains 0-9,we can't find downloaded
model,because we do not correctly exstract model dir's name from model‘s
full name
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: 王志鹏 <zhipeng3.wang@midea.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
1. fix: mid map show error in knowledge graph, juse because
```@antv/g6```version changed
2. feat: concurrent threads configuration support in graph extractor
3. fix: used tokens update failed for tenant
4. feat: timeout configuration support for llm
5. fix: regex error in graph extractor
6. feat: qwen rerank(```gte-rerank```) support
7. fix: timeout deal in knowledge graph index process. Now chat by
stream output, also, it is configuratable.
8. feat: ```qwen-long``` model configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: chongchuanbing <chongchuanbing@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Resolve#2905
due to the in-consistent of token size, I make it safe to limit 500 in
code, since there is no config param to control
my llama.cpp run set -ub to 1024:
${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl
99 -m $gguf_file --reranking "$@"
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Here is my test Ragflow use llama.cpp
```
lot update_slots: id 0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot release: id 0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id 0 | task 459 | processing task
slot update_slots: id 0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id 0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id 0 | task 459 | kv cache rm [0, end)
slot update_slots: id 0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id 0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot release: id 0 | task 459 | stop processing: n_past = 111, truncated = 0
srv update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200
```
### What problem does this PR solve?
Fix keys of Xinference deployed models, especially has the same model
name with public hosted models.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: 0000sir <0000sir@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: renrank_model and pdf_parser bugs | Update: session API
#2575#2559
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
### What problem does this PR solve?
#2295
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
#1853#2138 add support for Voyage AI
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
add support for Baidu yiyan
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
#1853 add support for SILICONFLOW
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
#1853 add support for TogetherAI
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
#1771 add supprot for OpenAI-API-Compatible
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
fix api reference empty bug
```
for chunk_i in answer['reference'].get('chunks',[]):
^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'get'
```
```
return np.array([d["relevance_score"] for d in res["results"]]), res["meta"]["tokens"]["input_tokens"]+res["meta"]["tokens"]["output_tokens"]
~~~^^^^^^^^^^^
KeyError: 'results'
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
#1602
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
add support for NVIDIA llm
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
#762
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
fix the tokens error that occurred when adding the xinference model
#1522
root@pc-gpu-86-41:~# curl -X 'POST' 'http://127.0.0.1:9997/v1/rerank' -H
'accept: application/json' -H 'Content-Type: application/json' -d '{
"model": "bge-reranker-v2-m3",
"query": "A man is eating pasta.",
"return_documents":"true",
"return_len":"true",
"documents": [
"A man is eating food.",
"A man is eating a piece of bread.",
"The girl is carrying a baby.",
"A man is riding a horse.",
"A woman is playing violin."
]
}'
{"id":"610a8724-3e96-11ef-81ce-08bfb886c012","results":[{"index":0,"relevance_score":0.999574601650238,"document":{"text":"A
man is eating
food."}},{"index":1,"relevance_score":0.07814773917198181,"document":{"text":"A
man is eating a piece of
bread."}},{"index":3,"relevance_score":0.000017700713215162978,"document":{"text":"A
man is riding a
horse."}},{"index":2,"relevance_score":0.0000163753629749408,"document":{"text":"The
girl is carrying a
baby."}},{"index":4,"relevance_score":0.00001631895975151565,"document":{"text":"A
woman is playing
violin."}}],"meta":{"api_version":null,"billed_units":null,"tokens":{"input_tokens":38,"output_tokens":38},"warnings":null}}
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
support xinference rerank model
#1455
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix ragflow may encounter an OOM (Out Of Memory) when there are a lot of
conversations.
#1288
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: zhuhao <zhuhao@linklogis.com>
### What problem does this PR solve?
issue #991
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: KevinHuSh <kevinhu.sh@gmail.com>
### What problem does this PR solve?
#724#162
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)