### What problem does this PR solve?
1. fix: mind map display error in knowledge graph, caused by an
```@antv/g6``` version change
2. feat: concurrent threads configuration support in graph extractor
3. fix: used tokens update failed for tenant
4. feat: timeout configuration support for llm
5. fix: regex error in graph extractor
6. feat: qwen rerank(```gte-rerank```) support
7. fix: timeout handling in the knowledge graph indexing process. Chat now
uses streaming output, and this is configurable.
8. feat: ```qwen-long``` model configuration
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: chongchuanbing <chongchuanbing@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Resolves #2905
Because token sizes are inconsistent, I hard-coded a safe limit of 500
tokens, since there is no config param to control it.
My llama.cpp server runs with ```-ub``` set to 1024:
${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@"
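The 500-token cap described above could look roughly like the sketch below. It is illustrative only: `truncate_to_token_limit`, `build_rerank_payload`, and the whitespace tokenizer are assumptions, not the code from this PR. A real implementation would count tokens with the tokenizer that matches the reranker model. The `query` and `documents` fields are assumed to follow the usual rerank request shape.

```python
from typing import Callable, Dict, List

# The limit mentioned above; hard-coded because no config parameter exists.
MAX_RERANK_TOKENS = 500


def truncate_to_token_limit(
    text: str,
    max_tokens: int = MAX_RERANK_TOKENS,
    tokenize: Callable[[str], List[str]] = str.split,
) -> str:
    """Return `text` cut down to at most `max_tokens` tokens.

    `tokenize` defaults to a whitespace split, which is only a stand-in
    for the model's real tokenizer (an assumption of this sketch).
    """
    tokens = tokenize(text)
    if len(tokens) <= max_tokens:
        return text
    return " ".join(tokens[:max_tokens])


def build_rerank_payload(query: str, documents: List[str]) -> Dict[str, object]:
    """Build a rerank request body with every document truncated first."""
    return {
        "query": query,
        "documents": [truncate_to_token_limit(d) for d in documents],
    }
```

Whitespace tokens usually undercount subword tokens, so keeping the cap well below the ```-ub``` value of 1024 leaves headroom for the query and for tokenizer differences.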
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Here is the llama.cpp server log from my test with RAGFlow:
```
slot update_slots: id 0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot release: id 0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id 0 | task 459 | processing task
slot update_slots: id 0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id 0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id 0 | task 459 | kv cache rm [0, end)
slot update_slots: id 0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id 0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot release: id 0 | task 459 | stop processing: n_past = 111, truncated = 0
srv update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200
```
### What problem does this PR solve?
Fix the keys of Xinference-deployed models, especially those that share a
model name with publicly hosted models.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: 0000sir <0000sir@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: rerank_model and pdf_parser bugs | Update: session API
#2575 #2559
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
---------
Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
### What problem does this PR solve?
#2295
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
#1853 #2138 Add support for Voyage AI
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
add support for Baidu yiyan
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
#1853 add support for SILICONFLOW
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
#1853 add support for TogetherAI
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
#1771 Add support for OpenAI-API-Compatible
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
Fix errors raised when the API reference is empty:
```
for chunk_i in answer['reference'].get('chunks',[]):
^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'get'
```
```
return np.array([d["relevance_score"] for d in res["results"]]), res["meta"]["tokens"]["input_tokens"]+res["meta"]["tokens"]["output_tokens"]
~~~^^^^^^^^^^^
KeyError: 'results'
```
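Both tracebacks come from assuming a response shape that is not always present. A defensive version might look like the following sketch. The helper names `get_reference_chunks` and `extract_scores` are hypothetical and are not taken from this PR.

```python
from typing import Any, Dict, List


def get_reference_chunks(answer: Dict[str, Any]) -> List[Any]:
    """Return the chunks in answer['reference'], or [] when it is empty.

    Covers the first traceback, where 'reference' is a list (for example [])
    rather than a dict, so calling .get() on it fails.
    """
    reference = answer.get("reference")
    if isinstance(reference, dict):
        return reference.get("chunks", [])
    return []


def extract_scores(res: Dict[str, Any]) -> List[float]:
    """Return the relevance scores, with a readable error if 'results' is missing.

    Covers the second traceback, where the rerank response has no
    'results' key (for example because the server returned an error body).
    """
    if "results" not in res:
        raise ValueError(f"rerank response has no 'results' field: {res!r}")
    return [item["relevance_score"] for item in res["results"]]
```

Raising a `ValueError` that includes the response body is one design choice; returning an empty list would hide a failed rerank call, so this sketch prefers the explicit error.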
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
#1602
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
Add support for NVIDIA LLMs
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
#762
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
### What problem does this PR solve?
Fix the token-count error that occurred when adding an Xinference model
#1522
root@pc-gpu-86-41:~# curl -X 'POST' 'http://127.0.0.1:9997/v1/rerank' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
"model": "bge-reranker-v2-m3",
"query": "A man is eating pasta.",
"return_documents":"true",
"return_len":"true",
"documents": [
"A man is eating food.",
"A man is eating a piece of bread.",
"The girl is carrying a baby.",
"A man is riding a horse.",
"A woman is playing violin."
]
}'
{"id":"610a8724-3e96-11ef-81ce-08bfb886c012","results":[{"index":0,"relevance_score":0.999574601650238,"document":{"text":"A man is eating food."}},{"index":1,"relevance_score":0.07814773917198181,"document":{"text":"A man is eating a piece of bread."}},{"index":3,"relevance_score":0.000017700713215162978,"document":{"text":"A man is riding a horse."}},{"index":2,"relevance_score":0.0000163753629749408,"document":{"text":"The girl is carrying a baby."}},{"index":4,"relevance_score":0.00001631895975151565,"document":{"text":"A woman is playing violin."}}],"meta":{"api_version":null,"billed_units":null,"tokens":{"input_tokens":38,"output_tokens":38},"warnings":null}}
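The response above returns results sorted by score, with `index` pointing back to the input document, and it reports token usage under `meta.tokens` while other `meta` fields are null. A minimal sketch of reading such a response follows. The function name `scores_in_input_order` is hypothetical, and treating missing or null token fields as zero is an assumption of this sketch, not necessarily what the PR does.

```python
import json
from typing import Any, Dict, List, Tuple


def scores_in_input_order(res: Dict[str, Any], n_docs: int) -> Tuple[List[float], int]:
    """Return (scores aligned with the input documents, total token count)."""
    scores = [0.0] * n_docs
    for item in res.get("results", []):
        # 'index' is the position of the document in the request.
        scores[item["index"]] = item["relevance_score"]
    # 'meta' or 'tokens' may be absent or null; fall back to empty dicts.
    tokens = (res.get("meta") or {}).get("tokens") or {}
    total_tokens = (tokens.get("input_tokens") or 0) + (tokens.get("output_tokens") or 0)
    return scores, total_tokens


# Abridged copy of the response shown above (document texts omitted).
sample = json.loads("""
{"results": [
  {"index": 0, "relevance_score": 0.999574601650238},
  {"index": 1, "relevance_score": 0.07814773917198181},
  {"index": 3, "relevance_score": 0.000017700713215162978},
  {"index": 2, "relevance_score": 0.0000163753629749408},
  {"index": 4, "relevance_score": 0.00001631895975151565}
 ],
 "meta": {"api_version": null, "billed_units": null,
          "tokens": {"input_tokens": 38, "output_tokens": 38},
          "warnings": null}}
""")

scores, total_tokens = scores_in_input_order(sample, n_docs=5)
```

With the sample above, `total_tokens` is 76 and `scores[2]` holds the score for "The girl is carrying a baby.", the third input document.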
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
support xinference rerank model
#1455
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix an out-of-memory (OOM) error that RAGFlow may encounter when there are
many conversations.
#1288
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: zhuhao <zhuhao@linklogis.com>
### What problem does this PR solve?
issue #991
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: KevinHuSh <kevinhu.sh@gmail.com>
### What problem does this PR solve?
#724 #162
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
feat: add rerank models to the project #724 #162
### Type of change
- [x] New Feature (non-breaking change which adds functionality)