### What problem does this PR solve?
Move the tiktoken ```cl100k_base``` encoding into the Docker image.
issue: #3338
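A minimal sketch of pre-caching the encoding at image build time, assuming tiktoken's standard ```TIKTOKEN_CACHE_DIR``` mechanism; the cache path is illustrative:
```python
# Build-time step (e.g. invoked via RUN in the Dockerfile): download
# cl100k_base once and cache it inside the image so runtime containers
# need no network access.
import os

os.environ["TIKTOKEN_CACHE_DIR"] = "/ragflow/tiktoken_cache"  # illustrative path

import tiktoken

tiktoken.get_encoding("cl100k_base")  # fetches and caches the BPE file
```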
### Type of change
- [x] Refactoring
Signed-off-by: jinhai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Use consistent log file names; introduce ```initLogger```.
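A minimal sketch of what such a shared initializer might look like; apart from the name ```initLogger```, everything here is an assumption rather than the actual implementation:
```python
import logging
import os

def initLogger(name: str, log_dir: str = "logs") -> logging.Logger:
    """Hypothetical shared initializer: each component derives its log
    file from its own `name`, keeping file names consistent."""
    os.makedirs(log_dir, exist_ok=True)
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(os.path.join(log_dir, f"{name}.log"))
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    return logger
```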
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Added documentation for switching the document engine from Elasticsearch to Infinity.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
### What problem does this PR solve?
This pull request fixes the issue described in
https://github.com/infiniflow/ragflow/issues/3263.
1. The response should be parsed as a dict; otherwise the following line fails to read values (see the sketch after this list):
```ans = response["content"][0]["text"]```
2. The API model ```claude-instant-1.2``` has been retired (see
[model-deprecations](https://docs.anthropic.com/en/docs/resources/model-deprecations)),
which triggers errors in the code, so I removed it from the
conf/llm_factories.json file and added the latest model
```claude-3-5-sonnet-20241022```
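A minimal sketch of the failure mode and the fix, assuming the Anthropic Python SDK; the ```model_dump()``` conversion step is illustrative and may differ from the actual change:
```python
import anthropic

client = anthropic.Anthropic()

# The SDK returns a Message (pydantic) object, not a dict, so dict-style
# subscripting fails until it is converted.
response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "hello"}],
)
response = response.model_dump()      # convert Message -> plain dict
ans = response["content"][0]["text"]  # now safe to subscript
```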
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: chenhaodong <chenhaodong@ctrlvideo.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
When a model's group name contains digits (0-9), the downloaded model
cannot be found, because the model directory name is not correctly
extracted from the model's full name.
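An illustrative reproduction with a hypothetical model name; the actual extraction code in the repository may differ:
```python
import re

# A letters-only pattern drops the digits from the group name, so the
# directory lookup misses the downloaded model.
full_name = "maidalun1020/bce-embedding-base_v1"     # hypothetical example
broken = re.match(r"[a-zA-Z]+", full_name).group(0)  # -> "maidalun"
fixed = full_name.split("/")[0]                      # -> "maidalun1020"
assert fixed == "maidalun1020"
```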
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: 王志鹏 <zhipeng3.wang@midea.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Integration with Infinity (a rough interface sketch follows the list):
- Replaced ```ELASTICSEARCH``` with ```dataStoreConn```
- Renamed ```deleteByQuery``` to ```delete```
- Renamed ```bulk``` to ```upsertBulk```
- Added ```getHighlight``` and ```getAggregation```
- Fixed ```KGSearch.search```
- Moved ```Dealer.sql_retrieval``` to ```es_conn.py```
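A minimal sketch of the shape such a storage abstraction might take; the method names follow the renames above, while the class name and signatures are assumptions:
```python
from abc import ABC, abstractmethod

class DataStoreConn(ABC):
    """Hypothetical interface implemented by both the Elasticsearch and
    Infinity backends; signatures are illustrative."""

    @abstractmethod
    def delete(self, condition: dict, index_name: str) -> int:
        """Remove documents matching `condition` (was deleteByQuery)."""

    @abstractmethod
    def upsertBulk(self, rows: list[dict], index_name: str) -> list[str]:
        """Insert-or-update a batch of documents (was bulk)."""

    @abstractmethod
    def getHighlight(self, res, keywords: list[str], field: str) -> dict:
        """Extract highlighted fragments from a search result."""

    @abstractmethod
    def getAggregation(self, res, field: str) -> list:
        """Extract aggregation buckets from a search result."""
```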
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Replaced Redis with Valkey. Close #3070
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Details are described in
https://github.com/infiniflow/ragflow/issues/2968
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
#2970
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
[Bug]: When using the OpenAI chat model, ERROR: 'CompletionUsage'
object has no attribute 'get' is raised #2948
### What problem does this PR solve?
The details of this PR are described in
https://github.com/infiniflow/ragflow/issues/2948
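A minimal sketch of the failure and a defensive fix, assuming ```openai>=1.x```, where ```usage``` is a ```CompletionUsage``` object rather than a dict:
```python
# openai>=1.x returns usage as a CompletionUsage pydantic object, so
# dict-style access raises: 'CompletionUsage' object has no attribute 'get'.
def total_tokens(response) -> int:
    usage = response.usage
    # usage.get("total_tokens")  # <- the failing dict-style call
    return getattr(usage, "total_tokens", 0)  # attribute access works
```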
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
1. fix: mind map rendering error in knowledge graph, caused by an
```@antv/g6``` version change
2. feat: concurrent-threads configuration support in the graph extractor
(see the sketch after this list)
3. fix: used-tokens update failed for tenant
4. feat: timeout configuration support for the LLM
5. fix: regex error in the graph extractor
6. feat: Qwen rerank (```gte-rerank```) support
7. fix: timeout handling in the knowledge graph indexing process. Chat now
uses stream output, and this is configurable.
8. feat: ```qwen-long``` model configuration
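A minimal sketch of item 2, running extraction over chunks with a configurable worker count; all names here are assumptions, not the actual code:
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def extract_concurrently(chunks, extract_fn, max_workers: int = 8):
    """Run the (hypothetical) per-chunk extractor across a thread pool;
    `max_workers` is the new configuration knob."""
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(extract_fn, chunk) for chunk in chunks]
        for future in as_completed(futures):
            results.append(future.result())
    return results
```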
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: chongchuanbing <chongchuanbing@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Resolves #2905
Due to inconsistent token sizes, I cap the input at a safe 500 tokens in
code, since there is no config param to control it.
My llama.cpp run sets ```-ub``` to 1024:
```
${llama_path}/bin/llama-server --host 0.0.0.0 --port 9901 -ub 1024 -ngl 99 -m $gguf_file --reranking "$@"
```
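A minimal sketch of the 500-token cap described above; the encoder and the call site are assumptions:
```python
# Truncate each passage before sending it to the rerank endpoint, keeping
# requests within llama.cpp's batch size (-ub). Hard-coded because there
# is no config param yet.
MAX_RERANK_TOKENS = 500

def truncate_for_rerank(text: str, encoder) -> str:
    tokens = encoder.encode(text)
    return encoder.decode(tokens[:MAX_RERANK_TOKENS])
```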
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Here is my test of RAGFlow using llama.cpp:
```
slot update_slots: id 0 | task 458 | prompt done, n_past = 416, n_tokens = 416
slot release: id 0 | task 458 | stop processing: n_past = 416, truncated = 0
slot launch_slot_: id 0 | task 459 | processing task
slot update_slots: id 0 | task 459 | tokenizing prompt, len = 2
slot update_slots: id 0 | task 459 | prompt tokenized, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 111
slot update_slots: id 0 | task 459 | kv cache rm [0, end)
slot update_slots: id 0 | task 459 | prompt processing progress, n_past = 111, n_tokens = 111, progress = 1.000000
slot update_slots: id 0 | task 459 | prompt done, n_past = 111, n_tokens = 111
slot release: id 0 | task 459 | stop processing: n_past = 111, truncated = 0
srv update_slots: all slots are idle
request: POST /rerank 172.23.0.4 200
```
### What problem does this PR solve?
Fix keys of Xinference-deployed models, especially those sharing a model
name with publicly hosted models.
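An illustrative sketch of the idea: qualify locally deployed model keys so they cannot collide with a public model of the same name. The key format is an assumption, not necessarily the one this PR uses:
```python
def model_key(model_name: str, factory: str) -> str:
    """Hypothetical: namespace keys by provider so a Xinference-deployed
    "bge-m3" stays distinct from a publicly hosted "bge-m3"."""
    return f"{model_name}@{factory}"

assert model_key("bge-m3", "Xinference") != model_key("bge-m3", "BAAI")
```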
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: 0000sir <0000sir@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
#2701 #2712 #2749
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>