566 Commits

Author SHA1 Message Date
Kevin Hu
b8da2eeb69
Feat: support huggingface re-rank model. (#5684)
### What problem does this PR solve?

#5658

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-06 10:44:04 +08:00
yihong
4326873af6
refactor: no need to inherit in python3 clean the code (#5659)
### What problem does this PR solve?

As title

### Type of change


- [x] Refactoring

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-05 18:03:53 +08:00
Kevin Hu
e5041749a2
Fix: tavily search error. (#5653)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-05 17:03:05 +08:00
Zhichang Yu
f65c3ae62b
Refactored DocumentService.update_progress (#5642)
### What problem does this PR solve?

Refactored DocumentService.update_progress

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-05 14:48:03 +08:00
hy89
b0c21b00d9
Refactor: Optimize error handling and support parsing of XLS(EXCEL97—2003) files. (#5633)
Optimize error handling and support parsing of XLS(EXCEL97—2003) files.
2025-03-05 11:55:27 +08:00
汪威
76e8285904
use to_df replace to_pl when get infinity Result (#5604)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Performance Improvement

---------

Co-authored-by: wangwei <dwxiayi@163.com>
2025-03-05 09:35:40 +08:00
Zhichang Yu
4d6484b03e
Fix nursery.start_soon. Close #5575 (#5591)
### What problem does this PR solve?

Fix nursery.start_soon. Close #5575

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-04 14:46:54 +08:00
Zhichang Yu
c813c1ff4c
Made task_executor async to speedup parsing (#5530)
### What problem does this PR solve?

Made task_executor async to speedup parsing

### Type of change

- [x] Performance Improvement
2025-03-03 18:59:49 +08:00
Kevin Hu
c190086707
Fix: bad case for tokenizer. (#5543)
### What problem does this PR solve?

#5492

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-03 15:36:16 +08:00
yihong
8a2542157f
Fix: possible memory leaks close #5277 (#5500)
### What problem does this PR solve?

close #5277 by make sure the file close

### Type of change

- [x] Performance Improvement

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-03-03 10:26:45 +08:00
Kevin Hu
21943ce0e2
Refine error message while embedding model error, (#5490)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2025-02-28 17:52:38 +08:00
Kevin Hu
b418ce5643
Fix table parser issue. (#5482)
### What problem does this PR solve?

#1475
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-28 16:09:12 +08:00
The Wind
85924e898e
Fix: enhance aliyun oss access with adding prefix path (#5475)
### What problem does this PR solve?

Enhance aliyun oss access with adding prefix path.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-28 15:00:00 +08:00
yihong
622b72db4b
Fix: add ctrl+c signal for better exit (#5469)
### What problem does this PR solve?

This patch add signal for ctrl + c that can exit the code friendly
cause code base use thread daemon can not exit friendly for being
started.

how to reproduce
1. docker-compose -f docker/docker-compose-base.yml up
2. other window `bash docker/launch_backend_service.sh`
3. stop 1 first
4. try to stop 2 then two thread can not exit which must use `kill pid`

This patch fix it 
and should fix most the related issues in the `issues`

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-02-28 14:52:40 +08:00
hy89
651422127c
Feat: Accessing Alibaba Cloud OSS with Amazon S3 SDK (#5438)
Accessing Alibaba Cloud OSS with Amazon S3 SDK
2025-02-27 17:02:42 +08:00
Kevin Hu
4c9a3e918f
Fix: add image2text issue. (#5431)
### What problem does this PR solve?

#5356

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-27 14:06:49 +08:00
Kevin Hu
fa76974e24
Fix issue of ask API. (#5400)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-26 19:45:22 +08:00
Yongteng Lei
0284248c93
Fix: correct wrong vLLM rerank model (#5399)
### What problem does this PR solve?

Correct wrong vLLM rerank model #4316 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-26 18:59:36 +08:00
Yongteng Lei
cdcaae17c6
Feat: add VLLM (#5380)
### What problem does this PR solve?

Read to add VLMM.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-02-26 16:04:53 +08:00
Kevin Hu
96e9d50060
Let parallism of RAPTOR controlable. (#5379)
### What problem does this PR solve?

#4874
### Type of change

- [x] Refactoring
2025-02-26 15:58:06 +08:00
Kevin Hu
4f40f685d9
Code refactor (#5371)
### What problem does this PR solve?

#5173

### Type of change

- [x] Refactoring
2025-02-26 15:40:52 +08:00
Zhichang Yu
ffb4cda475
Run keyword_extraction, question_proposal, content_tagging in thread pool (#5376)
### What problem does this PR solve?

Run keyword_extraction, question_proposal, content_tagging in threads

### Type of change

- [x] Performance Improvement
2025-02-26 15:21:14 +08:00
Kevin Hu
4e2afcd3b8
Fix FlagRerank max_length issue. (#5366)
### What problem does this PR solve?

#5352

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-26 11:01:13 +08:00
Kevin Hu
53b9e7b52f
Add tavily as web searh tool. (#5349)
### What problem does this PR solve?

#5198

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-02-26 10:21:04 +08:00
Kevin Hu
955801db2e
Resolve super class invokation error. (#5337)
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-25 17:42:29 +08:00
Kevin Hu
daddfc9e1b
Remove dup gb2312, solve currupt error. (#5326)
### What problem does this PR solve?

#5252 
#5325

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-25 12:22:37 +08:00
Zhichang Yu
db42d0e0ae
Optimize ocr (#5297)
### What problem does this PR solve?

Introduced OCR.recognize_batch

### Type of change

- [x] Performance Improvement
2025-02-24 16:21:55 +08:00
Kevin Hu
df3d0f61bd
Fix base url missing for deepseek from Tongyi. (#5294)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-24 15:43:32 +08:00
Kevin Hu
ec96426c00
Tongyi adapts deepseek. (#5285)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-02-24 14:04:25 +08:00
Kevin Hu
9aa222f738
Let list_chat go without kb checking. (#5280)
### What problem does this PR solve?

#5278 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-24 13:21:05 +08:00
liwenju0
569e40544d
Refactor rerank model with dynamic batch processing and memory manage… (#5273)
…ment

### What problem does this PR solve?
Issue:https://github.com/infiniflow/ragflow/issues/5262
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-02-24 11:32:08 +08:00
Omar Leonardo Sanchez Granados
4f2816c01c
Add support to boto3 default connection (#5246)
### What problem does this PR solve?
 
This pull request includes changes to the initialization logic of the
`ChatModel` and `EmbeddingModel` classes to enhance the handling of AWS
credentials.

Use cases:
- Use env variables for credentials instead of managing them on the DB 
- Easy connection when deploying on an AWS machine

### Type of change

- [X] New Feature (non-breaking change which adds functionality)
2025-02-24 11:01:14 +08:00
yrk111222
7ce675030b
Support downloading models from ModelScope Community. (#5073)
This PR supports downloading models from ModelScope. The main
modifications are as follows:
-New Feature (non-breaking change which adds functionality)
-Documentation Update

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-02-24 10:12:20 +08:00
Zhichang Yu
d78010c376
Fixed similarity on infinity (#5236)
### What problem does this PR solve?

Fixed similarity on infinity

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-21 18:50:54 +08:00
Kevin Hu
3444cb15e3
Refine search query. (#5235)
### What problem does this PR solve?

#5173
#5214

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-21 18:32:32 +08:00
Kevin Hu
cdb3e6434a
Fix empty question issue. (#5225)
### What problem does this PR solve?

#5241

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-21 15:47:39 +08:00
Kevin Hu
1a755e75c5
Remove v1 (#5220)
### What problem does this PR solve?

#5201

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-21 15:15:38 +08:00
Kevin Hu
7b3d700d5f
Apply agentic searching. (#5196)
### What problem does this PR solve?

#5173

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-02-20 17:41:01 +08:00
Kevin Hu
e6c024f8bf
Fix too many clause while searching. (#5119)
### What problem does this PR solve?

#5100

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-19 13:18:39 +08:00
Kevin Hu
c28bc41a96
Fix docx table issue. (#5117)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-19 12:40:06 +08:00
ubbg
29a59ed7e2
Fix: Use self.dataStore.indexExist in all_tags method of Dealer (#5108)
### What problem does this PR solve?

This PR fixes an AttributeError in the all_tags method of the Dealer
class. Previously, the method incorrectly called
self.docStoreConn.indexExist instead of self.dataStore.indexExist. Since
self.docStoreConn was never set (and self.dataStore is already
initialized in init), this resulted in an error when attempting to check
if the index exists. This change ensures that the proper connector is
used for the index existence check, thereby resolving the issue._

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-19 11:50:57 +08:00
Kevin Hu
9ff825f39d
Ignore exceptions when no index ahead. (#5047)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2025-02-18 09:09:22 +08:00
Kevin Hu
3aa5c2a699
Ignore exception of empty index. (#5030)
### What problem does this PR solve?

### Type of change


- [x] Refactoring
2025-02-17 15:59:55 +08:00
saikidev
d2929e432e
Feat: add LLM provider PPIO (#5013)
### What problem does this PR solve?

Add a LLM provider: PPIO

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
2025-02-17 12:03:26 +08:00
Kevin Hu
29ceeba95f
Fix hit cache error while raptoring. (#4955)
### What problem does this PR solve?

#4126

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-14 12:00:19 +08:00
Kevin Hu
b08bb56f6c
Display thinking for deepseek r1 (#4904)
### What problem does this PR solve?
#4903
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-02-12 15:43:13 +08:00
Mathias Panzenböck
9bcccadebd
Remove use of eval() from search.py (#4887)
Use `json.loads()` instead.

### What problem does this PR solve?

Using `eval()` can lead to code injections. I think this loads a JSON
field, right? If yes, why is this done via `eval()` and not
`json.loads()`?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-12 13:15:38 +08:00
Kevin Hu
0d3ed37b48
Make the update script shorter. (#4854)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-10 18:18:49 +08:00
Kevin Hu
f374dd38b6
Fix divided by zero issue. (#4784)
### What problem does this PR solve?

#4779

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-08 10:36:26 +08:00
Kevin Hu
2aa0cdde8f
Fix Gemini chat issue. (#4757)
### What problem does this PR solve?

#4753

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-07 12:00:19 +08:00