146 Commits

Author SHA1 Message Date
Kevin Hu
f374dd38b6
Fix divided by zero issue. (#4784)
### What problem does this PR solve?

#4779

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-08 10:36:26 +08:00
Kevin Hu
448fa1c4d4
Robust for abnormal response from LLMs. (#4747)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-06 17:34:53 +08:00
Kevin Hu
6f2c3a3c3c
Fix too long query exception. (#4729)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-02-06 10:11:52 +08:00
Kevin Hu
4011c8f68c
Fix potential error. (#4650)
### What problem does this PR solve?
#4622

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-01-26 12:38:32 +08:00
Kevin Hu
86892959a0
Rebuild graph when it's out of time. (#4607)
### What problem does this PR solve?

#4543

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-01-23 17:26:20 +08:00
Kevin Hu
dd0ebbea35
Light GraphRAG (#4585)
### What problem does this PR solve?

#4543

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-01-22 19:43:14 +08:00
Kevin Hu
c5da3cdd97
Tagging (#4426)
### What problem does this PR solve?

#4367

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-01-09 17:07:21 +08:00
Kevin Hu
d9a4e4cc3b
Fix page size error. (#4401)
### What problem does this PR solve?

#4400

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-01-07 19:06:31 +08:00
Kevin Hu
f948c0d9f1
Clean query. (#4259)
### What problem does this PR solve?

#4239

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-27 14:25:03 +08:00
Kevin Hu
7e063283ba
Removing invisible chars before tokenization. (#4233)
### What problem does this PR solve?

#4223

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-26 11:48:16 +08:00
Bo Liu
321e9f3719
fix: stop rerank by model when search result is empty (#4203)
### What problem does this PR solve?


Stop reranking by model when the search result is empty; otherwise the
rerank call may raise an error (qwen).
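
A minimal sketch of the guard described above. The function name `rerank_if_nonempty` and the `rerank_fn` callable are hypothetical stand-ins for illustration, not names taken from the codebase:

```python
from typing import Callable, Sequence


def rerank_if_nonempty(
    query: str,
    candidates: Sequence[str],
    rerank_fn: Callable[[str, Sequence[str]], list[float]],
) -> list[float]:
    """Return rerank scores, skipping the model call for empty results.

    `rerank_fn` is a hypothetical stand-in for a model-based reranker.
    """
    if not candidates:
        # Nothing to score: return early so the model is never called
        # with an empty batch.
        return []
    return rerank_fn(query, candidates)
```

With this shape, an empty search result short-circuits before any request reaches the rerank model.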

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: 刘博 <liubo@ynby.cn>
2024-12-24 14:33:46 +08:00
Kevin Hu
c373dba0bc
Fix raptor bug. (#4192)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-23 18:59:48 +08:00
Kevin Hu
31d67c850e
Fetch chunk by batches. (#4177)
### What problem does this PR solve?

#4173

### Type of change

- [x] Performance Improvement
2024-12-23 12:12:15 +08:00
Jin Hai
50c2b9d562
Refactor trie load and construct (#4083)
### What problem does this PR solve?

1. Fix initial build and load trie
2. Update comment

### Type of change

- [x] Refactoring

Signed-off-by: jinhai <haijin.chn@gmail.com>
2024-12-18 12:52:56 +08:00
Kevin Hu
000cd6d615
Fix position lost issue. (#4068)
### What problem does this PR solve?

#4040

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-17 16:31:58 +08:00
Luo Pan
68d46b2a1e
Fix bug in hierarchical_merge function (#4006)
### What problem does this PR solve?

Fix the hierarchical_merge function: the comparison changes from index
vs. actual value to actual value vs. actual value.
Related issue #4003 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: luopan <luopan@example.com>
2024-12-13 08:50:58 +08:00
Zhichang Yu
03f00c9e6f
Rename page_num_list, top_list, position_list (#3940)
### What problem does this PR solve?

Rename page_num_list, top_list, position_list to page_num_int, top_int,
position_int

### Type of change

- [x] Refactoring
2024-12-10 16:32:58 +08:00
Kevin Hu
927873bfa6
Fix syn error. (#3953)
### What problem does this PR solve?

Close #3696

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-10 10:54:54 +08:00
Zhichang Yu
7a6bf4326e
Fixed log not displaying (#3946)
### What problem does this PR solve?

Fixed log not displaying

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-10 09:36:59 +08:00
Zhichang Yu
0d68a6cd1b
Fix errors detected by Ruff (#3918)
### What problem does this PR solve?

Fix errors detected by Ruff

### Type of change

- [x] Refactoring
2024-12-08 14:21:12 +08:00
Kevin Hu
56f473b680
Feat: Add question parameter to edit chunk modal (#3875)
### What problem does this PR solve?

Close #3873

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-12-05 14:51:19 +08:00
Kevin Hu
1b817a5b4c
Refine synonym query. (#3855)
### What problem does this PR solve?

### Type of change

- [x] Performance Improvement
2024-12-04 17:20:12 +08:00
Jin Hai
6657ca7cde
Change default error message to English (#3838)
### What problem does this PR solve?

As title

### Type of change

- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2024-12-04 09:34:49 +08:00
Kevin Hu
74b28ef1b0
Add pagerank to KB. (#3809)
### What problem does this PR solve?

#3794

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-12-03 14:30:35 +08:00
Kevin Hu
0f08b0f053
Weight up title and keywords for chunks in terms of retrieval (#3750)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2024-11-29 16:39:55 +08:00
Zhichang Yu
43e367f2ea
Detect shape error of embedding (#3710)
### What problem does this PR solve?

Detect shape error of embedding. Close #2997

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-28 14:10:22 +08:00
Zhichang Yu
bc701d7b4c
Edit chunk shall update instead of insert it (#3709)
### What problem does this PR solve?

Editing a chunk shall update it instead of inserting it. Close #3679

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-28 13:00:38 +08:00
Kevin Hu
57208d8e53
Fix batch size issue. (#3675)
### What problem does this PR solve?

#3657

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-27 18:06:43 +08:00
liuhua
5c59651bda
Fix the bug causing garbled text (#3640)
### What problem does this PR solve?

Fix the bug causing garbled text #3613

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
2024-11-26 12:06:56 +08:00
Kevin Hu
9f3141804f
Fix chunk enable/disable issue (#3579)
### What problem does this PR solve?

#3576

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-22 12:25:42 +08:00
Zhichang Yu
cad341e794
Added kb_id filter to knn. Fix #3458 (#3513)
### What problem does this PR solve?

Added kb_id filter to knn. Fix #3458

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-20 20:53:30 +08:00
Kevin Hu
289034f36e
smooth term weight (#3510)
### What problem does this PR solve?

#3499

### Type of change

- [x] Performance Improvement
2024-11-20 20:52:51 +08:00
Kevin Hu
17a7ea42eb
fix synonym bug (#3506)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-20 20:52:36 +08:00
Kung Quang
568322aeaf
fix(rag): fix error in viewing document chunk and cannot start task_executor server (#3481)
### What problem does this PR solve?

1. Fix error in viewing document chunk

<img width="1677" alt="Pasted Graphic"
src="https://github.com/user-attachments/assets/acd84cde-f38c-4190-b135-5e5139ae2613">

Viewing document chunk details results in a
BeartypeCallHintParamViolation error.

Traceback (most recent call last):
  File "ragflow/.venv/lib/python3.12/site-packages/flask/app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
         ^^^^^^^^^^^^^^^^^^^^^^^
  File "ragflow/.venv/lib/python3.12/site-packages/flask/app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ragflow/.venv/lib/python3.12/site-packages/flask_login/utils.py", line 290, in decorated_view
    return current_app.ensure_sync(func)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ragflow/api/apps/chunk_app.py", line 311, in knowledge_graph
    sres = settings.retrievaler.search(req, search.index_name(tenant_id), kb_ids)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<@beartype(rag.nlp.search.Dealer.search) at 0x3381fd800>", line 39, in search
beartype.roar.BeartypeCallHintParamViolation: Method rag.nlp.search.Dealer.search() parameter idx_names='ragflow_0e1e67f431d711ef98fc00155d29195d' violates type hint list[str], as str 'ragflow_0e1e67f431d711ef98fc00155d29195d' not instance of list.
2024-11-19 11:30:29,817 ERROR 91013 Method rag.nlp.search.Dealer.search() parameter idx_names='ragflow_0e1e67f431d711ef98fc00155d29195d' violates type hint list[str], as str 'ragflow_0e1e67f431d711ef98fc00155d29195d' not instance of list.
Traceback (most recent call last):
  File "ragflow/api/apps/chunk_app.py", line 60, in list_chunk
    sres = settings.retrievaler.search(query, search.index_name(tenant_id), kb_ids, highlight=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<@beartype(rag.nlp.search.Dealer.search) at 0x3381fd800>", line 39, in search
beartype.roar.BeartypeCallHintParamViolation: Method rag.nlp.search.Dealer.search() parameter idx_names='ragflow_0e1e67f431d711ef98fc00155d29195d' violates type hint list[str], as str 'ragflow_0e1e67f431d711ef98fc00155d29195d' not instance of list.


This happens because, in nlp/search.py, idx_names is typed as a list only

<img width="1098" alt="Pasted Graphic 2"
src="https://github.com/user-attachments/assets/4998cb1e-94bc-470b-b2f4-41ecb5b08f8a">

but the DocStoreConnection.search method accepts either a list or a str
<img width="1175" alt="Pasted Graphic 3"
src="https://github.com/user-attachments/assets/ee918b4a-87a5-42c9-a6d2-d0db0884b875">


and its implementations also accept both list and str:
es_conn.py

<img width="1121" alt="Pasted Graphic 4"
src="https://github.com/user-attachments/assets/3e6dc030-0a0d-416c-8fd4-0b4cfd576f8c">

infinity_conn.py

<img width="1221" alt="Pasted Graphic 5"
src="https://github.com/user-attachments/assets/44edac2b-6b81-45b0-a3fc-cb1c63219015">

2. Fix the task_executor server failing to start with Unresolved
reference 'Mapping'
<img width="1283" alt="Pasted Graphic 6"
src="https://github.com/user-attachments/assets/421f17b8-d0a5-46d3-bc4d-d05dc9dfc934">
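
The type mismatch in item 1 can be illustrated with a small sketch. This is not the change made in the PR; `normalize_index_names` is a hypothetical helper showing one way a caller-facing function could accept either a single index name or a list of names and hand a list onward:

```python
from typing import Union


def normalize_index_names(idx_names: Union[str, list[str]]) -> list[str]:
    """Accept a single index name or a list of names; always return a list."""
    if isinstance(idx_names, str):
        # A bare string is wrapped rather than iterated character by character.
        return [idx_names]
    return list(idx_names)
```

Under a runtime type checker, a parameter hinted as `list[str]` rejects a bare `str`, so either the hint is widened or the value is normalized before the call.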

### Type of change

- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2024-11-19 14:36:10 +08:00
Zhichang Yu
dec9b3e540
Fix logs. Use dict.pop instead of del. Close #3473 (#3484)
### What problem does this PR solve?

Fix logs. Use dict.pop instead of del.
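
A short sketch of the difference between the two, using a hypothetical `record` dictionary and key names chosen only for illustration:

```python
record = {"msg": "task queued"}

# `del` raises KeyError when the key is absent.
try:
    del record["progress"]
    del_raised = False
except KeyError:
    del_raised = True

# `pop` with a default removes the key if present and is a no-op otherwise.
removed = record.pop("progress", None)
```

Passing a default to `pop` makes the removal safe when the key may or may not exist.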

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-19 14:15:25 +08:00
Zhichang Yu
4413683898
Introduced beartype (#3460)
### What problem does this PR solve?

Introduced [beartype](https://github.com/beartype/beartype) for runtime
type-checking.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-11-18 17:38:17 +08:00
Kevin Hu
cb3b9d7ada
refine the message of queuing a task (#3437)
### What problem does this PR solve?



### Type of change

- [x] Refactoring
2024-11-15 15:59:54 +08:00
Kevin Hu
ca9e97d2f2
Enlarge the term weight difference (#3435)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2024-11-15 15:41:50 +08:00
Kevin Hu
48e060aa53
rm es query escape chars (#3428)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-15 13:19:07 +08:00
Kevin Hu
a1ba228bc2
fix: empty token bug (#3424)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-15 10:33:03 +08:00
Kevin Hu
220aaddc62
fix: synonym bug (#3423)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-15 10:14:51 +08:00
Zhichang Yu
30f6421760
Use consistent log file names, introduced initLogger (#3403)
### What problem does this PR solve?

Use consistent log file names, introduced initLogger

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2024-11-14 17:13:48 +08:00
Kevin Hu
c5368c7745
resolve halt while starting up (#3397)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-14 13:20:17 +08:00
Kevin Hu
91332fa0f8
Refine english synonym (#3371)
### What problem does this PR solve?

#3361

### Type of change

- [x] Performance Improvement
2024-11-13 12:58:37 +08:00
Zhichang Yu
a2a5631da4
Rework logging (#3358)
Unified all log files into one.

### What problem does this PR solve?

Unified all log files into one.

### Type of change

- [x] Refactoring
2024-11-12 17:35:13 +08:00
Zhichang Yu
f4c52371ab
Integration with Infinity (#2894)
### What problem does this PR solve?

Integration with Infinity

- Replaced ELASTICSEARCH with dataStoreConn
- Renamed deleteByQuery to delete
- Renamed bulk to upsertBulk
- getHighlight, getAggregation
- Fix KGSearch.search
- Moved Dealer.sql_retrieval to es_conn.py


### Type of change

- [x] Refactoring
2024-11-12 14:59:41 +08:00
Kevin Hu
004487cca0
fix term weight issue (#3306)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-08 18:25:23 +08:00
Kevin Hu
8b6e272197
fix: term weight issue (#3294)
### What problem does this PR solve?



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-11-08 15:49:44 +08:00
Kevin Hu
d88f0d43ea
make language judgement robuster (#3287)
### What problem does this PR solve?



### Type of change

- [x] Performance Improvement
2024-11-08 12:48:11 +08:00
Kevin Hu
fbcc0bb408
accelerate tokenize (#3244)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2024-11-06 18:54:41 +08:00