Kevin Hu
ce1e855328
Upgrades Document Layout Analysis model. ( #4054 )
...
### What problem does this PR solve?
#4052
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2024-12-17 11:27:19 +08:00
Kevin Hu
cb6e9ce164
Cache the result from llm for graphrag and raptor ( #4051 )
...
### What problem does this PR solve?
#4045
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2024-12-17 09:48:03 +08:00
Kevin Hu
7fb67c4f67
Fix chunk number error after re-parsing. ( #4043 )
...
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2024-12-16 15:23:49 +08:00
Zhichang Yu
c8b1a564aa
Replaced md5 with xxhash64 for chunk id ( #4009 )
...
### What problem does this PR solve?
Replaced md5 with xxhash64 for chunk id
### Type of change
- [x] Refactoring
2024-12-12 17:47:39 +08:00
Zhichang Yu
301f95837c
Try to reuse existing chunks ( #3983 )
...
### What problem does this PR solve?
Try to reuse existing chunks. Close #3793
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2024-12-12 16:38:03 +08:00
Zhichang Yu
0d68a6cd1b
Fix errors detected by Ruff ( #3918 )
...
### What problem does this PR solve?
Fix errors detected by Ruff
### Type of change
- [x] Refactoring
2024-12-08 14:21:12 +08:00
Kevin Hu
74b28ef1b0
Add pagerank to KB. ( #3809 )
...
### What problem does this PR solve?
#3794
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2024-12-03 14:30:35 +08:00
Zhichang Yu
4ed5ca2666
handle_task catches all exceptions ( #3441 )
...
### What problem does this PR solve?
handle_task catches all exceptions
Report heartbeats
### Type of change
- [x] Refactoring
2024-11-15 18:51:09 +08:00
yqkcn
57237634f1
Refactoring large integers to improve readability ( #2636 )
...
### What problem does this PR solve?
Refactoring large integers
### Type of change
- [x] Refactoring
2024-09-29 10:17:42 +08:00
Fachuan Bai
8dd3adc443
Storage: Support S3 and Azure Blob as the object storage of RAGFlow. ( #2278 )
...
### What problem does this PR solve?
issue: https://github.com/infiniflow/ragflow/issues/2277
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-09-09 09:41:14 +08:00
Kevin Hu
fc1ac3a962
fix delete message error ( #2153 )
...
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2024-08-29 14:07:14 +08:00
Kevin Hu
212bb8e601
add retry count to task ( #2152 )
...
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2024-08-29 13:31:41 +08:00
Jin Hai
6b3a40be5c
Convert file format from Windows/DOS to Unix ( #1949 )
...
### What problem does this PR solve?
The related source files were in Windows/DOS format; they have been
converted to Unix format.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2024-08-15 09:17:36 +08:00
Kevin Hu
152072f900
Add graphrag ( #1793 )
...
### What problem does this PR solve?
#1594
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2024-08-02 18:51:14 +08:00
KevinHuSh
2023fdc13e
fix file preview in file management ( #1151 )
...
### What problem does this PR solve?
fix file preview in file management
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2024-06-14 10:33:59 +08:00
KevinHuSh
6f99bbbb08
add raptor ( #899 )
...
### What problem does this PR solve?
#882
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2024-05-23 14:31:16 +08:00
KevinHuSh
d8c080ee52
fix bugs in searching file using keywords ( #780 )
...
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-15 12:51:57 +08:00
KevinHuSh
7ddb2f19be
make sure to raise exception if redis is not there ( #674 )
...
### What problem does this PR solve?
### Type of change
- [x] Refactoring
2024-05-08 15:20:45 +08:00
KevinHuSh
8d6d7f6887
fix task loss issue ( #665 )
...
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 20:46:45 +08:00
KevinHuSh
a5aed2412f
fix bugs ( #662 )
...
### What problem does this PR solve?
Fix import error for task_service.py
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 16:41:56 +08:00
Fakai Zhao
de839fc3f0
optimize srv broker and executor logic ( #630 )
...
### What problem does this PR solve?
Optimize task broker and executor to reduce memory usage and deployment
complexity.
### Type of change
- [x] Performance Improvement
- [x] Refactoring
### Change Log
- Enhance Redis utils for the message queue (use streams)
- Modify task broker logic to go through the message queue (1. get parse
events from the message queue; 2. run them asynchronously with a
ThreadPoolExecutor)
- Rename the table column of document and task (process_duation ->
process_duration; probably just a spelling mistake)
- Reformat some code style (only what I noticed)
- Add requirement_dev.txt for developers
- Add a Redis container to docker compose
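The broker flow described in the change log above (pull parse events from a queue, then hand each one to a thread pool) can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: an in-process `queue.Queue` stands in for the Redis stream, and `handle_parse_event` and `run_broker` are hypothetical names.

```python
import queue
from concurrent.futures import ThreadPoolExecutor


def handle_parse_event(event: dict) -> str:
    # Hypothetical handler: the real executor would parse the document here.
    return f"parsed {event['doc_id']}"


def run_broker(events: queue.Queue, max_workers: int = 4) -> list:
    """Drain the queue and run each event on a worker thread."""
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while True:
            try:
                # Step 1: get a parse event from the message queue.
                event = events.get_nowait()
            except queue.Empty:
                break
            # Step 2: submit it to the thread pool for asynchronous execution.
            futures.append(pool.submit(handle_parse_event, event))
    # Collect results in submission order.
    return [f.result() for f in futures]


# Assumed stand-in for the Redis stream: a plain in-memory queue.
q = queue.Queue()
for doc_id in ("a", "b", "c"):
    q.put({"doc_id": doc_id})

results = run_broker(q)
```

A real deployment would block on the stream instead of draining once, and would acknowledge each message only after its handler succeeds.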
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-05-07 11:43:33 +08:00
KevinHuSh
8c07992b6c
refine code ( #595 )
...
### What problem does this PR solve?
### Type of change
- [x] Refactoring
2024-04-28 19:13:33 +08:00
KevinHuSh
944776f207
fix bug about fetching file from minio ( #574 )
...
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-28 09:57:40 +08:00
KevinHuSh
66f8d35632
Refactor ( #537 )
...
### What problem does this PR solve?
### Type of change
- [x] Refactoring
2024-04-25 14:14:28 +08:00
KevinHuSh
ed6081845a
Support many encodings for text files. ( #458 )
...
### What problem does this PR solve?
#384
### Type of change
- [x] Performance Improvement
2024-04-19 18:02:53 +08:00
KevinHuSh
890561703b
Add bce-embedding and fastembed ( #383 )
...
### What problem does this PR solve?
Issue link: #326
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
2024-04-16 16:42:19 +08:00
KevinHuSh
0feb085c88
refine table parser ( #120 )
2024-03-12 18:56:04 +08:00
KevinHuSh
602038ac49
fix task canceling bug ( #98 )
2024-03-05 16:33:47 +08:00
KevinHuSh
8a726fb04b
solve task execution issues ( #90 )
2024-03-01 19:48:01 +08:00
KevinHuSh
7fd1eca582
init README of deepdoc, add picture processor. ( #71 )
...
* init README of deepdoc, add picture processor.
* add resume parsing
2024-02-23 18:28:12 +08:00
KevinHuSh
407b2523b6
remove unused code, separate layout detection out as a new API. Add new RAG method 'table' ( #55 )
2024-02-05 18:08:17 +08:00
KevinHuSh
e6acaf6738
Add Q&A and Book, fix task running bugs ( #50 )
2024-02-01 18:53:56 +08:00
KevinHuSh
6224edcd1b
Add task module, and pipeline the task and every parser ( #49 )
2024-01-31 19:57:45 +08:00