125 Commits

Author SHA1 Message Date
KevinHuSh
a12fcf9156
fix minio helth bug (#850)
### What problem does this PR solve?

#643 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-20 19:35:30 +08:00
GYH
c27c02ea67
Split Excel file into different chunks (#847)
### What problem does this PR solve?


Split Excel into different chunk
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-20 18:35:15 +08:00
KevinHuSh
a7bd427116
add locally deployed llm (#841)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-20 12:40:59 +08:00
KevinHuSh
2b36283712
fix english query bug (#840)
### What problem does this PR solve?

#834 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-20 12:23:51 +08:00
KevinHuSh
f6a599461f
fix zhipuAI stream issue (#825)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-17 17:07:33 +08:00
KevinHuSh
d1614107e2
fix stream chat for ollama (#816)
### What problem does this PR solve?

#709

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-17 12:07:00 +08:00
KevinHuSh
95f809187e
add stream chat (#811)
### What problem does this PR solve?

#709 
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-16 20:14:53 +08:00
dashi6174
6ff63ee2ba
Support for code files parse (#789)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-15 16:34:28 +08:00
KevinHuSh
4d47b2b459
fix a string format error (#781)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-15 13:02:31 +08:00
KevinHuSh
aa1c915d6e
support gpt-4o (#773)
### What problem does this PR solve?
#771 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-15 11:16:08 +08:00
Jin Hai
77b1520b66
Refactor message output format (#772)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2024-05-15 10:48:42 +08:00
KevinHuSh
ffe5737f7d
let index be batchly. (#733)
### What problem does this PR solve?

let index be batchly.

### Type of change


- [x] Refactoring
2024-05-11 19:47:53 +08:00
KevinHuSh
04a9e95161
let file in knowledgebases visible in file manager (#714)
### What problem does this PR solve?

Let file in knowledgebases visible in file manager.
#162 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-11 16:04:28 +08:00
KevinHuSh
648a2baaa9
fix disabled doc is still retreivalable (#695)
### What problem does this PR solve?

Fix that disabled doc is still retreivalable

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-09 15:32:24 +08:00
KevinHuSh
4153a36683
truncate text to fitin embedding model (#692)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2024-05-09 11:35:08 +08:00
KevinHuSh
c28f7b5d38
make sure the error will be recorded. (#672)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2024-05-08 13:58:41 +08:00
KevinHuSh
eb27a4309e
add support for deepseek (#668)
### What problem does this PR solve?

#666 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-08 10:30:02 +08:00
KevinHuSh
a6e4b74d94
remove unused dependency (#664)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 19:46:17 +08:00
KevinHuSh
7013d7f620
refine text decode (#657)
### What problem does this PR solve?
#651 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 12:25:47 +08:00
Fakai Zhao
de839fc3f0
optimize srv broker and executor logic (#630)
### What problem does this PR solve?

Optimize task broker and executor for reduce memory usage and deployment
complexity.

### Type of change
- [x] Performance Improvement
- [x] Refactoring

### Change Log
- Enhance redis utils for message queue(use stream)
- Modify task broker logic via message queue (1.get parse event from
message queue 2.use ThreadPoolExecutor async executor )
- Modify the table column name of document and task (process_duation ->
process_duration maybe just a spelling mistake)
- Reformat some code style(just what i see)
- Add requirement_dev.txt for developer
- Add redis container on docker compose

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-05-07 11:43:33 +08:00
KevinHuSh
c6b6c748ae
fix file encoding detection bug (#653)
### What problem does this PR solve?

#651 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 10:01:24 +08:00
KevinHuSh
5f03a4de11
remove redis (#629)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-30 19:00:41 +08:00
KevinHuSh
7d3b68bb1e
refine code (#626)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2024-04-30 17:53:28 +08:00
KevinHuSh
cab274f560
remove PyMuPDF (#618)
### What problem does this PR solve?
#613 

### Type of change


- [x] Other (please describe):
2024-04-30 12:38:09 +08:00
KevinHuSh
674b3aeafd
fix disable and enable llm setting in dialog (#616)
### What problem does this PR solve?
#614 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-30 11:04:14 +08:00
KevinHuSh
2af74cc494
refine docker layers (#606)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2024-04-29 17:57:40 +08:00
KevinHuSh
8acc01a227
refine redis connection (#599)
### What problem does this PR solve?

#591 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-29 08:52:38 +08:00
KevinHuSh
8c07992b6c
refine code (#595)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-28 19:13:33 +08:00
KevinHuSh
9d60a84958
refactor code (#583)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-28 13:19:54 +08:00
KevinHuSh
944776f207
fix bug about fetching file from minio (#574)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-28 09:57:40 +08:00
Jin Hai
f1c98aad6b
Update version info (#564)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Documentation Update
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2024-04-26 20:07:26 +08:00
KevinHuSh
66f8d35632
Refactor (#537)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-25 14:14:28 +08:00
KevinHuSh
369400c483
fix bug of table in docx (#510)
### What problem does this PR solve?
#509 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-23 19:10:33 +08:00
KevinHuSh
aa71462a9f
fix bug #502 (#504)
### What problem does this PR solve?

#502 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-23 16:01:46 +08:00
chrysanthemum-boy
72384b191d
Add .doc file parser. (#497)
### What problem does this PR solve?
Add `.doc` file parser, using tika.
```
pip install tika
```
```
from tika import parser
from io import BytesIO

def extract_text_from_doc_bytes(doc_bytes):
    file_like_object = BytesIO(doc_bytes)
    parsed = parser.from_buffer(file_like_object)
    return parsed["content"]
```
### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: chrysanthemum-boy <fannc@qq.com>
2024-04-23 15:31:43 +08:00
KevinHuSh
0dfc8ddc0f
enlarge docker memory usage (#501)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-23 14:41:10 +08:00
KevinHuSh
a38e163035
remove doc from supported processing types (#488)
### What problem does this PR solve?
#474 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-22 15:46:09 +08:00
KevinHuSh
3610e1e5b4
fix ollama issuet push (#486)
### What problem does this PR solve?

#477 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-22 15:13:01 +08:00
Shaun
11949f9f2e
feat: support markdown files (#483)
parse markdown files as txt

### What problem does this PR solve?

support markdown files

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-22 14:43:36 +08:00
KevinHuSh
b8e58fe27a
add redis to accelerate access of minio (#482)
### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-22 14:11:09 +08:00
KevinHuSh
7e41b4bc94
change readme for 0.3.0 release (#459)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2024-04-19 18:19:15 +08:00
KevinHuSh
ed6081845a
Fit a lot of encodings for text file. (#458)
### What problem does this PR solve?

#384

### Type of change

- [x] Performance Improvement
2024-04-19 18:02:53 +08:00
KevinHuSh
453c29170f
make sure the models will not be load twice (#422)
### What problem does this PR solve?

#381 
### Type of change

- [x] Refactoring
2024-04-18 09:37:23 +08:00
YC
e8570da856
Update table.py to convert clmns to string (#414)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-17 19:48:11 +08:00
KevinHuSh
800b5c7aaa
fix bulk error for table method (#407)
### What problem does this PR solve?


Issue link:#366

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-17 12:17:14 +08:00
KevinHuSh
d4e0bfc8a5
fix gb2312 encoding issue (#394)
### What problem does this PR solve?

Issue link:#384
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-16 19:45:14 +08:00
KevinHuSh
890561703b
Add bce-embedding and fastembed (#383)
### What problem does this PR solve?


Issue link:#326

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-16 16:42:19 +08:00
Anush
826ad6a33a
feat: FastEmbed embedding support (#291)
### Description

Following up on https://github.com/infiniflow/ragflow/pull/275, this PR
adds support for FastEmbed model configurations.

The options are not exhaustive. You can find the full list
[here](https://qdrant.github.io/fastembed/examples/Supported_Models/).

P.S. I ran into OOM issues when building the image.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: KevinHuSh <kevinhu.sh@gmail.com>
2024-04-15 15:58:06 +08:00
KevinHuSh
c39b751600
conversation API backend update (#360)
### What problem does this PR solve?


Issue link:#345

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-15 14:43:44 +08:00
KevinHuSh
8ffc09cb5c
Support Xinference (#321)
### What problem does this PR solve?

Issue link:#299

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-11 18:25:37 +08:00