133 Commits

Author SHA1 Message Date
KevinHuSh
7eee193956
fix #917 #915 (#946)
### What problem does this PR solve?

#917 
#915

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-28 11:13:02 +08:00
yungongzi
9ffd7ae321
Added support for Baichuan LLM (#934)
### What problem does this PR solve?

- Added support for Baichuan LLM

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: 海贼宅 <stu_xyx@163.com>
2024-05-28 09:09:37 +08:00
KevinHuSh
46454362d7
fix raptor bugs (#928)
### What problem does this PR solve?

#922 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-27 11:01:20 +08:00
yungongzi
c0d71adaa2
Bug fix for volcengine (#909)
### What problem does this PR solve?
Bug fixes for the VolcEngine

- Bug fix for front-end configuration code of VolcEngine

- Bug fix for tokens counting logic of VolcEngine


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: 海贼宅 <stu_xyx@163.com>
2024-05-24 11:34:39 +08:00
KevinHuSh
6f99bbbb08
add raptor (#899)
### What problem does this PR solve?

#882 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-23 14:31:16 +08:00
yungongzi
eb51ad73d6
Add support for VolcEngine - the current version supports SDK2 (#885)
- The main idea is to assemble **ak**, **sk**, and **ep_id** into a
dictionary and store it in the database **api_key** field
- I don’t know much about the front-end, so I learned from Ollama, which
may be redundant.

### Configuration method

- model name

- Format requirements: {"VolcEngine model name":"endpoint_id"}
    - For example: {"Skylark-pro-32K":"ep-xxxxxxxxx"}
    
- Volcano ACCESS_KEY
- Format requirements: VOLC_ACCESSKEY of the volcano engine
corresponding to the model

- Volcano SECRET_KEY
- Format requirements: VOLC_SECRETKEY of the volcano engine
corresponding to the model
    
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-23 11:15:29 +08:00
dashi6174
21453ffff0
fixed: The choices may be empty. (#876)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-22 15:29:07 +08:00
GYH
be13429d05
Add api/list_kb_docs function and modify api/list_chunks (#874)
### What problem does this PR solve?
#717 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-22 14:58:56 +08:00
KevinHuSh
a12fcf9156
fix minio helth bug (#850)
### What problem does this PR solve?

#643 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-20 19:35:30 +08:00
GYH
c27c02ea67
Split Excel file into different chunks (#847)
### What problem does this PR solve?


Split Excel into different chunk
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-20 18:35:15 +08:00
KevinHuSh
a7bd427116
add locally deployed llm (#841)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-20 12:40:59 +08:00
KevinHuSh
2b36283712
fix english query bug (#840)
### What problem does this PR solve?

#834 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-20 12:23:51 +08:00
KevinHuSh
f6a599461f
fix zhipuAI stream issue (#825)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-17 17:07:33 +08:00
KevinHuSh
d1614107e2
fix stream chat for ollama (#816)
### What problem does this PR solve?

#709

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-17 12:07:00 +08:00
KevinHuSh
95f809187e
add stream chat (#811)
### What problem does this PR solve?

#709 
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-16 20:14:53 +08:00
dashi6174
6ff63ee2ba
Support for code files parse (#789)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-15 16:34:28 +08:00
KevinHuSh
4d47b2b459
fix a string format error (#781)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-15 13:02:31 +08:00
KevinHuSh
aa1c915d6e
support gpt-4o (#773)
### What problem does this PR solve?
#771 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-15 11:16:08 +08:00
Jin Hai
77b1520b66
Refactor message output format (#772)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Refactoring

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2024-05-15 10:48:42 +08:00
KevinHuSh
ffe5737f7d
let index be batchly. (#733)
### What problem does this PR solve?

let index be batchly.

### Type of change


- [x] Refactoring
2024-05-11 19:47:53 +08:00
KevinHuSh
04a9e95161
let file in knowledgebases visible in file manager (#714)
### What problem does this PR solve?

Let file in knowledgebases visible in file manager.
#162 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-11 16:04:28 +08:00
KevinHuSh
648a2baaa9
fix disabled doc is still retreivalable (#695)
### What problem does this PR solve?

Fix that disabled doc is still retreivalable

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-09 15:32:24 +08:00
KevinHuSh
4153a36683
truncate text to fitin embedding model (#692)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2024-05-09 11:35:08 +08:00
KevinHuSh
c28f7b5d38
make sure the error will be recorded. (#672)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2024-05-08 13:58:41 +08:00
KevinHuSh
eb27a4309e
add support for deepseek (#668)
### What problem does this PR solve?

#666 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-08 10:30:02 +08:00
KevinHuSh
a6e4b74d94
remove unused dependency (#664)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 19:46:17 +08:00
KevinHuSh
7013d7f620
refine text decode (#657)
### What problem does this PR solve?
#651 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 12:25:47 +08:00
Fakai Zhao
de839fc3f0
optimize srv broker and executor logic (#630)
### What problem does this PR solve?

Optimize task broker and executor for reduce memory usage and deployment
complexity.

### Type of change
- [x] Performance Improvement
- [x] Refactoring

### Change Log
- Enhance redis utils for message queue(use stream)
- Modify task broker logic via message queue (1.get parse event from
message queue 2.use ThreadPoolExecutor async executor )
- Modify the table column name of document and task (process_duation ->
process_duration maybe just a spelling mistake)
- Reformat some code style(just what i see)
- Add requirement_dev.txt for developer
- Add redis container on docker compose

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-05-07 11:43:33 +08:00
KevinHuSh
c6b6c748ae
fix file encoding detection bug (#653)
### What problem does this PR solve?

#651 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-07 10:01:24 +08:00
KevinHuSh
5f03a4de11
remove redis (#629)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-30 19:00:41 +08:00
KevinHuSh
7d3b68bb1e
refine code (#626)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2024-04-30 17:53:28 +08:00
KevinHuSh
cab274f560
remove PyMuPDF (#618)
### What problem does this PR solve?
#613 

### Type of change


- [x] Other (please describe):
2024-04-30 12:38:09 +08:00
KevinHuSh
674b3aeafd
fix disable and enable llm setting in dialog (#616)
### What problem does this PR solve?
#614 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-30 11:04:14 +08:00
KevinHuSh
2af74cc494
refine docker layers (#606)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2024-04-29 17:57:40 +08:00
KevinHuSh
8acc01a227
refine redis connection (#599)
### What problem does this PR solve?

#591 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-29 08:52:38 +08:00
KevinHuSh
8c07992b6c
refine code (#595)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-28 19:13:33 +08:00
KevinHuSh
9d60a84958
refactor code (#583)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-28 13:19:54 +08:00
KevinHuSh
944776f207
fix bug about fetching file from minio (#574)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-28 09:57:40 +08:00
Jin Hai
f1c98aad6b
Update version info (#564)
### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [x] Documentation Update
- [x] Refactoring

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2024-04-26 20:07:26 +08:00
KevinHuSh
66f8d35632
Refactor (#537)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-25 14:14:28 +08:00
KevinHuSh
369400c483
fix bug of table in docx (#510)
### What problem does this PR solve?
#509 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-23 19:10:33 +08:00
KevinHuSh
aa71462a9f
fix bug #502 (#504)
### What problem does this PR solve?

#502 
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-23 16:01:46 +08:00
chrysanthemum-boy
72384b191d
Add .doc file parser. (#497)
### What problem does this PR solve?
Add `.doc` file parser, using tika.
```
pip install tika
```
```
from tika import parser
from io import BytesIO

def extract_text_from_doc_bytes(doc_bytes):
    file_like_object = BytesIO(doc_bytes)
    parsed = parser.from_buffer(file_like_object)
    return parsed["content"]
```
### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: chrysanthemum-boy <fannc@qq.com>
2024-04-23 15:31:43 +08:00
KevinHuSh
0dfc8ddc0f
enlarge docker memory usage (#501)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2024-04-23 14:41:10 +08:00
KevinHuSh
a38e163035
remove doc from supported processing types (#488)
### What problem does this PR solve?
#474 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-22 15:46:09 +08:00
KevinHuSh
3610e1e5b4
fix ollama issuet push (#486)
### What problem does this PR solve?

#477 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-22 15:13:01 +08:00
Shaun
11949f9f2e
feat: support markdown files (#483)
parse markdown files as txt

### What problem does this PR solve?

support markdown files

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-22 14:43:36 +08:00
KevinHuSh
b8e58fe27a
add redis to accelerate access of minio (#482)
### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-22 14:11:09 +08:00
KevinHuSh
7e41b4bc94
change readme for 0.3.0 release (#459)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2024-04-19 18:19:15 +08:00
KevinHuSh
ed6081845a
Fit a lot of encodings for text file. (#458)
### What problem does this PR solve?

#384

### Type of change

- [x] Performance Improvement
2024-04-19 18:02:53 +08:00