36 Commits

Author SHA1 Message Date
Kevin Hu
cafdee536f
add sql to naive parser (#1908)
### What problem does this PR solve?


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2024-08-12 15:29:33 +08:00
黄腾
ede733e130
add support for eml file parser (#1768)
### What problem does this PR solve?

add support for eml file parser
#1363
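
For illustration only, an `.eml` message can be read with Python's standard `email` package. This is a minimal sketch and not necessarily how the parser in this PR is implemented; the function name `extract_text_from_eml_bytes` and the choice to return the subject followed by the body are assumptions made for the example.

```python
from email import policy
from email.parser import BytesParser


def extract_text_from_eml_bytes(eml_bytes: bytes) -> str:
    """Return the subject line followed by the message body as plain text."""
    # policy.default yields EmailMessage objects, which provide get_body().
    msg = BytesParser(policy=policy.default).parsebytes(eml_bytes)
    subject = msg["subject"] or ""
    # Prefer the text/plain part and fall back to text/html if that is all there is.
    body = msg.get_body(preferencelist=("plain", "html"))
    text = body.get_content() if body is not None else ""
    return f"{subject}\n{text}"
```

Attachments are ignored in this sketch; a real parser would probably also walk `msg.iter_attachments()`.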

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-08-06 16:42:14 +08:00
zhuhao
5e19423d82
support resetting the user email (#1735)
### What problem does this PR solve?

Supports resetting a user's email from the old address to a new one.
#1723 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-07-29 19:36:16 +08:00
zhuhao
792a1a9d91
add password reset function by extending the Flask command (#1632)
### What problem does this PR solve?
Adds a password reset function by extending the Flask command. #1200

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-07-23 14:02:41 +08:00
zhuhao
9352a09c53
modify the encryption to first perform base64 encoding and then encrypt (#1621)
### What problem does this PR solve?

The encryption step should base64-encode the input first and then encrypt
the encoded value, to stay consistent with the frontend.

#1620 
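
The ordering described above can be sketched as follows. This is only an illustration of "encode, then encrypt": the cipher is passed in as a callable, and `toy_cipher` is a hypothetical, insecure XOR placeholder standing in for whatever cipher the project actually uses. All function names here are assumptions.

```python
import base64
from typing import Callable


def encode_then_encrypt(plaintext: str, encrypt: Callable[[bytes], bytes]) -> bytes:
    """Base64-encode the text first, then hand the encoded bytes to the cipher."""
    encoded = base64.b64encode(plaintext.encode("utf-8"))
    return encrypt(encoded)


def decrypt_then_decode(ciphertext: bytes, decrypt: Callable[[bytes], bytes]) -> str:
    """Reverse the steps: decrypt first, then base64-decode the result."""
    encoded = decrypt(ciphertext)
    return base64.b64decode(encoded).decode("utf-8")


def toy_cipher(data: bytes) -> bytes:
    """Placeholder only: XOR with a fixed byte. NOT secure; it is its own inverse."""
    return bytes(b ^ 0x5A for b in data)


ciphertext = encode_then_encrypt("secret", toy_cipher)
recovered = decrypt_then_decode(ciphertext, toy_cipher)
```

Both sides must apply the steps in the same order; if one side encrypts the raw text and the other expects base64 inside the ciphertext, decoding fails.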

### Type of change

- [x] Refactoring
2024-07-22 09:24:45 +08:00
cecilia-uu
1defc83506
API: create update_doc method (#1341)
### What problem does this PR solve?

Adds the API method for updating documents.


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-07-03 15:14:34 +08:00
KevinHuSh
abcd3d2469
refactor (#1124)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2024-06-12 11:02:15 +08:00
Jin Hai
cf2f6592dd
API: create dataset (#1106)
### What problem does this PR solve?

This PR implements 'create dataset' for both the HTTP API and the Python SDK.
HTTP API:
```
curl --request POST --url http://<HOST_ADDRESS>/api/v1/dataset   --header 'Content-Type: application/json' --header 'Authorization: <ACCESS_KEY>' --data-binary '{
  "name": "<DATASET_NAME>"
}'
```

Python SDK:
```
from ragflow.ragflow import RAGFLow
ragflow = RAGFLow('<ACCESS_KEY>', 'http://127.0.0.1:9380')
ragflow.create_dataset("dataset1")

```
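
The curl request above can also be built with Python's standard library. This is a sketch that mirrors the curl example; the helper name `build_create_dataset_request` is hypothetical, and the host and key values are placeholders.

```python
import json
import urllib.request


def build_create_dataset_request(
    host_address: str, access_key: str, name: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host_address}/api/v1/dataset",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": access_key,
        },
    )


req = build_create_dataset_request("127.0.0.1:9380", "<ACCESS_KEY>", "dataset1")
# urllib.request.urlopen(req) would send it; omitted here because it needs a running server.
```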

TODO: 
- ACCESS_KEY is currently the login_token issued when a user logs in to RAGFlow.
RAGFlow should let users add and delete an access_key themselves.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update

---------

Signed-off-by: Jin Hai <haijin.chn@gmail.com>
2024-06-11 11:16:37 +08:00
Fakai Zhao
7eb69fe6d9
Supports obtaining PDF documents from web pages (#1107)
### What problem does this PR solve?

Knowledge base management now supports crawling information from web pages
and generating PDF documents from it.

### Type of change
- [x] New Feature (Support document from web pages)
2024-06-11 10:45:19 +08:00
Wang Baoling
d0951ee27b
fix: logger formatter does not work (#1090)
### What problem does this PR solve?

As the title says.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-06-07 13:48:56 +08:00
KevinHuSh
0171082cc5
fix create dialog bug (#982)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-30 09:25:05 +08:00
Zhedong Cen
8dd45459be
Add support for HTML file (#973)
### What problem does this PR solve?

Add support for HTML file

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-30 09:12:55 +08:00
KevinHuSh
c3bc72dfd9
fix too large thumbnail issue (#817)
### What problem does this PR solve?

#709

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-05-17 14:04:21 +08:00
KevinHuSh
95f809187e
add stream chat (#811)
### What problem does this PR solve?

#709 
### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-16 20:14:53 +08:00
dashi6174
6ff63ee2ba
Support parsing code files (#789)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-05-15 16:34:28 +08:00
KevinHuSh
cab274f560
remove PyMuPDF (#618)
### What problem does this PR solve?
#613 

### Type of change


- [x] Other (please describe):
2024-04-30 12:38:09 +08:00
KevinHuSh
674b3aeafd
fix disable and enable llm setting in dialog (#616)
### What problem does this PR solve?
#614 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-04-30 11:04:14 +08:00
KevinHuSh
2af74cc494
refine docker layers (#606)
### What problem does this PR solve?


### Type of change

- [x] Performance Improvement
2024-04-29 17:57:40 +08:00
KevinHuSh
f69ff39fa0
add file management feature (#560)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2024-04-26 17:21:53 +08:00
chrysanthemum-boy
72384b191d
Add .doc file parser. (#497)
### What problem does this PR solve?
Add `.doc` file parser, using tika.
```
pip install tika
```
```
from tika import parser
from io import BytesIO

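def extract_text_from_doc_bytes(doc_bytes):
    # Wrap the raw bytes in a file-like object and hand it to the Tika server.
    file_like_object = BytesIO(doc_bytes)
    parsed = parser.from_buffer(file_like_object)
    # "content" holds the extracted text; it may be None if nothing was extracted.
    return parsed["content"]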
```
### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: chrysanthemum-boy <fannc@qq.com>
2024-04-23 15:31:43 +08:00
KevinHuSh
b8e58fe27a
add redis to accelerate access of minio (#482)
### What problem does this PR solve?

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-04-22 14:11:09 +08:00
KevinHuSh
f6c7204002
refine log format (#312)
### What problem does this PR solve?

Issue link: #264
### Type of change


- [x] Documentation Update
- [x] Refactoring
2024-04-11 10:13:43 +08:00
KevinHuSh
121c7a5681
refine error response, add set api-key MD (#178)
2024-03-31 19:09:42 +08:00
KevinHuSh
fd7fcb5baf
apply PEP 8 formatting (#155)
2024-03-27 11:33:46 +08:00
KevinHuSh
39269d2f79
add dockerfile and fix trivial bugs (#78)
2024-02-28 15:01:12 +08:00
KevinHuSh
d1c600d5d3
add ocr and recognizer demo, update README (#74)
2024-02-26 19:51:35 +08:00
KevinHuSh
7fd1eca582
init README of deepdoc, add picture processor. (#71)
* init README of deepdoc, add picture processor.

* add resume parsing
2024-02-23 18:28:12 +08:00
KevinHuSh
a8294f2168
Refine resume parts and fix bugs in retrieval using sql (#66)
2024-02-19 19:22:17 +08:00
KevinHuSh
c5ea37cd30
Add resume parser and fix bugs (#59)
* Update .gitignore

* Update .gitignore

* Add resume parser and fix bugs
2024-02-07 19:27:23 +08:00
KevinHuSh
e6acaf6738
Add Q&A and Book, fix task running bugs (#50)
2024-02-01 18:53:56 +08:00
KevinHuSh
072f9dd5bc
Add app to rag module: presentation & laws (#43)
2024-01-25 18:57:39 +08:00
KevinHuSh
e32ef75e99
Test chat API and refine ppt chunker (#42)
2024-01-23 19:45:36 +08:00
KevinHuSh
34b2ab3b2f
Test APIs and fix bugs (#41)
2024-01-22 19:51:38 +08:00
KevinHuSh
484e5abc1f
llm configuration refine and retrievalTest API refine (#40)
2024-01-19 19:51:57 +08:00
KevinHuSh
9bf75d4511
add dialog api (#33)
2024-01-17 20:20:42 +08:00
KevinHuSh
6be3dd56fa
rename web_server to api (#29)
* add front end code

* change licence

* rename web_server to API

* change name to web_server
2024-01-17 09:43:27 +08:00