237 Commits

Author SHA1 Message Date
Kevin Hu
cafdee536f
add sql to naive parser (#1908)
### What problem does this PR solve?


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2024-08-12 15:29:33 +08:00
黄腾
e9e39d57ce
add support for Upstage (#1902)
### What problem does this PR solve?

#1853  add support for Upstage

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-08-12 11:06:25 +08:00
黄腾
94cb66ba80
add support for TogetherAI (#1890)
### What problem does this PR solve?

#1853 add support for TogetherAI

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-08-12 10:15:21 +08:00
黄腾
9a6dc89156
add support for PerfXCloud (#1883)
### What problem does this PR solve?

#1853  add support for PerfXCloud

### Type of change


- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-08-12 10:11:50 +08:00
H
fdd5b1b8cf
Fix token list, stats in api app.py (#1896)
### What problem does this PR solve?

#1842 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-08-09 19:03:01 +08:00
黄腾
411c645134
Enhance the robustness of the code (#1879)
### What problem does this PR solve?

Enhance the robustness of the code

### Type of change

- [x] Refactoring

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-08-09 10:18:08 +08:00
黄腾
aed1bbbcaa
add support for lepton (#1866)
### What problem does this PR solve?

add support for lepton
#1853

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-08-08 12:09:50 +08:00
Kung Quang
19ded65c66
Fix a "TypeError: expected string or buffer" bug in docx files extracted using Knowledge Graph. #1859 (#1865)
### What problem does this PR solve?

Fix a "TypeError: expected string or buffer" bug in docx files extracted
using Knowledge Graph. #1859
```
Traceback (most recent call last):
  File "//Users/XXX/ragflow/rag/svr/task_executor.py", line 149, in build
    cks = chunker.chunk(row["name"], binary=binary, from_page=row["from_page"],
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/XXX/ragflow/rag/app/knowledge_graph.py", line 18, in chunk
    chunks = build_knowlege_graph_chunks(tenant_id, sections, callback,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/XXX/ragflow/graphrag/index.py", line 87, in build_knowlege_graph_chunks
    tkn_cnt = num_tokens_from_string(chunks[i])
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/XXX/github/ragflow/rag/utils/__init__.py", line 79, in num_tokens_from_string
    num_tokens = len(encoder.encode(string))
                     ^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/XXX/tiktoken/core.py", line 116, in encode
    if match := _special_token_regex(disallowed_special).search(text):
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: expected string or buffer
```
The value passed in is a `dict`:
<img width="1689" alt="Pasted Graphic 3"
src="https://github.com/user-attachments/assets/e5ba5c45-df1d-4697-98c9-14365c839f20">
The correct type is `str`:
<img width="1725" alt="Pasted Graphic 2"
src="https://github.com/user-attachments/assets/e54d5e60-4ce4-4180-b394-24e485013534">
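
A minimal sketch of the kind of guard that avoids this class of error, assuming chunks may arrive as `str`, `bytes`, or `dict`. The helper names `to_text` and `count_tokens`, the `"text"` field convention, and the whitespace tokenizer are all hypothetical stand-ins, not the project's actual code or its real encoder:

```python
import json
from typing import Any


def to_text(chunk: Any) -> str:
    """Coerce a chunk to str so a tokenizer never receives a dict."""
    if isinstance(chunk, str):
        return chunk
    if isinstance(chunk, bytes):
        return chunk.decode("utf-8", errors="replace")
    if isinstance(chunk, dict):
        # Hypothetical convention: prefer a "text" field when one exists.
        text = chunk.get("text")
        if isinstance(text, str):
            return text
        return json.dumps(chunk, ensure_ascii=False)
    return str(chunk)


def count_tokens(chunk: Any) -> int:
    """Stand-in token counter: whitespace split instead of a real encoder."""
    return len(to_text(chunk).split())
```

Normalizing at the boundary keeps the type check in one place instead of at every call site that counts tokens.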

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-08-08 12:03:01 +08:00
黄腾
e34817c2a9
add support for cohere (#1849)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-08-07 18:40:51 +08:00
黄腾
60428c4ad2
fix LocalAI add bug (#1851)
### What problem does this PR solve?

#1848 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-08-07 18:10:42 +08:00
黄腾
ede733e130
add support for eml file parser (#1768)
### What problem does this PR solve?

add support for eml file parser
#1363

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-08-06 16:42:14 +08:00
黄腾
b67484e77d
add support for OpenAI-API-Compatible llm (#1787)
### What problem does this PR solve?

#1771  add support for OpenAI-API-Compatible

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-08-06 16:20:21 +08:00
Wang
66e4113e0b
fix: align limitation with env (#1819)
### What problem does this PR solve?

Processing a large file throws a "File size exceeds" error.
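
One way to keep such a limit consistent is to read it from a single environment variable wherever it is checked. The sketch below is illustrative only: the variable name `MAX_FILE_SIZE_MB`, the default of 128, and both function names are hypothetical and are not taken from this repository.

```python
import os
from typing import Mapping

DEFAULT_MAX_MB = 128  # hypothetical default, not the project's value


def max_file_size_bytes(env: Mapping[str, str] = os.environ) -> int:
    """Read the upload limit (in MB) from the environment, with a fallback."""
    raw = env.get("MAX_FILE_SIZE_MB", "")  # hypothetical variable name
    try:
        mb = int(raw)
    except ValueError:
        mb = DEFAULT_MAX_MB
    if mb <= 0:
        mb = DEFAULT_MAX_MB
    return mb * 1024 * 1024


def check_size(n_bytes: int, env: Mapping[str, str] = os.environ) -> None:
    """Raise if a file is larger than the configured limit."""
    limit = max_file_size_bytes(env)
    if n_bytes > limit:
        raise ValueError(f"File size exceeds {limit} bytes")
```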

### Type of change

- [x] Refactoring

Co-authored-by: Theta Wang (ncu) <chunshan.connect@gmail.com>
2024-08-06 16:04:51 +08:00
Kevin Hu
43199c45c3
refine log info about graphrag progress (#1823)
### What problem does this PR solve?



### Type of change

- [x] Refactoring
2024-08-06 16:01:43 +08:00
Kevin Hu
fe797bcc66
make better chunks before graphrag (#1811)
### What problem does this PR solve?

#1594

### Type of change

- [x] Refactoring
2024-08-05 16:21:52 +08:00
Kevin Hu
2452c5624f
remove duplicated key in mind map (#1809)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-08-05 15:57:33 +08:00
Kevin Hu
a5c03ccd4c
refine mindmap prompt (#1808)
### What problem does this PR solve?



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-08-05 15:33:44 +08:00
H
d2213141e0
Fix graphrag callback (#1806)
### What problem does this PR solve?

#1800 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-08-05 14:44:54 +08:00
Kevin Hu
152072f900
Add graphrag (#1793)
### What problem does this PR solve?

#1594

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-08-02 18:51:14 +08:00
黄腾
4ba1ba973a
fix jina module not found bug (#1779)
### What problem does this PR solve?

fix jina module not found bug

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-08-01 19:52:56 +08:00
黄腾
ccbd4365be
refactor stepfun cv model (#1751)
### What problem does this PR solve?

refactor stepfun cv model

### Type of change

- [x] Refactoring

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-31 15:30:47 +08:00
jianyongli
9169643157
add step-1v-8k cv model (#1686)
### What problem does this PR solve?


### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: lijianyong <lijianyong@stepfun.com>
2024-07-30 16:57:27 +08:00
黄腾
5cff780ec4
lower openai version in requirements.txt (#1747)
### What problem does this PR solve?

lower openai version in requirements.txt

### Type of change

- [x] Refactoring

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-30 16:55:59 +08:00
Wang Baoling
74ebc497c1
fix: ERROR: 'CompletionUsage' object has no attribute 'get' (#1736)
### What problem does this PR solve?

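
The error in this commit's title typically appears when code calls `.get(...)` on a usage object that exposes attributes rather than dictionary keys. A minimal sketch of a helper that tolerates both shapes follows; `UsageStub` and `total_tokens` are hypothetical names for illustration, and the stub only mimics an attribute-style usage object:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class UsageStub:
    """Stand-in for an SDK usage object with attributes, not dict keys."""
    total_tokens: int


def total_tokens(usage: Any) -> int:
    """Return the token count from either a dict or an attribute-style object."""
    if usage is None:
        return 0
    if isinstance(usage, dict):
        return int(usage.get("total_tokens") or 0)
    return int(getattr(usage, "total_tokens", 0) or 0)
```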

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-07-30 15:12:16 +08:00
黄腾
ff8702f7de
add support for LocalLLM (#1744)
### What problem does this PR solve?

add support for LocalLLM

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-30 14:07:00 +08:00
Yuhao Tsui
a973b9e01f
Fix: Embedding err when docx contains unsupported images (#1720)
### What problem does this PR solve?

Fix embedding failure when a docx document contains unsupported
images.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-07-29 19:38:47 +08:00
黄腾
29f7f8b81e
fix MiniMax chat bug (#1733)
### What problem does this PR solve?

#1717   fix MiniMax chat bug

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-29 19:35:16 +08:00
H
0cb588f7bf
Fix docx parser line bug (#1715)
### What problem does this PR solve?
#1704 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-07-29 10:06:02 +08:00
黄腾
61096596bc
fix OpenAI llm return bug (#1728)
### What problem does this PR solve?

fix OpenAI llm return bug

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-07-29 09:21:31 +08:00
H
79c873344b
Fix docs parser (#1714)
### What problem does this PR solve?

#1711 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-07-26 10:52:56 +08:00
Kevin Hu
c92d334b29
fix regex bug (#1703)
### What problem does this PR solve?

#1689

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-07-25 14:30:58 +08:00
江不江
bc50f68127
fix embedding_model (#1698)
### What problem does this PR solve?
fix embedding_model #1692
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Signed-off-by: seaver <zhudan187@qq.com>
2024-07-25 11:43:43 +08:00
黄腾
e67bfca552
refactor some llm api using openai api format (#1692)
### What problem does this PR solve?

refactor some llm api using openai api format

### Type of change

- [x] Refactoring

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-25 10:23:35 +08:00
Kung Quang
32d5885b68
Fix api reference empty bug (#1655)
### What problem does this PR solve?

fix api reference empty bug
```
for chunk_i in answer['reference'].get('chunks',[]):
                   ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'list' object has no attribute 'get'
```
```
return np.array([d["relevance_score"] for d in res["results"]]), res["meta"]["tokens"]["input_tokens"]+res["meta"]["tokens"]["output_tokens"]
                                                   ~~~^^^^^^^^^^^
KeyError: 'results'
```
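
Both tracebacks come from assuming a fixed response shape. A minimal sketch of defensive access, assuming `reference` may be either a dict or an (empty) list and that a rerank response may lack `results`; the function names are hypothetical, and treating a list-valued `reference` as "no chunks" is an assumption:

```python
from typing import Any, Dict, List


def reference_chunks(answer: Dict[str, Any]) -> List[Any]:
    """Return the chunk list, or [] when 'reference' is missing or not a dict."""
    ref = answer.get("reference")
    if isinstance(ref, dict):
        return ref.get("chunks") or []
    return []


def rerank_scores(res: Dict[str, Any]) -> List[float]:
    """Return relevance scores, or [] when the response has no 'results'."""
    return [d["relevance_score"] for d in res.get("results") or []]
```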
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-07-24 18:02:22 +08:00
黄腾
d96348eb22
add support for LM Studio (#1663)
### What problem does this PR solve?

#1602 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-24 12:46:43 +08:00
cHz
4b195cc14c
fix: Misspelled Variable Name (#1662)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-07-24 11:14:46 +08:00
jianyongli
60356b52c6
Feat stepfun (#1659)
### What problem does this PR solve?

#1661
#1660

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: lijianyong <lijianyong@stepfun.com>
2024-07-24 10:49:37 +08:00
江不江
d2b70e73dd
fix redis no such key (#1647)
### What problem does this PR solve?
fix Redis no such key #1614

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Signed-off-by: seaver <zhudan187@qq.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2024-07-23 14:00:31 +08:00
黄腾
b4a281eca1
add support for NVIDIA llm (#1645)
### What problem does this PR solve?

add support for NVIDIA llm
### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-23 10:43:09 +08:00
H
ac7a0d4fbf
Add ParserType Audio (#1637)
### What problem does this PR solve?

#1514 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-07-22 19:17:30 +08:00
H
29a7b7a040
Add sequence2txt model.py (#1633)
### What problem does this PR solve?

#1514 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-07-22 14:52:08 +08:00
H
58df013722
Chat Use CVmodel (#1607)
### What problem does this PR solve?

#1230 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2024-07-19 18:36:34 +08:00
黄腾
347cb61f26
add support for StepFun (#1611)
### What problem does this PR solve?

#1561 

### Type of change
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-19 16:26:12 +08:00
黄腾
3fcdba1683
add support for LocalAI (#1608)
### What problem does this PR solve?

#762 

### Type of change
- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-19 15:50:28 +08:00
Kevin Hu
be6d5b76c3
fix embedding model for Azure (#1601)
### What problem does this PR solve?

#1599

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-07-19 09:22:59 +08:00
黄腾
9ae57eb370
fix MiniMax api error (#1567)
### What problem does this PR solve?

#1353 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-17 15:32:51 +08:00
黄腾
75086f41a9
load llm information from a json file and add support for OpenRouter (#1533)
### What problem does this PR solve?

#1467 

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Zhedong Cen <cenzhedong2@126.com>
2024-07-16 15:19:43 +08:00
zhuhao
3657b1f2a2
fix the tokens error that occurred when adding the xinference model (#1527)
### What problem does this PR solve?

fix the tokens error that occurred when adding the xinference model
#1522 

root@pc-gpu-86-41:~# curl -X 'POST' 'http://127.0.0.1:9997/v1/rerank' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{
"model": "bge-reranker-v2-m3",
"query": "A man is eating pasta.",
"return_documents":"true",
"return_len":"true",
"documents": [
"A man is eating food.",
"A man is eating a piece of bread.",
"The girl is carrying a baby.",
"A man is riding a horse.",
"A woman is playing violin."
]
}'

{"id":"610a8724-3e96-11ef-81ce-08bfb886c012","results":[{"index":0,"relevance_score":0.999574601650238,"document":{"text":"A man is eating food."}},{"index":1,"relevance_score":0.07814773917198181,"document":{"text":"A man is eating a piece of bread."}},{"index":3,"relevance_score":0.000017700713215162978,"document":{"text":"A man is riding a horse."}},{"index":2,"relevance_score":0.0000163753629749408,"document":{"text":"The girl is carrying a baby."}},{"index":4,"relevance_score":0.00001631895975151565,"document":{"text":"A woman is playing violin."}}],"meta":{"api_version":null,"billed_units":null,"tokens":{"input_tokens":38,"output_tokens":38},"warnings":null}}
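
The response above lists results sorted by relevance, each carrying the `index` of its source document, and reports token counts under `meta.tokens`. A minimal sketch of reading such a response follows, assuming `meta` or `tokens` may be null when lengths are not requested. The function name `parse_rerank` is hypothetical, and the sample is a trimmed copy of the response shown above:

```python
from typing import Any, Dict, List, Tuple


def parse_rerank(res: Dict[str, Any], n_docs: int) -> Tuple[List[float], int]:
    """Return scores in original document order plus the total token count."""
    scores = [0.0] * n_docs
    for item in res.get("results") or []:
        # Place each score back at the position of its source document.
        scores[item["index"]] = item["relevance_score"]
    tokens = (res.get("meta") or {}).get("tokens") or {}
    total = (tokens.get("input_tokens") or 0) + (tokens.get("output_tokens") or 0)
    return scores, total


# Trimmed copy of the response above (document text omitted).
sample = {
    "results": [
        {"index": 0, "relevance_score": 0.999574601650238},
        {"index": 1, "relevance_score": 0.07814773917198181},
        {"index": 3, "relevance_score": 0.000017700713215162978},
        {"index": 2, "relevance_score": 0.0000163753629749408},
        {"index": 4, "relevance_score": 0.00001631895975151565},
    ],
    "meta": {"tokens": {"input_tokens": 38, "output_tokens": 38}},
}

scores, total = parse_rerank(sample, n_docs=5)
```

Falling back to zero when the token fields are absent keeps the caller from failing on responses that omit them.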

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2024-07-16 15:08:51 +08:00
Kevin Hu
607de74ace
fix minimax bug (#1528)
### What problem does this PR solve?

#1353 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2024-07-16 10:55:33 +08:00
Kevin Hu
26de9adb41
refine db connection (#1494)
### What problem does this PR solve?



### Type of change


- [x] Refactoring
2024-07-12 18:11:34 +08:00