2740 Commits

Author SHA1 Message Date
liu an
2a5ad74ac6
Test: Update test cases for #6800 (#6804)
### What problem does this PR solve?

update test case for PR #6800 issue #6539

### Type of change

- [x] update test cases
2025-04-03 21:22:41 +08:00
Kevin Hu
2caf15b24c
Refa: trivial. (#6802)
### What problem does this PR solve?


### Type of change


- [x] Refactoring
2025-04-03 19:01:24 +08:00
balibabu
f49588756e
Feat: Load the dialog page, prohibit calling the dialog/get interface #6798 (#6799)
### What problem does this PR solve?

Feat: Load the dialog page, prohibit calling the dialog/get interface
#6798

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-03 18:04:40 +08:00
liu an
57e760883e
Fix: update chunk, empty question issue. (#6800)
### What problem does this PR solve?

fix issue #6539, refer to pr #6405

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-03 18:04:19 +08:00
liu an
b213e88cca
Test: Added test cases for List Chat Assistants HTTP API (#6792)
### What problem does this PR solve?

cover [list chat
assistant](https://ragflow.io/docs/v0.17.2/http_api_reference#list-chat-assistants)
endpoints

### Type of change

- [x] add test cases
2025-04-03 17:22:23 +08:00
zunceng
e8f46c9207
Fix: missing redis pvc storageclass in helm (#6788)
fix redis pvc in helm deployment

### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-03 16:55:47 +08:00
so95
cded812b97
Feat: add OpenAI compatible API for agent (#6329)
### What problem does this PR solve?
add openai agent

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-04-03 16:51:37 +08:00
balibabu
2acb02366e
Feat: Clarify the use of OpenAI-API-compatible #6782 (#6783)
### What problem does this PR solve?

Feat: Clarify the use of OpenAI-API-compatible #6782

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-03 11:38:21 +08:00
Kevin Hu
9ecc78feeb
Refa: copywriting refinement. (#6779)
### What problem does this PR solve?

Close #6762

### Type of change

- [x] Refactoring
2025-04-03 11:38:02 +08:00
Zhichang Yu
fdc410e743
Fix set_graph on non-existing edge (#6777)
### What problem does this PR solve?

Fix set_graph on non-existing edge

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-03 11:09:04 +08:00
Kevin Hu
5b5558300a
Feat: add gemini-2.5-pro-exp-03-25 (#6774)
### What problem does this PR solve?

#6733

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-03 10:48:58 +08:00
liu an
b5918e7158
Docs: Fix for issue #6713 (#6775)
### What problem does this PR solve?

Update for issue #6713

### Type of change

- [x] Documentation Update
2025-04-03 10:19:58 +08:00
liu an
58f8026632
Test: Update test cases for PR #6643 (#6766)
### What problem does this PR solve?

Update test cases for PR #6643 issue #6607

### Type of change

- [x] update test cases
2025-04-03 10:10:40 +08:00
liwenju0
a73fbc61ff
Fix: Handle the case of deleting empty blocks. Update the relevant message (#6643)
…gic to return the correct deletion message. Add handling for empty
arrays to ensure no errors occur during the deletion operation. Update
the test cases to verify the new logic.

### What problem does this PR solve?

Fixes this bug: https://github.com/infiniflow/ragflow/issues/6607

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-04-02 19:20:17 +08:00
liu an
0d1c5fdd2f
Test: Added test cases for Create Chat Assistant HTTP API (#6763)
### What problem does this PR solve?

cover [create chat
assistant](https://ragflow.io/docs/v0.17.2/http_api_reference#create-chat-assistant)
endpoints

### Type of change

- [x] add test cases
2025-04-02 18:49:59 +08:00
liu an
6c77ef5a5e
Docs(api): align default values in create chat assistant HTTP API docs with implementation (#6764)
### What problem does this PR solve?

Align the default values in the create chat assistant HTTP API docs with
the implementation:

- `llm.presence_penalty`: 0.2 -> 0.4
- `prompt.top_n`: 8 -> 6


### Type of change

- [x] Documentation Update
2025-04-02 18:48:31 +08:00
Zhichang Yu
e7a2a4b7ff
Log llm response on exception (#6750)
### What problem does this PR solve?

Log llm response on exception

### Type of change

- [x] Refactoring
2025-04-02 17:10:57 +08:00
balibabu
724a36fcdb
Fix: Issue with Markdown Code Blocks Breaking Frontend Layout #5789 (#6758)
### What problem does this PR solve?

Fix: Issue with Markdown Code Blocks Breaking Frontend Layout #5789

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-02 16:28:55 +08:00
liwenju0
9ce6521582
Fix: Change the field name of the document ID from "documents" to "do… (#6753)
…cument_ids" to maintain consistency.

### What problem does this PR solve?

Close #6752

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-04-02 15:52:52 +08:00
balibabu
160bf4ccb3
Fix: The file upload prompt indicates "No authorization." #6516 (#6756)
### What problem does this PR solve?

Fix: The file upload prompt indicates "No authorization." #6516

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-02 15:52:35 +08:00
balibabu
aa25d09b0c
Fix: Using the Enter key does not send a complete message #6754 (#6755)
### What problem does this PR solve?

Fix: Using the Enter key does not send a complete message #6754

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-02 15:02:16 +08:00
writinwaters
2471a6e115
Updated max_tokens descriptions (#6751)
### What problem does this PR solve?

#6721 

### Type of change


- [x] Documentation Update
2025-04-02 13:56:55 +08:00
balibabu
fc02929946
Feat: Support deleting knowledge graph #6747 (#6748)
### What problem does this PR solve?

Feat: Support deleting knowledge graph #6747

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-02 11:20:37 +08:00
liu an
3ae1e9e3c4
Test: Skip test case for PR 6443 (#6724)
### What problem does this PR solve?

Skip test case for PR #6443

### Type of change

- [x] update test cases
2025-04-02 10:41:01 +08:00
balibabu
117f18240d
Feat: Add a notification logic to the team member invite feature #6610 (#6729)
### What problem does this PR solve?
Feat: Add a notification logic to the team member invite feature #6610

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-02 09:15:13 +08:00
writinwaters
31296ad70f
Miscellaneous doc updates and refactored team management doc. (#6730)
### What problem does this PR solve?

#5576, #6672

### Type of change


- [x] Documentation and UI Update
2025-04-01 19:05:30 +08:00
balibabu
132eae9d5b
Feat: Interrupt streaming #6515 (#6723)
### What problem does this PR solve?

Feat: Interrupt streaming #6515
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-01 17:26:54 +08:00
kaiyuan Zhang
ead5f7aba9
Fix infinite recursion in RagTokenizer when processing repetitive characters (#6109)
### What problem does this PR solve?
Fix #6085.
RagTokenizer's `dfs_()` function falls into infinite recursion when
processing text with repetitive Chinese characters (e.g.,
"一一一一一十一十一十一..." or "一一一一一一十十十十十十十二十二十二..."), causing memory leaks.

Implemented three optimizations to the `dfs_()` function:
1. Added memoization with a `_memo` dictionary to cache computed results.
2. Added recursion depth limiting with a `_depth` parameter (max 10 levels).
3. Implemented special handling for repetitive character sequences.

### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
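
The three optimizations listed in this PR can be illustrated with a minimal sketch. This is not the actual `dfs_()` implementation: the function `segment`, the `vocab` set, the `max_depth` default and the run-length threshold of 5 are all hypothetical names and values chosen for illustration.

```python
def segment(text, vocab, max_depth=10, run_threshold=5):
    """Split `text` into tokens with a memoized, depth-limited DFS (illustrative only)."""
    memo = {}  # start index -> segmentation already computed for text[start:]

    def dfs(start, depth):
        if start == len(text):
            return []
        if start in memo:  # 1. memoization: reuse cached results
            return memo[start]
        if depth > max_depth:  # 2. depth cap: stop branching, emit single characters
            return list(text[start:])

        # 3. repetitive run: consume a long run of one character in a single step
        end = start + 1
        while end < len(text) and text[end] == text[start]:
            end += 1
        if end - start >= run_threshold:
            best = [text[start:end]] + dfs(end, depth + 1)
        else:
            best = None
            for stop in range(start + 1, len(text) + 1):
                piece = text[start:stop]
                # Single characters are always allowed; longer pieces must be in vocab.
                if stop - start > 1 and piece not in vocab:
                    continue
                candidate = [piece] + dfs(stop, depth + 1)
                if best is None or len(candidate) < len(best):
                    best = candidate
        memo[start] = best
        return best

    return dfs(0, 0)


tokens = segment("一一一一一十一十一", {"十一"})
```

Because of the depth cap, the result is always a valid segmentation of the input but not necessarily the one with the fewest tokens; the sketch trades accuracy for bounded recursion.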

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-04-01 13:59:52 +08:00
liu an
58e6e7b668
Test: Refactor test fixtures and test cases (#6709)
### What problem does this PR solve?

 Refactor test fixtures and test cases

### Type of change

- [ ] Refactoring test cases
2025-04-01 13:39:07 +08:00
Yue-Lyu123
20b8ccd1e9
Hotfix ece5903 (#6705)
I'm really sorry, I found that in `graphrag/general/extractor.py`, under
`def __call__`, the line `change.removed_nodes.extend(nodes[1:])` causes
`AttributeError: 'set' object has no attribute 'extend'`. Could you please
merge the branch e666528 again? I made some modifications.
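
A minimal reproduction of this error, assuming `removed_nodes` is a plain Python `set` as the error message suggests (the variable names and sample values here are illustrative, and the actual change in the hotfix is not shown in this message):

```python
removed_nodes = set()
nodes = ["HELLO", "Hi", "How are you"]

try:
    removed_nodes.extend(nodes[1:])  # sets have no extend(); that is a list method
except AttributeError as exc:
    error = str(exc)

# The set equivalent of list.extend() is set.update().
removed_nodes.update(nodes[1:])
```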
2025-04-01 12:06:28 +08:00
balibabu
d0dca16fee
Feat: Allows users to search for models in the model selection drop-down box #3221 (#6708)
### What problem does this PR solve?

Feat: Allows users to search for models in the model selection drop-down
box #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-04-01 11:53:48 +08:00
Kevin Hu
fc21dd0a4a
Feat: add qwq-plus-latest (#6702)
### What problem does this PR solve?

#6697

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-04-01 11:06:03 +08:00
Kevin Hu
61c0dfab70
Fix: Email error. (#6701)
### What problem does this PR solve?

#6695

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-04-01 10:37:04 +08:00
Yue-Lyu123
67330833af
fix: correct [AttributeError: 'set' object has no attribute 'nodes' T… (#6699)
### Related Issue: 
https://github.com/infiniflow/ragflow/issues/6653 

### Environment:
Using nightly version [ece5903]

Elasticsearch database

Thanks for the review! My fault! I realize my initial testing did not
pass.

In `graphrag/entity_resolution.py`,
`sub_connect_graph` is a set like `{'HELLO', 'Hi', 'How are you'}`.
Neither `.nodes` nor `.nodes()` works on it; **both still raise
`AttributeError: 'set' object has no attribute 'nodes'`**.

In `graphrag/general/extractor.py`,
the `list.extend()` method operates in place: it modifies the original
list and returns `None` rather than the modified list.
Neither
`sorted(set(node0_attrs[attr].extend(node1_attrs.get(attr, []))))` nor
`sorted(set(node0_attrs[attr].extend(node1_attrs[attr])))` works;
**both still raise `TypeError: 'NoneType' object is not iterable`**.
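
One way to avoid the `None` return value is to build the merged list with `+` before deduplicating. This is a sketch only; the helper name `merge_attr` and the sample attribute values are hypothetical, and the fix actually merged may differ.

```python
def merge_attr(node0_attrs, node1_attrs, attr):
    """Return the sorted, de-duplicated union of one list attribute from two nodes."""
    # `a + b` returns a new list, unlike `a.extend(b)`, which returns None.
    return sorted(set(node0_attrs.get(attr, []) + node1_attrs.get(attr, [])))


node0 = {"source_id": ["d2", "d1"]}
node1 = {"source_id": ["d3", "d1"]}
merged = merge_attr(node0, node1, "source_id")

# A set has no `.nodes`; iterate it directly instead.
sub_connect_graph = {"HELLO", "Hi", "How are you"}
node_list = sorted(sub_connect_graph)
```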
### Type of change

- [ ] Bug Fix AttributeError: graphrag/entity_resolution.py 
- [ ] Bug Fix TypeError: graphrag/general/extractor.py
2025-04-01 09:38:21 +08:00
Yue-Lyu123
ece59034f7
fix: Resolve KnowledgeGraph entity resolution errors (#6653) (#6691)
### Related Issue: #6653
### Environment:

Using nightly version

Elasticsearch database

### Bug Description:
When clicking the "Entity Resolution" button in KnowledgeGraph,
encountered the following errors:

graphrag/entity_resolution.py

```
list(sub_connect_graph.nodes) AttributeError
```

graphrag/general/extractor.py
```
node0_attrs[attr] = sorted(set(node0_attrs[attr].extend(node1_attrs[attr])))
TypeError: 'NoneType' object is not iterable
```
```
for attr in ["keywords", "source_id"]:
    # KeyError: I think the "keywords" attribute is on edges, not nodes
```
graphrag/utils.py
```
settings.docStoreConn.delete()  # Sync function called as async
```
### Changes Made:

Fixed AttributeError in entity_resolution.py by properly handling graph
nodes

Fixed TypeError and KeyError in extractor.py by separating the operations

Corrected async/sync mismatch in document deletion call
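
The async/sync mismatch can be sketched as follows. This example uses the standard-library `asyncio` purely for illustration; the project's actual async framework, the real signature of `delete()`, and the class `FakeDocStore` are assumptions, not taken from the commit.

```python
import asyncio


class FakeDocStore:
    """Stand-in for a document store whose delete() is a plain synchronous method."""

    def __init__(self):
        self.deleted = []

    def delete(self, condition):
        self.deleted.append(condition)
        return 1  # number of deleted records


async def remove_chunks(store, condition):
    # `await store.delete(condition)` would fail: the int it returns is not awaitable.
    # Running the blocking call in a worker thread keeps the event loop responsive.
    return await asyncio.to_thread(store.delete, condition)


store = FakeDocStore()
count = asyncio.run(remove_chunks(store, {"kb_id": "kb1"}))
```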
2025-03-31 22:31:35 +08:00
Kevin Hu
0a42e5777e
Refa: docker/.env comment refinement. (#6689)
### What problem does this PR solve?


### Type of change

- [x] Refactoring
2025-03-31 18:26:20 +08:00
RedBookOfMemory
e2b66628f4
Feat: extend S3 storage compatibility and add knowledge base ID prefix (#6355)
### What problem does this PR solve?

- Added support for S3-compatible protocols.
- Enabled the use of knowledge base ID as a file prefix when storing
files in S3.
- Updated docker/README.md to include detailed S3 and OSS configuration
instructions.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-31 16:09:43 +08:00
Alex Chen
46b5e32cd7
Feat: support vision llm for gpustack (#6636)
### What problem does this PR solve?
https://github.com/infiniflow/ragflow/issues/6138

This PR is going to support vision llm for gpustack, modify url path
from `/v1-openai` to `/v1`

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-31 15:33:52 +08:00
Kevin Hu
7d9dd1e5d3
Refa: remove default built-in rerank model. (#6682)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
- [x] Performance Improvement
2025-03-31 15:33:19 +08:00
so95
1985ff7918
add type canvas (#6680)
add type canvas
### Type of change
- [x] Refactoring
2025-03-31 14:46:29 +08:00
Kevin Hu
60b9c027c8
Refa: add meta data to retrieval. (#6676)
### What problem does this PR solve?

#6619
### Type of change


- [x] Performance Improvement
2025-03-31 11:45:56 +08:00
writinwaters
2793c8e4fe
Added a guide on setting page rank. (#6645)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update

---------

Co-authored-by: balibabu <cike8899@users.noreply.github.com>
2025-03-31 11:44:18 +08:00
Yingfeng
805a8f1f47
Update broken discord (#6678)
### Type of change

- [x] Documentation Update
2025-03-31 11:29:34 +08:00
Song Fuchang
d4a3e9a7cc
Fix table migration on non-exist-yet indexed columns. (#6666)
### What problem does this PR solve?

Fix #6334

Hello, I encountered the same problem as in #6334. In
`api/db/db_models.py`, `init_database_tables` calls `obj.create_table()`
unconditionally, before `migrate_db()`. Specifically, the `permission`
field of the `user_canvas` table has `index=True`, which causes `peewee`
to issue SQL that tries to create the index while the field does not yet
exist (the `user_canvas` table already exists), so
`psycopg2.errors.UndefinedColumn: column "permission" does not exist`
occurred.

I've added a check in the code to call `create_table()` only when
the table does not exist, and to delegate the migration process to
`migrate_db()`.

Then another problem occurs: `migrate_db()` actually does nothing
because it fails on the first migration! `playhouse` blindly issues
DDL statements without guards such as `IF NOT EXISTS`, so it fails, and
even if the exception is swallowed with `pass`, the transaction is still
rolled back. So I removed the transaction in `migrate_db()` to make it
work.
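
The two-part approach described above can be sketched with the standard-library `sqlite3` module. This is not the project's `peewee`/`playhouse` code: the helpers `table_exists`, `init_tables` and `migrate`, and the column types, are hypothetical. In PostgreSQL a failed statement aborts the enclosing transaction, which is why each step here runs and commits on its own.

```python
import sqlite3


def table_exists(conn, name):
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None


def init_tables(conn):
    # Create the table and its index only when the table is missing;
    # existing tables are left to migrate() so no index is built on an absent column.
    if not table_exists(conn, "user_canvas"):
        conn.execute("CREATE TABLE user_canvas (id TEXT PRIMARY KEY, permission TEXT)")
        conn.execute("CREATE INDEX user_canvas_permission ON user_canvas (permission)")
        conn.commit()


def migrate(conn):
    steps = [
        "ALTER TABLE user_canvas ADD COLUMN permission TEXT",
        "CREATE INDEX IF NOT EXISTS user_canvas_permission ON user_canvas (permission)",
    ]
    for sql in steps:
        try:
            conn.execute(sql)
            conn.commit()  # each step stands alone, not inside one big transaction
        except sqlite3.OperationalError:
            pass  # e.g. "duplicate column name": this step was already applied


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_canvas (id TEXT PRIMARY KEY)")  # simulate the old schema
init_tables(conn)  # table exists, so nothing is created here
migrate(conn)      # adds the column, then the index
migrate(conn)      # running again is harmless
columns = [row[1] for row in conn.execute("PRAGMA table_info(user_canvas)")]
```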

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-31 11:27:20 +08:00
Song Fuchang
ad4e59edb2
Don't split and strip input in retrieval component. (#6662)
### What problem does this PR solve?

Actually fix #6241 

Hello, I ran into the same problem as #6241. When I test my agent
flow in the web UI using the `Run` button with a file input, the retrieval
component always gives an empty output.

In the code I found that:

`web/src/pages/flow/debug-content/index.tsx`:

```tsx
const onOk = useCallback(async () => {
    const values = await form.validateFields();
    const nextValues = Object.entries(values).map(([key, value]) => {
      const item = parameters[Number(key)];
      let nextValue = value;
      if (Array.isArray(value)) {
        nextValue = ``;

        value.forEach((x) => {
          nextValue +=
            x?.originFileObj instanceof File
              ? `${x.name}\n${x.response?.data}\n----\n`    // Here, the file content always ends in '\n'
              : `${x.url}\n${x.result}\n----\n`;
        });
      }
      return { ...item, value: nextValue };
    });

    ok(nextValues);
  }, [form, ok, parameters]);
```

while in the `agent/component/retrieval.py`:

```python
def _run(self, history, **kwargs):
    query = self.get_input()
    query = str(query["content"][0]) if "content" in query else ""
    lines = query.split('\n')           # inputs are split to ['xxx', 'yyy', '----', '']
    query = lines[-1] if lines else ""  # Here we always get '', thus no result
    kbs = KnowledgebaseService.get_by_ids(self._param.kb_ids)
    if not kbs:
        return Retrieval.be_output("")
```

so the code never gets the correct result.

I'm not sure why the input needs such a split here, so I just removed
the splitting, and it works well on my side.
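
The effect can be reproduced in isolation. The sketch below uses a made-up payload in the `name\ncontent\n----\n` shape produced by the front end; `old_query` and `new_query` are hypothetical helper names, not functions from the codebase.

```python
def old_query(content):
    # Previous behaviour: keep only the last line after splitting on newlines.
    lines = content.split("\n")
    return lines[-1] if lines else ""


def new_query(content):
    # Behaviour after removing the split: pass the input through unchanged.
    return content


payload = "question.txt\nWhat is the weather today?\n----\n"
parts = payload.split("\n")
before = old_query(payload)
after = new_query(payload)
```

Because the payload always ends with a newline, the last element of `parts` is the empty string, so `before` is empty while `after` still carries the file content.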

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-31 11:26:49 +08:00
liu an
aca4cf4369
Test: Added test cases for Retrieval Chunks HTTP API (#6649)
### What problem does this PR solve?

cover [retrieval
chunk](https://ragflow.io/docs/v0.17.2/http_api_reference#retrieve-chunks)
endpoints

### Type of change

- [x]  add test cases
2025-03-31 10:05:35 +08:00
Song Fuchang
9aa047257a
Fix agent completion requiring calling twice with parameters in begin component (#6659)
### What problem does this PR solve?

Fix #5418

Actually, the fix #4329 also works for agent flows with parameters, so
this PR just relaxes the `else` branch of that. With this PR, it works
fine on my side, may need more testing to make sure this does not break
something.

I suspect the real problem is hidden deeper in the code that handles
conversation and canvas execution. After a few hours of debugging, the
only difference I can see between a `begin` component with parameters
and one without is the `history` field of the canvas data. When the
`begin` component contains some parameters, the debug log shows:

```
2025-03-29 19:50:38,521 DEBUG    356590 {
            "component_name": "Begin",
            "params": {"output_var_name": "output", "message_history_window_size": 22, "query": [{"type": "fileUrls", "key": "fileUrls", "name": "files", "optional": true, "value": "问题.txt\n今天天气怎么样"}], "inputs": [], "debug_inputs": [], "prologue": "你好! 我是你的助理,有什么可以帮到你的吗?", "output": null},
            "output": null,
            "inputs": []
        }, history: [["user", "请回答我上传文件中的问题。"]], kwargs: {"stream": false}
2025-03-29 19:50:38,523 DEBUG    356590 {
            "component_name": "Answer",
            "params": {"output_var_name": "output", "message_history_window_size": 22, "query": [], "inputs": [], "debug_inputs": [], "post_answers": [], "output": null},
            "output": null,
            "inputs": []
        }, history: [["user", "请回答我上传文件中的问题。"]], kwargs: {"stream": false}
```

Then it does not go further along the flow.

When the `begin` component does not contain any parameter, the debug log
shows:

```
2025-03-29 19:41:13,518 DEBUG    353596 {
            "component_name": "Begin",
            "params": {"output_var_name": "output", "message_history_window_size": 22, "query": [], "inputs": [], "debug_inputs": [], "prologue": "你好! 我是你的助理,有什么可以帮到你的吗?", "output": null},
            "output": null,
            "inputs": []
        }, history: [], kwargs: {"stream": false}
2025-03-29 19:41:13,520 DEBUG    353596 {
            "component_name": "Answer",
            "params": {"output_var_name": "output", "message_history_window_size": 22, "query": [], "inputs": [], "debug_inputs": [], "post_answers": [], "output": null},
            "output": null,
            "inputs": []
        }, history: [], kwargs: {"stream": false}
2025-03-29 19:41:13,556 INFO     353596 127.0.0.1 - - [29/Mar/2025 19:41:13] "POST /api/v1/agents/fee6886a0c6f11f09b48eb8798e9aa9b/sessions?user_id=123 HTTP/1.1" 200 -
2025-03-29 19:41:21,115 DEBUG    353596 Canvas.prepare2run: Retrieval:LateGuestsNotice
2025-03-29 19:41:21,116 DEBUG    353596 {
            "component_name": "Retrieval",
            "params": {"output_var_name": "output", "message_history_window_size": 22, "query": [], "inputs": [], "debug_inputs": [], "similarity_threshold": 0.2, "keywords_similarity_weight": 0.3, "top_n": 8, "top_k": 1024, "kb_ids": ["9aca3c700c5911f0811caf35658b9385"], "rerank_id": "", "empty_response": "", "tavily_api_key": "", "use_kg": false, "output": null},
            "output": null,
            "inputs": []
        }, history: [["user", "请回答我上传文件中的问题。"]], kwargs: {"stream": false}
```

It correctly proceeds along the flow and generates the correct answer.

You can see the difference: when the `begin` component has any
parameter, the `history` field is filled from the beginning, while it is
just `[]` if the `begin` component has no parameter.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-31 09:57:56 +08:00
Zhichang Yu
65a8cd1772
Fix knowledge_graph_kwd on infinity. Close #6476 and #6624 (#6651)
### What problem does this PR solve?

Fix knowledge_graph_kwd on infinity. Close #6476 and #6624

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-28 22:05:40 +08:00
Kevin Hu
563a84beaf
Docs: fix retrieval docs. (#6633)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-03-28 16:03:37 +08:00
Zhichang Yu
d32a35d8fd
Fix entity_types. Close #6287 and #6608 (#6632)
### What problem does this PR solve?

Fix entity_types. Close #6287 and #6608

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-28 15:00:24 +08:00