3031 Commits

Author SHA1 Message Date
liu an
590b9dabab Docs: update for v0.19.0 (#7823)
### What problem does this PR solve?

update for v0.19.0

### Type of change

- [x] Documentation Update
2025-05-23 18:25:47 +08:00
writinwaters
c283ea57fd Docs: Added v0.19.0 release notes (#7818)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-05-23 18:25:33 +08:00
Yongteng Lei
50ff16e7a4 Feat: add claude4 models (#7809)
### What problem does this PR solve?

Add claude4 models.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 18:25:13 +08:00
Yongteng Lei
453287b06b Feat: more robust fallbacks for citations (#7801)
### What problem does this PR solve?

Add more robust fallbacks for citations

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 18:24:55 +08:00
liu an
e166f132b3 Feat: change default models (#7777)
### What problem does this PR solve?

Change the default models to built-in models.
https://github.com/infiniflow/ragflow/issues/7774

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 18:21:25 +08:00
Yongteng Lei
42f4d4dbc8 Fix: wrong type hint (#7738)
### What problem does this PR solve?

Wrong type hint. #7729.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-23 18:21:06 +08:00
Yongteng Lei
7cb8368e0f Feat: sandbox enhancement (#7739)
### What problem does this PR solve?

1. Add sandbox options for max memory and timeout.
2. Malicious code detection for Python only.
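
As an illustration of item 2, the sketch below shows one common way to flag suspicious Python code before running it: walk the syntax tree and report imports or calls that appear on a denylist. This is only a minimal sketch under stated assumptions. The denylists and the `find_suspicious` helper are hypothetical and are not taken from this PR, whose actual detection rules are not shown here.

```python
import ast

# Hypothetical denylists for illustration only; not the PR's actual rules.
BLOCKED_MODULES = {"os", "subprocess", "socket"}
BLOCKED_CALLS = {"eval", "exec", "__import__"}


def find_suspicious(source: str) -> list:
    """Return findings for imports or direct calls that are on a denylist."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [(node.module or "").split(".")[0]]
        else:
            names = []
        findings += [f"import {n}" for n in names if n in BLOCKED_MODULES]
        # Only direct calls by name are checked, e.g. eval(...)
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in BLOCKED_CALLS):
            findings.append(f"call {node.func.id}")
    return findings
```

A static check like this is easy to bypass, so it would normally complement the memory and timeout limits from item 1 rather than replace them.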

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 18:20:51 +08:00
Sol
0d7cfce6e1
Update rag/nlp/query.py (#7816)
### What problem does this PR solve?
Fix a tokenizer issue that resulted in low recall.

![37743d3a495f734aa69f1e173fa77457](https://github.com/user-attachments/assets/1394757e-8fcb-4f87-96af-a92716144884)

![4aba633a17f34269a4e17e84fafb34c4](https://github.com/user-attachments/assets/a1828e32-3e17-4394-a633-ba3f09bd506d)

![image](https://github.com/user-attachments/assets/61308f32-2a4f-44d5-a034-d65bbec554ef)



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-05-23 17:13:37 +08:00
Yongteng Lei
2d7c1368f0
Feat: add code_executor_manager (#7814)
### What problem does this PR solve?

Add code_executor_manager. #4977.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 16:33:38 +08:00
Stephen Hu
db4371c745
Fix: Improve First Chunk Size (#7806)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/7790

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-23 14:30:19 +08:00
balibabu
e6cd799d8a
Feat: Translate the begin operator #3221 (#7811)
### What problem does this PR solve?

Feat: Translate the begin operator #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 14:18:27 +08:00
writinwaters
ab29b58316
Docs: Added instructions on cross-language search (#7812)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-05-23 14:18:14 +08:00
balibabu
3f037c9786
Feat: Reconstruct the QueryTable of BeginForm using shadcn #3221 (#7807)
### What problem does this PR solve?

Feat: Reconstruct the QueryTable of BeginForm using shadcn #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 12:31:05 +08:00
Sol
53b991aa0e
Fix backquotes in text2sql causing execution errors (#7793)
### What problem does this PR solve?
Remove the backquotes from the SQL generated by the LLM so that they do
not cause execution errors.

![image](https://github.com/user-attachments/assets/40d57ef7-b812-402a-b469-5793e466b83d)


![image](https://github.com/user-attachments/assets/d0a9bc17-ff5a-43cb-90cb-b2b3827b00b0)
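
A minimal sketch of the idea, assuming the generated SQL is available as a plain string. The `strip_backquotes` helper is a hypothetical name used for illustration and is not the function changed in this PR.

```python
def strip_backquotes(sql: str) -> str:
    """Remove every backquote character from an LLM-generated SQL string."""
    return sql.replace("`", "").strip()


print(strip_backquotes("SELECT `name` FROM `users`;"))
```

Note that backquotes quote identifiers in MySQL, so stripping them would break a query whose identifiers contain spaces or reserved words.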


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-23 09:49:45 +08:00
balibabu
9e80f39caa
Feat: Synchronize BeginForm's query data to the canvas #3221 (#7798)
### What problem does this PR solve?

Feat: Synchronize BeginForm's query data to the canvas #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-23 09:49:14 +08:00
Hayden Zhou
bdc2b74e8f
Fix baidu request error (#7799)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: xiaohzho <xiaohzho@cisco.com>
2025-05-23 09:48:55 +08:00
writinwaters
1fd92e6bee
Docs: RAGFlow does not support batch metadata setting (#7795)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update
2025-05-22 17:02:23 +08:00
balibabu
02fd381072
Feat: Verify the parameters of the begin operator #3221 (#7794)
### What problem does this PR solve?

Feat: Verify the parameters of the begin operator #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-22 16:58:47 +08:00
balibabu
b6f3a6a68a
Feat: Refactor BeginForm with shadcn #3221 (#7792)
### What problem does this PR solve?

Feat: Refactor BeginForm with shadcn #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-22 15:33:40 +08:00
kunger309
ae70512f5d
Fix: When creating a new assistant, an avatar was uploaded, but when selecting the assistant to start a new chat, the default avatar still appears in the chat window instead of the one uploaded during creation (#7769)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-22 11:50:02 +08:00
Emmanuel Ferdman
d4a123d6dd
Fix: resolve regex library warnings (#7782)
### What problem does this PR solve?
This small PR resolves the regex library warnings shown in Python 3.11:
```python
DeprecationWarning: 'count' is passed as positional argument
```

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Signed-off-by: Emmanuel Ferdman <emmanuelferdman@gmail.com>
2025-05-22 10:06:28 +08:00
Stephen Hu
ce816edb5f
Fix: improve task cancel lag (#7765)
### What problem does this PR solve?

https://github.com/infiniflow/ragflow/issues/7761

However, achieving zero delay may be difficult, because it requires
passing the cancel token to every part of the task.

An alternative is to show the cancellation immediately in the UI and
let the task itself stop later.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-22 09:28:08 +08:00
balibabu
ac2643700b
Feat: Add return value widget to CodeForm #3221 (#7776)
### What problem does this PR solve?
Feat: Add return value widget  to CodeForm #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-21 19:35:27 +08:00
balibabu
558b252c5a
Feat: Switching the programming language of the code operator will switch the corresponding language template #3221 (#7770)
### What problem does this PR solve?

Feat: Switching the programming language of the code operator will
switch the corresponding language template #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-21 18:22:06 +08:00
balibabu
754a5e1cee
Feat: Fixed the issue where the page would refresh continuously when opening the sheet on the right side of the canvas #3221 (#7756)
### What problem does this PR solve?

Feat: Fixed the issue where the page would refresh continuously when
opening the sheet on the right side of the canvas #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-21 17:03:16 +08:00
Stephen Hu
e3e7c7ddaa
Feat: delete useless image blobs when task executor meet edge cases (#7727)
### What problem does this PR solve?

delete useless image blobs when the task executor meets edge cases

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-21 10:22:30 +08:00
writinwaters
76b278af8e
0519 pdfparser (#7747)
### What problem does this PR solve?


### Type of change


- [x] Documentation Update
2025-05-20 19:41:55 +08:00
balibabu
1c6320828c
Feat: Rename agent #3221 (#7740)
### What problem does this PR solve?

Feat: Rename agent #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-20 19:13:19 +08:00
balibabu
d72468426e
Feat: Render the agent list page by page #3221 (#7736)
### What problem does this PR solve?

Feat: Render the agent list page by page #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-20 16:03:55 +08:00
balibabu
796f4032b8
Feat: Migrate the code operator to the new agent. #3221 (#7731)
### What problem does this PR solve?

Feat: Migrate the code operator to the new agent. #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-20 15:30:56 +08:00
balibabu
1ae7b942d9
Feat: The image displayed in the reply message can also be clicked to display the location of the source document where the slice is located #7623 (#7723)
### What problem does this PR solve?

Feat: The image displayed in the reply message can also be clicked to
display the location of the source document where the slice is located
#7623

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-20 10:40:53 +08:00
liu an
fed1221302
Refa: HTTP API list datasets / test cases / docs (#7720)
### What problem does this PR solve?

This PR introduces Pydantic-based validation for the list datasets HTTP
API, improving code clarity and robustness. Key changes include:

- Pydantic validation
- Error handling
- Test updates
- Documentation updates

### Type of change

- [x] Documentation Update
- [x] Refactoring
2025-05-20 09:58:26 +08:00
Chaoxi Weng
6ed81d6774
Feat: Add OAuth state parameter for CSRF protection (#7709)
### What problem does this PR solve?

Add OAuth `state` parameter for CSRF protection:
- Updated `get_authorization_url()` to accept an optional state
parameter
- Generated a unique state value during OAuth login and stored in
session
- Verified state parameter in callback to ensure request legitimacy

This PR follows OAuth 2.0 security best practices by ensuring that the
authorization request originates from the same user who initiated the
flow.
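
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the session is modeled as a plain dict, and `start_login`, `verify_callback`, and the `oauth_state` key are hypothetical names, not the functions touched by this PR.

```python
import hmac
import secrets


def start_login(session: dict, base_url: str) -> str:
    """Create a one-time state value, store it, and build the auth URL."""
    state = secrets.token_urlsafe(32)  # URL-safe, unpredictable
    session["oauth_state"] = state
    return f"{base_url}?response_type=code&state={state}"


def verify_callback(session: dict, returned_state: str) -> bool:
    """Accept the callback only if the state matches the one we issued."""
    expected = session.pop("oauth_state", None)  # single use
    if expected is None or not returned_state:
        return False
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(expected, returned_state)
```

Removing the stored value on the first check means a replayed callback with the same `state` is rejected.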

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-20 09:40:31 +08:00
donblack01
115850945e
Fix: When you create a new API module named xxxa_api, the access route will become xxx instead of xxxa. For example, when I create a new API module named 'data_api', the access route will become 'dat' instead of 'data' (#7325)
### What problem does this PR solve?

Fix: When you create a new API module named xxxa_api, the access route
becomes xxx instead of xxxa. For example, a new API module named
'data_api' gets the access route 'dat' instead of 'data'.

Fix: A new knowledge base was not renamed when a knowledge base with
the same name already existed.
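
One common cause of exactly this symptom in Python is stripping a suffix with `str.rstrip`. Whether that was the root cause here is an assumption, since the changed code is not shown. The snippet reproduces the 'dat' versus 'data' behavior:

```python
module_name = "data_api"

# str.rstrip treats its argument as a SET of characters, so it keeps
# removing trailing "_", "a", "p", "i" and also eats the final "a" of "data".
print(module_name.rstrip("_api"))        # dat

# str.removesuffix (Python 3.9+) removes the literal suffix once.
print(module_name.removesuffix("_api"))  # data
```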

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: tangyu <1@1.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-05-20 09:39:26 +08:00
balibabu
8e87436725
Feat: Modify the Python language template code of the code operator #4977 (#7714)
### What problem does this PR solve?

Feat: Modify the Python language template code of the code operator
#4977
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-19 19:34:43 +08:00
Yongteng Lei
e8e2a95165
Refa: more fallbacks for bad citation format (#7710)
### What problem does this PR solve?

More fallbacks for bad citation format

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Refactoring
2025-05-19 19:34:05 +08:00
Yongteng Lei
b908c33464
Fix: uncaptured image data with position information (#7683)
### What problem does this PR solve?

Fixed uncaptured figure data with position information. #7466, #7681

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-05-19 19:33:28 +08:00
Yongteng Lei
0ebf05440e
Feat: repair corrupted PDF files on upload automatically (#7693)
### What problem does this PR solve?

Make a best-effort attempt to automatically repair corrupted PDF files on upload.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-05-19 14:54:06 +08:00
liwenju0
7df1bd4b4a
When creating an assistant with no dataset specified, a different default system prompt is used (#7690)
### What problem does this PR solve?

- Updated the dialog settings function to add a default prompt
configuration for no dataset.
- The prompt configuration will be determined based on the presence of
`kb_ids` in the request.
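
The selection logic described above might look like the following sketch. The prompt texts, the dictionary layout, and the `default_prompt_config` name are all hypothetical and used for illustration only. Only the `kb_ids` key comes from the description.

```python
# Hypothetical prompt texts; the PR's actual wording is not shown here.
PROMPT_WITH_DATASET = "Answer the question using the knowledge base: {knowledge}"
PROMPT_WITHOUT_DATASET = "You are a helpful assistant."


def default_prompt_config(req: dict) -> dict:
    """Pick the default prompt configuration based on `kb_ids`."""
    if req.get("kb_ids"):
        # At least one dataset is attached: use the retrieval-style prompt.
        return {"system": PROMPT_WITH_DATASET}
    # No dataset (missing key or empty list): use the plain prompt.
    return {"system": PROMPT_WITHOUT_DATASET}
```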


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (Non-breaking change, adding functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-05-19 11:33:54 +08:00
S0b3Rr
5d21cc3660
fix: Fix the problem that concurrent execution limit in task executor fails and causes OOM (issue#7580) (#7700)
### What problem does this PR solve?

## Cause of the bug:
During execution, trio's CapacityLimiter was used incorrectly, so the
configuration parameter MAX_CONCURRENT_TASKS had no effect and the
executor pulled a large number of tasks from the Redis queue at once.

As a result, when many tasks existed at the same time, the task executor
consumed too much memory and was killed by the OS, which suspended all
executing tasks.

## Fix:
Added the task_manager method to the entry point of
/rag/svr/task_executor.py so that CapacityLimiter takes effect. Deleted
the ineffective async with statement.

## Fix result:
After testing, the task executor behaves as expected: it executes at
most $MAX_CONCURRENT_TASKS tasks concurrently.
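
The general pattern behind the fix is to acquire a concurrency slot before taking the next task off the queue, not after. The sketch below illustrates that pattern with the standard library only: `asyncio.Semaphore` and `asyncio.Queue` stand in for trio's CapacityLimiter and the Redis queue, and all function names are hypothetical. It is not the code from this PR.

```python
import asyncio

MAX_CONCURRENT_TASKS = 2  # hypothetical value for illustration


async def handle_task(task_id, limiter, stats):
    # The dispatcher already acquired a slot; release it when the work ends.
    try:
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0.01)  # stand-in for real work
    finally:
        stats["running"] -= 1
        limiter.release()


async def run_executor(task_ids):
    limiter = asyncio.Semaphore(MAX_CONCURRENT_TASKS)
    stats = {"running": 0, "peak": 0}
    queue = asyncio.Queue()
    for task_id in task_ids:
        queue.put_nowait(task_id)

    workers = []
    while not queue.empty():
        await limiter.acquire()  # wait for a free slot BEFORE dequeuing
        task_id = queue.get_nowait()
        workers.append(asyncio.create_task(handle_task(task_id, limiter, stats)))
    await asyncio.gather(*workers)
    return stats["peak"]  # highest number of tasks that ran at once
```

Because the dispatcher blocks on the limiter first, tasks stay in the queue until a slot is free instead of piling up in the executor's memory.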

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-19 10:25:56 +08:00
Song Fuchang
b0275b8483
Fix: value too long error for chat name (#7697)
### What problem does this PR solve?

Hello, when I enter a very long line in the chat input box, it fails
with the following error:

```
2025-05-17 16:11:26,004 ERROR    182558 value too long for type character varying(255)
Traceback (most recent call last):
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
    cursor.execute(sql, params or ())
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(255)


During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/home/sfc/Projects/ragflow/api/apps/conversation_app.py", line 68, in set_conversation
    ConversationService.save(**conv)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3128, in inner
    return fn(*args, **kwargs)
  File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 145, in save
    return cls.save_n(**kwargs)
  File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 139, in save_n
    sample_obj = cls.model(**kwargs).save(force_insert=True)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 6923, in save
    pk = self.insert(**field_dict).execute()
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2011, in inner
    return method(self, database, *args, **kwargs)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2082, in execute
    return self._execute(database)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2887, in _execute
    return super(Insert, self)._execute(database)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2598, in _execute
    cursor = self.execute_returning(database)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2605, in execute_returning
    cursor = database.execute(self)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3299, in execute
    return self.execute_sql(sql, params)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3289, in execute_sql
    with __exception_wrapper__:
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3059, in __exit__
    reraise(new_type, new_type(exc_value, *exc_args), traceback)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 192, in reraise
    raise value.with_traceback(tb)
  File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
    cursor.execute(sql, params or ())
peewee.DataError: value too long for type character varying(255)
```

This PR fixes it by truncating the `name` field in the `set_conversation`
method in `conversation_app.py`.
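
A minimal sketch of such a truncation, assuming the limit is the 255 characters reported in the log above. The `truncate_name` helper and the constant are hypothetical names, not the code added by this PR.

```python
MAX_NAME_LENGTH = 255  # matches "character varying(255)" in the error above


def truncate_name(name: str, limit: int = MAX_NAME_LENGTH) -> str:
    """Cut the name to at most `limit` characters before saving it."""
    return name[:limit]
```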

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-05-19 10:25:41 +08:00
writinwaters
86c6fee320
Docs: Added an FAQ (#7694)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-05-19 09:58:10 +08:00
writinwaters
c0bee906d2
Docs: Added a guide on switching document engine (#7692)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-05-16 19:02:36 +08:00
balibabu
bfaa469b9a
Feat: Rendering recall test page #3221 (#7689)
### What problem does this PR solve?

Feat: Rendering recall test page #3221
### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-05-16 18:56:48 +08:00
balibabu
d73a08b9eb
Fix: Fixed the issue where message references could not be displayed (#7691)
### What problem does this PR solve?

Fix: Fixed the issue where message references could not be displayed

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-05-16 18:56:25 +08:00
Song Fuchang
a1f06a4fdc
Feat: Support tool calling in Generate component (#7572)
### What problem does this PR solve?

Hello, our use case requires an LLM agent to invoke some tools, so I made
a simple implementation here.

This PR does two things:

1. A simple plugin mechanism based on `pluginlib`:

This mechanism lives in the `plugin` directory. It will only load
plugins from `plugin/embedded_plugins` for now.

A sample plugin `bad_calculator.py` is placed in
`plugin/embedded_plugins/llm_tools`. It accepts two numbers `a` and `b`,
then returns a deliberately wrong result, `a + b + 100`.

In the future, it can load plugins from external locations with few
code changes.

Plugins are divided into different types. The only plugin type supported
in this PR is `llm_tools`, which must implement the `LLMToolPlugin`
class in the `plugin/llm_tool_plugin.py`.
More plugin types can be added in the future.

2. A tool selector in the `Generate` component:

Added a tool selector to select one or more tools for LLM:


![image](https://github.com/user-attachments/assets/74a21fdf-9333-4175-991b-43df6524c5dc)

With the `bad_calculator` tool, the `qwen-max` model produces this
result:


![image](https://github.com/user-attachments/assets/93aff9c4-8550-414a-90a2-1a15a5249d94)
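
To make the sample tool's behavior concrete, here is a stand-alone sketch of the `a + b + 100` calculation. The class shape and the `invoke` method name are assumptions for illustration. The real plugin must implement the `LLMToolPlugin` class, whose interface is not shown in this description.

```python
class BadCalculator:
    """Stand-in for the sample tool; not the actual `LLMToolPlugin` subclass."""

    name = "bad_calculator"

    def invoke(self, a: float, b: float) -> float:
        # Deliberately wrong, as described above: a + b + 100
        return a + b + 100


print(BadCalculator().invoke(1, 2))  # 103
```

A result that is off by 100 makes it easy to confirm that the answer came from the tool call and not from the model doing the arithmetic itself.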


### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
2025-05-16 16:32:19 +08:00
writinwaters
cb26564d50
Docs: Added contribution guidelines and sandbox-related tips (#7685)
### What problem does this PR solve?

### Type of change

- [x] Documentation Update
2025-05-16 16:28:21 +08:00
liu an
59705a1c1d
Test: change variable for ZHIPU_AI_API_KEY (#7684)
### What problem does this PR solve?

change variable for ZHIPU_AI_API_KEY

### Type of change

- [x] Update test case
2025-05-16 15:58:54 +08:00
Chaoxi Weng
205974c359
Docs: Improve oauth configuration documentation and examples (#7675)
### What problem does this PR solve?

Improve OAuth configuration documentation and examples.

- Related pull requests: 
  - #7379
  - #7553
  - #7587
- Related issues:
  - #3495

### Type of change

- [x] Documentation Update
2025-05-16 14:17:39 +08:00
liu an
04edf9729f
Test: use environment variable for ZHIPU_AI_API_KEY (#7680)
### What problem does this PR solve?

use environment variable for ZHIPU_AI_API_KEY

### Type of change

- [x] Test update
2025-05-16 13:51:21 +08:00