### What problem does this PR solve?
Try the best to repair corrupted PDF files on upload automatically.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
- Updated the dialog settings function to add a default prompt
configuration for no dataset.
- The prompt configuration will be determined based on the presence of
`kb_ids` in the request.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (Non-breaking change, adding functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
## Cause of the bug:
During the execution process, due to improper use of trio
CapacityLimiter, the configuration parameter MAX_CONCURRENT_TASKS is
invalid, causing the executor to take out a large number of tasks from
the Redis queue at one time.
This behavior will cause the task executor to occupy too much memory and
be killed by the OS when a large number of tasks exist at the same time.
As a result, all executing tasks are suspended.
## Fix:
Added the task_manager method to the entry of /rag/svr/task_executor.py
to make CapacityLimiter effective. Deleted the invalid async with
statement.
## Fix result:
After testing, the task executor execution meets expectations, that is:
concurrent execution of up to $MAX_CONCURRENT_TASKS tasks.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Hello, when I input a very long line in the chat input box, it will fail
with following error:
```
2025-05-17 16:11:26,004 ERROR 182558 value too long for type character varying(255)
Traceback (most recent call last):
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
cursor.execute(sql, params or ())
psycopg2.errors.StringDataRightTruncation: value too long for type character varying(255)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/home/sfc/Projects/ragflow/api/apps/conversation_app.py", line 68, in set_conversation
ConversationService.save(**conv)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3128, in inner
return fn(*args, **kwargs)
File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 145, in save
return cls.save_n(**kwargs)
File "/var/home/sfc/Projects/ragflow/api/db/services/common_service.py", line 139, in save_n
sample_obj = cls.model(**kwargs).save(force_insert=True)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 6923, in save
pk = self.insert(**field_dict).execute()
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2011, in inner
return method(self, database, *args, **kwargs)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2082, in execute
return self._execute(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2887, in _execute
return super(Insert, self)._execute(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2598, in _execute
cursor = self.execute_returning(database)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 2605, in execute_returning
cursor = database.execute(self)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3299, in execute
return self.execute_sql(sql, params)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3289, in execute_sql
with __exception_wrapper__:
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3059, in __exit__
reraise(new_type, new_type(exc_value, *exc_args), traceback)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 192, in reraise
raise value.with_traceback(tb)
File "/var/home/sfc/Projects/ragflow/.venv/lib/python3.10/site-packages/peewee.py", line 3291, in execute_sql
cursor.execute(sql, params or ())
peewee.DataError: value too long for type character varying(255)
```
This PR fix it by truncate the `name` field in the `set_conversation`
method in the `conversation_app.py`.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Rendering recall test page #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Fixed the issue where message references could not be displayed
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Hello, our use case requires LLM agent to invoke some tools, so I made a
simple implementation here.
This PR does two things:
1. A simple plugin mechanism based on `pluginlib`:
This mechanism lives in the `plugin` directory. It will only load
plugins from `plugin/embedded_plugins` for now.
A sample plugin `bad_calculator.py` is placed in
`plugin/embedded_plugins/llm_tools`, it accepts two numbers `a` and `b`,
then give a wrong result `a + b + 100`.
In the future, it can load plugins from external location with little
code change.
Plugins are divided into different types. The only plugin type supported
in this PR is `llm_tools`, which must implement the `LLMToolPlugin`
class in the `plugin/llm_tool_plugin.py`.
More plugin types can be added in the future.
2. A tool selector in the `Generate` component:
Added a tool selector to select one or more tools for LLM:

And with the `bad_calculator` tool, it results this with the `qwen-max`
model:

### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
Improve oauth configuration documentation and examples.
- Related pull requests:
- #7379
- #7553
- #7587
- Related issues:
- #3495
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Fix: Fixed the issue where the height of the chat page shared externally
did not fill the window #7460
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Launch sandbox from docker-compose.
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
---------
Co-authored-by: writinwaters <93570324+writinwaters@users.noreply.github.com>
### What problem does this PR solve?
Close#7655
Based on the codes atthe api_app, I think the reference is one-to-one
with the message
`
def fillin_conv(ans):
nonlocal conv, message_id
if not conv.reference:
conv.reference.append(ans["reference"])
else:
conv.reference[-1] = ans["reference"]
conv.message[-1] = {"role": "assistant", "content": ans["answer"], "id":
message_id}
ans["id"] = message_id
`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add code agent component.
#4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
This PR introduces Pydantic-based validation for the delete dataset HTTP
API, improving code clarity and robustness. Key changes include:
1. Pydantic Validation
2. Error Handling
3. Test Updates
4. Documentation Updates
### Type of change
- [x] Documentation Update
- [x] Refactoring
### What problem does this PR solve?
Feat: Fixed the issue where the dataset configuration page kept
refreshing #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: Use DOMPurify to filter out dangerous HTML #7668
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Add the JS code (or other) executor component to Agent. #4977
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Deprecate `/github_callback` route in favor of
`/oauth/callback/<channel>` for GitHub OAuth integration:
- Added GitHub OAuth support in the authentication module
- Introduced `GithubOAuthClient` with methods to fetch and normalize
user info
- Updated `CLIENT_TYPES` to include GitHub OAuth client
- Deprecated `/github_callback` route and suggested using the generic
`/oauth/callback/<channel>` route
---
- Related pull requests:
- #7379
- #7553
### Usage
- [Create a GitHub OAuth
App](https://github.com/settings/applications/new) to obtain the
`client_id` and `client_secret`, configure the authorization callback
url: `https://your-app.com/v1/user/oauth/callback/github`
- Edit `service_conf.yaml.template`:
```yaml
# ...
oauth:
github:
type: "github"
icon: "github"
display_name: "Github"
client_id: "your_client_id"
client_secret: "your_client_secret"
redirect_uri: "https://your-app.com/v1/user/oauth/callback/github"
# ...
```
### Type of change
- [x] Documentation Update
- [x] Refactoring (non-breaking change)
TOML-table-based project.license is deprecated as per PEP 639, see:
https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license-and-license-files
### What problem does this PR solve?
The following error when building project (e.g. `uv build`)
```
SetuptoolsDeprecationWarning: `project.license` as a TOML table is deprecated
!!
********************************************************************************
Please use a simple string containing a SPDX expression for `project.license`. You can also use `project.license-files`. (Both options available on setuptools>=77.0.0).
By 2026-Feb-18, you need to update your project and remove deprecated calls
or your builds will no longer be supported.
See https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license for details.
********************************************************************************
!!
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
For `uv package`/`uv pip install ".[full]"`, bug introduced in #6370:
* Removes erroneous (non-package) directories (`helm`, `flask_session`)
* Adds `mcp.server` package
* Resolves "warning: package would be ignored" ambiguity by changing
`sdk` to `sdk.python.ragflow_sdk`
* Resolves "error: package directory 'intergrations' does not exist" by
including `intergrations.chatgpt-on-wechat.plugins` explicitly
* Also rearranges packages in alphabetical order, for DX.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
When Delete Chunk Will Also Delete Chunk Related Image
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Other (please describe): llm factories update
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Documentation Update
### What problem does this PR solve?
Feat: Add data set configuration form #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Display inline (non-quoted) images in the chat and search modules
#7623
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Update 7 readme
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Add libjemalloc installation command. If the operating system does not
have the libjemalloc library, the execution of entrypoint.sh and
launch_backend_service.sh will be interrupted, and the
rag/svr/task_executor.py script will not be started normally.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [x] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Add frontend support for third-party login integration:
- Used `getLoginChannels` API to fetch available login channels from the
server
- Used `loginWithChannel` function to initiate login based on the
selected channel
- Refactored `useLoginWithGithub` hook to `useOAuthCallback` for
generalized OAuth callback handling
- Updated the login page to dynamically render third-party login buttons
based on the fetched channel list
- Styled third-party login buttons to improve user experience
- Removed unused code snippets
> This PR removes the previously hardcoded GitHub login button. Since
the functionality only worked when `location.host` was equal to
`demo.ragflow.io`, and the authentication logic is now based on
`login.ragflow.io`, this change does not affect the existing logic and
is considered a non-breaking change
---
#### Frontend Screenshot && Backend Configuration

```yaml
# docker/service_conf.yaml.template
# ...
oauth:
github:
icon: github
display_name: "Github"
# ...
custom_channel:
display_name: "OIDC"
# ...
custom_channel_2:
display_name: "OAuth2"
# ...
```
---
- Related pull requests:
- #7379
- #7521
- Related issues:
- #3495
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Show images in reply messages #7608
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Fixed the issue where the chat page would jump after entering the
homepage #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Feat: Adjust the display position of recall test item images #7608
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Info of whether applying graph resolution and community extraction is
storage in `task["kb_parser_config"]`. However, previous code get
`graphrag_conf` from `task["parser_config"]`, making `with_resolution`
and `with_community` are always false.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add FormContainer component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)