### What problem does this PR solve?
When using the online large model API knowledge base to extract
knowledge graphs, frequent Rate Limit Errors were triggered,
causing document parsing to fail. This commit fixes the issue by
optimizing API calls in the following way:
Added exponential backoff and jitter to the API call to reduce the
frequency of Rate Limit Errors.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
###Address Problem:
The original implementation used re.sub(r"(\\\"|\")", "", content) which
stripped all quotes from the processed content. While this worked for
simple Jinja2-rendered templates, it caused formatting issues when :
-Quotes were required in the final output (e.g., JSON, Python Code
strings)
###Solution:
1. Selective JSON Serialization.
2. Removed Global Quote Removal
### What problem does this PR solve?
This PR addresses an issue in template processing where all quotation
marks (" and \") were being removed from content, potentially corrupting
string formatting in rendered outputs. **In fact, extra quotes is
generated by json.dumps(v, ensure_ascii=False).**
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
This PR resolves the issue of multiple top-level packages being detected
in the Python project, which caused errors when using uv pip install.
The problem occurred because the project had multiple directories files
at the root level, leading to a flat-layout error.
To fix this, the pyproject.toml file was updated to explicitly list the
packages using the [tool.setuptools] section. This ensures that the
correct packages are included during installation, avoiding the
flat-layout error.
Type of change
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: If the Transfer item is disabled, the item cannot be edited. #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Adds hierarchical title path tracking for tables in DOCX documents to
improve context association. Previously, extracted tables lacked
positional context within document structure.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix the error where the Ollama embeddings interface returns a “500
Internal Server Error” when using models such as xiaobu-embedding-v2 for
embedding.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- Introduce the `check_duplicate_ids` function in `dataset.py` and
`doc.py` to check for and handle duplicate IDs.
- Update the deletion operation to ensure that when deleting datasets
and documents, error messages regarding duplicate IDs can be returned.
- Implement the `check_duplicate_ids` function in `api_utils.py` to
return unique IDs and error messages for duplicate IDs.
### What problem does this PR solve?
Close https://github.com/infiniflow/ragflow/issues/6234
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Feat: Add user registration toggle feature. Added a user registration
toggle REGISTER_ENABLED in the settings and .env config file. The user
creation interface now checks the state of this toggle to control the
enabling and disabling of the user registration feature.
the front-end implementation is done, the registration button does not
appear if registration is not allowed. I did the actual tests on my
local server and it worked smoothly.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
Fix: Optimized the get_by_id method to resolve the issue of missing
exceptions and improve query performance
### What problem does this PR solve?
Optimized the get_by_id method to resolve the issue of missing
exceptions and improve query performance.
Optimization details:
1. The original method used a custom query method that required
concatenating SQL, which impacted performance.
2. The query method returned a list, which needed to be accessed by
index, posing a risk of index out-of-bounds errors.
3. The original method used except Exception to catch all errors, which
is not a best practice in Python programming and may lead to missing
exceptions. The get_or_none method accurately catches DoesNotExist
errors while allowing other errors to be raised normally.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement
### What problem does this PR solve?
Call register_scripts on connecting redis
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### Change Content
- A new function `load_env_file` has been added to load environment
variables from a .env file in the current script directory.
- If the .env file exists, the variables within it will be loaded; if it
does not exist, a warning message will be output.
I found this issue while testing this pr:
https://github.com/infiniflow/ragflow/pull/6327. The locally started
server did not read the REGISTER_ENABLED variables in the .env. The
result has always been the default True
### What problem does this PR solve?
Follow the tutorial in the README.md to start from source code. base's
container that is es、redis,etc will load .env. Therefore,
`launch_backend_service.sh` should also load .env to be consistent with
the configuration of the docker container when it was started
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
- Fix incorrect progress check condition that prevented re-parsing of
completed documents
- Allow parsing for documents with progress 0.0 (not started) or 1.0
(completed)
- Only block parsing for documents currently in progress (0.0 < progress
< 1.0)
Close#6312
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix: Resolved a bug where sibling components in Canvas were not
restricted to fetching data from the upstream when parallel components
were present.
Issue: When parallel components existed in Canvas, sibling components
incorrectly fetched data without being limited to the upstream scope,
causing data retrieval issues.
Solution: Adjusted the data fetching logic to ensure sibling components
only retrieve data from the upstream scope.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add fallback for PDF figure parser
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Optimize setting configuration initialization to resolve Minio
initialization error caused by using a specific storage.
Reproduction Scenario:
Using Aliyun OSS as the backend storage with the STORAGE_IMPL
environment variable set to OSS.
The service_conf.yaml.template configuration file contains OSS-related
configurations, while other storage configurations are commented out.
When the service starts, it still attempts to initialize the Minio
storage. Since there is no Minio configuration in
service_conf.yaml.template, it results in an error due to the missing
configuration file.
Optimization Measures:
Automatically determine the required initialization configuration based
on the environment variable.
Do not initialize configurations for unused resources.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Add VLM-boosted PDF parser if VLM is set.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: In the Agent's workflow, the input content cannot be wrapped, and
\n will not work, otherwise an error will be reported #6241
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When using LLM for auto-tag, if there are no examples, the tag format
generated by LLM may be wrong. This will cause Elasticsearch insert
errors. Adding basic examples can avoid this problem.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Feat: Alter TreeView component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Add history version save
- Allows users to view and download agent files by version revision
history

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>