2627 Commits

Author SHA1 Message Date
fansir
efc4796f01
Fix ratelimit errors during document parsing (#6413)
### What problem does this PR solve?

When using the online large model API knowledge base to extract
knowledge graphs, frequent Rate Limit Errors were triggered,
causing document parsing to fail. This commit fixes the issue by
optimizing API calls in the following way:
Added exponential backoff and jitter to the API call to reduce the
frequency of Rate Limit Errors.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-22 23:07:03 +08:00
Richard
d869e4d43f
Fix: Preserve quotes while handling variable substitution withTemplate component. (#6410)
###Address Problem:
The original implementation used re.sub(r"(\\\"|\")", "", content) which
stripped all quotes from the processed content. While this worked for
simple Jinja2-rendered templates, it caused formatting issues when :
-Quotes were required in the final output (e.g., JSON, Python Code
strings)

###Solution:
    1. Selective JSON Serialization.
    2. Removed Global Quote Removal

### What problem does this PR solve?

This PR addresses an issue in template processing where all quotation
marks (" and \") were being removed from content, potentially corrupting
string formatting in rendered outputs. **In fact, extra quotes is
generated by json.dumps(v, ensure_ascii=False).**

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 19:44:03 +08:00
liu an
8eefc8b5fe
Test: Added test cases for Add Chunk HTTP API (#6408)
### What problem does this PR solve?

cover [add
chunk](https://ragflow.io/docs/v0.17.2/http_api_reference#add-chunk)
endpoints

### Type of change

- [x] Add test cases
2025-03-21 19:16:30 +08:00
fansir
4091af4560
Fix: multiple top-level packages error in Python project (#6370)
### What problem does this PR solve?

This PR resolves the issue of multiple top-level packages being detected
in the Python project, which caused errors when using uv pip install.
The problem occurred because the project had multiple directories files
at the root level, leading to a flat-layout error.
To fix this, the pyproject.toml file was updated to explicitly list the
packages using the [tool.setuptools] section. This ensures that the
correct packages are included during installation, avoiding the
flat-layout error.
Type of change

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 18:44:49 +08:00
Kevin Hu
394d1a86f6
Fix: add chunk, empty question issue. (#6405)
### What problem does this PR solve?

#6404

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 18:44:12 +08:00
balibabu
d88964f629
Feat: If the Transfer item is disabled, the item cannot be edited. #3221 (#6409)
### What problem does this PR solve?

Feat: If the Transfer item is disabled, the item cannot be edited. #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-03-21 18:42:52 +08:00
fansir
0e0ebaac5f
Feat: Adds hierarchical title path tracking for tables in DOCX documents to improve context association (#6374)
### What problem does this PR solve?

Adds hierarchical title path tracking for tables in DOCX documents to
improve context association. Previously, extracted tables lacked
positional context within document structure.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-21 18:42:36 +08:00
Kevin Hu
8b7e53e643
Fix: miss calculate of token number. (#6401)
### What problem does this PR solve?

#6308

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 17:30:38 +08:00
writinwaters
979cdc3626
UI updates. (#6398)
### What problem does this PR solve?

Updated UI descriptions for delimiters and recommended chunk size

### Type of change

- [x] Documentation Update
2025-03-21 16:50:20 +08:00
Kevin Hu
a2a4bfe3e3
Fix: change ollama default num_ctx. (#6395)
### What problem does this PR solve?

#6163

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 16:22:03 +08:00
zhou
85480f6292
Fix: the error of Ollama embeddings interface returning "500 Internal Server Error" (#6350)
### What problem does this PR solve?

Fix the error where the Ollama embeddings interface returns a “500
Internal Server Error” when using models such as xiaobu-embedding-v2 for
embedding.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 15:25:48 +08:00
andy
f537b6ca00
Fix: flow list translate to zh (#6371)
### What problem does this PR solve?

Add the Chinese translation of 'noMoreData' on the flow list page

### Type of change

- [x] Refactoring
2025-03-21 14:54:12 +08:00
Kevin Hu
b5471978b0
Fix: add chunk api, empty content issue (#6390)
### What problem does this PR solve?

#6387

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 14:05:59 +08:00
liwenju0
efdfb39a33
Feat: Add Duplicate ID Check and Update Deletion Logic (#6376)
- Introduce the `check_duplicate_ids` function in `dataset.py` and
`doc.py` to check for and handle duplicate IDs.
- Update the deletion operation to ensure that when deleting datasets
and documents, error messages regarding duplicate IDs can be returned.
- Implement the `check_duplicate_ids` function in `api_utils.py` to
return unique IDs and error messages for duplicate IDs.


### What problem does this PR solve?

Close https://github.com/infiniflow/ragflow/issues/6234

### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-21 14:05:17 +08:00
Yingfeng
7cc5603a82
Fix broken discord invitation links (#6388)
### Type of change

- [x] Documentation Update
2025-03-21 13:38:34 +08:00
Kevin Hu
9ed004e90d
Refa: control the simi for entity resolution. (#6386)
### What problem does this PR solve?

#6352

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 13:16:34 +08:00
Kevin Hu
d83911b632
Fix: huggingface rerank model issue. (#6385)
### What problem does this PR solve?

#6348

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 12:43:32 +08:00
Kevin Hu
bc58ecbfd7
Remove feature_request.md (#6383)
### What problem does this PR solve?


### Type of change


- [x] Refactoring
2025-03-21 12:03:38 +08:00
Kevin Hu
221eae2c59
Refa: refine template. (#6382)
### What problem does this PR solve?

### Type of change


- [x] Refactoring
2025-03-21 11:58:10 +08:00
Kevin Hu
37303e38ec
Refa: refine template. (#6381)
### What problem does this PR solve?

### Type of change

- [x] Refactoring
2025-03-21 11:55:01 +08:00
Kevin Hu
b754bd523a
Fix: let quot stay. (#6377)
### What problem does this PR solve?

#6337

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-21 11:47:42 +08:00
liwenju0
1bb990719e
Feat: Add user registration toggle feature (#6327)
### What problem does this PR solve?

Feat: Add user registration toggle feature. Added a user registration
toggle REGISTER_ENABLED in the settings and .env config file. The user
creation interface now checks the state of this toggle to control the
enabling and disabling of the user registration feature.

the front-end implementation is done, the registration button does not
appear if registration is not allowed. I did the actual tests on my
local server and it worked smoothly.
### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-21 09:38:15 +08:00
lgphone
7f80d7304d
Fix: Optimized the get_by_id method to resolve the issue of missing exceptions and improve query performance (#6320)
Fix: Optimized the get_by_id method to resolve the issue of missing
exceptions and improve query performance

### What problem does this PR solve?

Optimized the get_by_id method to resolve the issue of missing
exceptions and improve query performance.
Optimization details:
1. The original method used a custom query method that required
concatenating SQL, which impacted performance.
2. The query method returned a list, which needed to be accessed by
index, posing a risk of index out-of-bounds errors.
3. The original method used except Exception to catch all errors, which
is not a best practice in Python programming and may lead to missing
exceptions. The get_or_none method accurately catches DoesNotExist
errors while allowing other errors to be raised normally.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] Performance Improvement
2025-03-20 23:23:48 +08:00
Zhichang Yu
ca9c3e59fa
Call register_scripts on connecting redis (#6361)
### What problem does this PR solve?

Call register_scripts on connecting redis

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 23:20:37 +08:00
Yongteng Lei
674f94228b
Chore: unify Ruff config and enable async checks (ASYNC, TRIO) (#6351)
### What problem does this PR solve?

Unify Ruff config and enable async checks (ASYNC, TRIO)

### Type of change

- [x] CI/CD or tooling improvement
2025-03-20 22:31:18 +08:00
liwenju0
ef7e96e486
Feat: Add the functionality to load environment variables from a .env file (#6331)
### Change Content

- A new function `load_env_file` has been added to load environment
variables from a .env file in the current script directory.
- If the .env file exists, the variables within it will be loaded; if it
does not exist, a warning message will be output.

I found this issue while testing this pr:
https://github.com/infiniflow/ragflow/pull/6327. The locally started
server did not read the REGISTER_ENABLED variables in the .env. The
result has always been the default True
### What problem does this PR solve?

Follow the tutorial in the README.md to start from source code. base's
container that is es、redis,etc will load .env. Therefore,
`launch_backend_service.sh` should also load .env to be consistent with
the configuration of the docker container when it was started

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-20 18:35:04 +08:00
Zhichang Yu
dba0caa00b
Fix update_progress (#6340)
### What problem does this PR solve?

Fix update_progress

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 17:01:28 +08:00
hy89
1d9ca172e3
Fix(api): correct document parsing progress check logic (#6318)
- Fix incorrect progress check condition that prevented re-parsing of
completed documents
- Allow parsing for documents with progress 0.0 (not started) or 1.0
(completed)
- Only block parsing for documents currently in progress (0.0 < progress
< 1.0)

Close #6312

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-20 16:00:17 +08:00
so95
f0c4b28c6b
Fix: type import (#6328)
### What problem does this PR solve?

fixed type import .

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 15:23:15 +08:00
科幻大脑
6784e0dfee
Fix: Resolved a bug where sibling components in Canvas were not restricted to fetching data from the upstream when parallel components were present. (#6315)
### What problem does this PR solve?

Fix: Resolved a bug where sibling components in Canvas were not
restricted to fetching data from the upstream when parallel components
were present.
Issue: When parallel components existed in Canvas, sibling components
incorrectly fetched data without being limited to the upstream scope,
causing data retrieval issues.
Solution: Adjusted the data fetching logic to ensure sibling components
only retrieve data from the upstream scope.
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 15:06:18 +08:00
Kevin Hu
95497b4aab
Fix: adapt to old configurations. (#6321)
### What problem does this PR solve?

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 14:50:59 +08:00
Kevin Hu
5b04b7d972
Fix: rerank with vllm issue. (#6306)
### What problem does this PR solve?

#6301

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 11:52:42 +08:00
liu an
4eb3a8e1cc
Test: Skip unstable 'stop parse documents' test cases (#6310)
### What problem does this PR solve?

 Skip unstable 'stop parse documents' test cases

### Type of change

- [x] update test cases
2025-03-20 11:35:19 +08:00
Yongteng Lei
9611185eb4
Feat: add VLM-boosted DocX parser (#6307)
### What problem does this PR solve?

Add VLM-boosted DocX parser

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-20 11:24:44 +08:00
Yongteng Lei
e4380843c4
Feat: add fallback for PDF figure parser (#6305)
### What problem does this PR solve?

Add fallback for PDF figure parser

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-20 10:48:38 +08:00
lgphone
046f0bba74
Fix: optimize setting config initialization to resolve Minio initialization error (#6282)
### What problem does this PR solve?

Optimize setting configuration initialization to resolve Minio
initialization error caused by using a specific storage.

Reproduction Scenario:
Using Aliyun OSS as the backend storage with the STORAGE_IMPL
environment variable set to OSS.
The service_conf.yaml.template configuration file contains OSS-related
configurations, while other storage configurations are commented out.
When the service starts, it still attempts to initialize the Minio
storage. Since there is no Minio configuration in
service_conf.yaml.template, it results in an error due to the missing
configuration file.

Optimization Measures:
Automatically determine the required initialization configuration based
on the environment variable.
Do not initialize configurations for unused resources.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-20 10:45:40 +08:00
writinwaters
e0c436b616
UI updates (#6290)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-03-20 10:26:16 +08:00
liu an
dbf2ee56c6
Test: Added test cases for Stop Parse Documents HTTP API (#6285)
### What problem does this PR solve?

cover [stop parse
documents](https://ragflow.io/docs/dev/http_api_reference#stop-parsing-documents)
endpoints

### Type of change

- [x] Add test cases
2025-03-20 09:42:50 +08:00
Yongteng Lei
1d6760dd84
Feat: add VLM-boosted PDF parser (#6278)
### What problem does this PR solve?

Add VLM-boosted PDF parser if VLM is set.

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-20 09:39:32 +08:00
so95
344727f9ba
Feat: add agent share team viewer (#6222)
### What problem does this PR solve?
Allow member view agent  
#  Canvas editor

![image](https://github.com/user-attachments/assets/042af36d-5fd1-43e2-acf7-05869220a1c1)
# List agent

![image](https://github.com/user-attachments/assets/8b9c7376-780b-47ff-8f5c-6c0e7358158d)
# Setting 

![image](https://github.com/user-attachments/assets/6cb7d12a-7a66-4dd7-9acc-5b53ff79a10a)
 
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-19 19:04:13 +08:00
balibabu
d17ec26c56
Fix: In the Agent's workflow, the input content cannot be wrapped, and \n will not work, otherwise an error will be reported #6241 (#6284)
### What problem does this PR solve?

Fix: In the Agent's workflow, the input content cannot be wrapped, and
\n will not work, otherwise an error will be reported #6241

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 18:54:23 +08:00
lei
4236d81cfc
Docs: Update accelerate_doc_indexing.mdx (#6268)
### What problem does this PR solve?
The word is written incorrectly

### Type of change

- [x] Documentation Update
2025-03-19 18:04:03 +08:00
Zhichang Yu
bb869aca33
Fix get_unacked_iterator (#6280)
### What problem does this PR solve?

Fix get_unacked_iterator. Close #6132 

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 17:46:58 +08:00
zhou
9cad60fa6d
Fix: Add a basic example when the example of content_tagging is empty (#6276)
### What problem does this PR solve?

When using LLM for auto-tag, if there are no examples, the tag format
generated by LLM may be wrong. This will cause Elasticsearch insert
errors. Adding basic examples can avoid this problem.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 17:30:47 +08:00
Kevin Hu
42e89e4a92
Fix: swich follow interact issue. (#6279)
### What problem does this PR solve?

#6188

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 17:30:12 +08:00
balibabu
8daec9a4c5
Feat: Alter TreeView component #3221 (#6272)
### What problem does this PR solve?

Feat: Alter TreeView component #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-03-19 15:44:59 +08:00
so95
53ac27c3ff
Feat: support agent version history. (#6130)
### What problem does this PR solve?
Add history version save
- Allows users to view and download agent files by version revision
history

![image](https://github.com/user-attachments/assets/c300375d-8b97-4230-9fc4-83d148137132)

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-19 15:22:53 +08:00
Kevin Hu
e689532e6e
Fix: long api key issue. (#6267)
### What problem does this PR solve?

#6248

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 13:30:40 +08:00
Kevin Hu
c2302abaf1
Fix: remove dup ids for APIs. (#6263)
### What problem does this PR solve?

#6234

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 13:10:59 +08:00
Kevin Hu
8157285a79
Fix: Nan response for retrieval component. (#6265)
### What problem does this PR solve?

#6247

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-19 13:10:45 +08:00