AI/ragflow

mirror of https://git.mirrors.martin98.com/https://github.com/infiniflow/ragflow.git synced 2025-08-02 12:20:39 +08:00

Author	SHA1	Message	Date
fansir	efc4796f01	Fix ratelimit errors during document parsing (#6413 ) ### What problem does this PR solve? When knowledge graphs were extracted for a knowledge base through an online large-model API, frequent Rate Limit Errors were triggered, causing document parsing to fail. This commit adds exponential backoff and jitter to the API calls to reduce the frequency of Rate Limit Errors. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-22 23:07:03 +08:00
Richard	d869e4d43f	Fix: Preserve quotes while handling variable substitution with Template component. (#6410 ) ### Address Problem: The original implementation used re.sub(r"(\\\"\|\")", "", content), which stripped all quotes from the processed content. While this worked for simple Jinja2-rendered templates, it caused formatting issues when quotes were required in the final output (e.g., JSON, Python code strings). ### Solution: 1. Selective JSON serialization. 2. Removed global quote removal. ### What problem does this PR solve? This PR addresses an issue in template processing where all quotation marks (" and \") were being removed from content, potentially corrupting string formatting in rendered outputs. In fact, the extra quotes are generated by json.dumps(v, ensure_ascii=False). ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 19:44:03 +08:00
liu an	8eefc8b5fe	Test: Added test cases for Add Chunk HTTP API (#6408 ) ### What problem does this PR solve? cover [add chunk](https://ragflow.io/docs/v0.17.2/http_api_reference#add-chunk) endpoints ### Type of change - [x] Add test cases	2025-03-21 19:16:30 +08:00
fansir	4091af4560	Fix: multiple top-level packages error in Python project (#6370 ) ### What problem does this PR solve? This PR resolves the issue of multiple top-level packages being detected in the Python project, which caused errors when using uv pip install. The problem occurred because the project had multiple directories at the root level, leading to a flat-layout error. To fix this, the pyproject.toml file was updated to explicitly list the packages in the [tool.setuptools] section. This ensures that the correct packages are included during installation, avoiding the flat-layout error. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 18:44:49 +08:00
Kevin Hu	394d1a86f6	Fix: add chunk, empty question issue. (#6405 ) ### What problem does this PR solve? #6404 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 18:44:12 +08:00
balibabu	d88964f629	Feat: If the Transfer item is disabled, the item cannot be edited. #3221 (#6409 ) ### What problem does this PR solve? Feat: If the Transfer item is disabled, the item cannot be edited. #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-21 18:42:52 +08:00
fansir	0e0ebaac5f	Feat: Adds hierarchical title path tracking for tables in DOCX documents to improve context association (#6374 ) ### What problem does this PR solve? Adds hierarchical title path tracking for tables in DOCX documents to improve context association. Previously, extracted tables lacked positional context within document structure. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-21 18:42:36 +08:00
Kevin Hu	8b7e53e643	Fix: miscalculation of token number. (#6401 ) ### What problem does this PR solve? #6308 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 17:30:38 +08:00
writinwaters	979cdc3626	UI updates. (#6398 ) ### What problem does this PR solve? Updated UI descriptions for delimiters and recommended chunk size ### Type of change - [x] Documentation Update	2025-03-21 16:50:20 +08:00
Kevin Hu	a2a4bfe3e3	Fix: change ollama default num_ctx. (#6395 ) ### What problem does this PR solve? #6163 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 16:22:03 +08:00
zhou	85480f6292	Fix: the error of Ollama embeddings interface returning "500 Internal Server Error" (#6350 ) ### What problem does this PR solve? Fix the error where the Ollama embeddings interface returns a “500 Internal Server Error” when using models such as xiaobu-embedding-v2 for embedding. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 15:25:48 +08:00
andy	f537b6ca00	Fix: flow list translate to zh (#6371 ) ### What problem does this PR solve? Add the Chinese translation of 'noMoreData' on the flow list page ### Type of change - [x] Refactoring	2025-03-21 14:54:12 +08:00
Kevin Hu	b5471978b0	Fix: add chunk api, empty content issue (#6390 ) ### What problem does this PR solve? #6387 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 14:05:59 +08:00
liwenju0	efdfb39a33	Feat: Add Duplicate ID Check and Update Deletion Logic (#6376 ) - Introduce the `check_duplicate_ids` function in `dataset.py` and `doc.py` to check for and handle duplicate IDs. - Update the deletion operation to ensure that when deleting datasets and documents, error messages regarding duplicate IDs can be returned. - Implement the `check_duplicate_ids` function in `api_utils.py` to return unique IDs and error messages for duplicate IDs. ### What problem does this PR solve? Close https://github.com/infiniflow/ragflow/issues/6234 ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: wenju.li <wenju.li@deepctr.cn> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-03-21 14:05:17 +08:00
Yingfeng	7cc5603a82	Fix broken discord invitation links (#6388 ) ### Type of change - [x] Documentation Update	2025-03-21 13:38:34 +08:00
Kevin Hu	9ed004e90d	Refa: control the simi for entity resolution. (#6386 ) ### What problem does this PR solve? #6352 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 13:16:34 +08:00
Kevin Hu	d83911b632	Fix: huggingface rerank model issue. (#6385 ) ### What problem does this PR solve? #6348 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 12:43:32 +08:00
Kevin Hu	bc58ecbfd7	Remove feature_request.md (#6383 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-03-21 12:03:38 +08:00
Kevin Hu	221eae2c59	Refa: refine template. (#6382 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-03-21 11:58:10 +08:00
Kevin Hu	37303e38ec	Refa: refine template. (#6381 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-03-21 11:55:01 +08:00
Kevin Hu	b754bd523a	Fix: let quot stay. (#6377 ) ### What problem does this PR solve? #6337 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 11:47:42 +08:00
liwenju0	1bb990719e	Feat: Add user registration toggle feature (#6327 ) ### What problem does this PR solve? Feat: Add user registration toggle feature. Added a user registration toggle, REGISTER_ENABLED, to the settings and the .env config file. The user creation interface now checks the state of this toggle to enable or disable user registration. The front-end implementation is also done: the registration button does not appear if registration is not allowed. I tested this on my local server and it worked smoothly. ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: wenju.li <wenju.li@deepctr.cn> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-03-21 09:38:15 +08:00
lgphone	7f80d7304d	Fix: Optimized the get_by_id method to resolve the issue of missing exceptions and improve query performance (#6320 ) ### What problem does this PR solve? Optimized the get_by_id method to resolve the issue of missing exceptions and improve query performance. Optimization details: 1. The original method used a custom query method that required concatenating SQL, which impacted performance. 2. The query method returned a list, which needed to be accessed by index, posing a risk of index out-of-bounds errors. 3. The original method used except Exception to catch all errors, which is not a best practice in Python programming and may lead to missing exceptions. The get_or_none method accurately catches DoesNotExist errors while allowing other errors to be raised normally. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] Performance Improvement	2025-03-20 23:23:48 +08:00
Zhichang Yu	ca9c3e59fa	Call register_scripts on connecting redis (#6361 ) ### What problem does this PR solve? Call register_scripts on connecting redis ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-20 23:20:37 +08:00
Yongteng Lei	674f94228b	Chore: unify Ruff config and enable async checks (ASYNC, TRIO) (#6351 ) ### What problem does this PR solve? Unify Ruff config and enable async checks (ASYNC, TRIO) ### Type of change - [x] CI/CD or tooling improvement	2025-03-20 22:31:18 +08:00
liwenju0	ef7e96e486	Feat: Add the functionality to load environment variables from a .env file (#6331 ) ### Change Content - A new function `load_env_file` has been added to load environment variables from a .env file in the current script directory. - If the .env file exists, the variables within it will be loaded; if it does not exist, a warning message will be output. I found this issue while testing this PR: https://github.com/infiniflow/ragflow/pull/6327. The locally started server did not read the REGISTER_ENABLED variable in the .env file, so the value was always the default, True. ### What problem does this PR solve? When following the README.md tutorial to start from source code, the base containers (es, redis, etc.) load .env. Therefore, `launch_backend_service.sh` should also load .env, to be consistent with the configuration the Docker containers were started with. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-20 18:35:04 +08:00
Zhichang Yu	dba0caa00b	Fix update_progress (#6340 ) ### What problem does this PR solve? Fix update_progress ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-20 17:01:28 +08:00
hy89	1d9ca172e3	Fix(api): correct document parsing progress check logic (#6318 ) - Fix incorrect progress check condition that prevented re-parsing of completed documents - Allow parsing for documents with progress 0.0 (not started) or 1.0 (completed) - Only block parsing for documents currently in progress (0.0 < progress < 1.0) Close #6312 --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-03-20 16:00:17 +08:00
so95	f0c4b28c6b	Fix: type import (#6328 ) ### What problem does this PR solve? Fixed type import. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-20 15:23:15 +08:00
科幻大脑	6784e0dfee	Fix: Resolved a bug where sibling components in Canvas were not restricted to fetching data from the upstream when parallel components were present. (#6315 ) ### What problem does this PR solve? Fix: Resolved a bug where sibling components in Canvas were not restricted to fetching data from the upstream when parallel components were present. Issue: When parallel components existed in Canvas, sibling components incorrectly fetched data without being limited to the upstream scope, causing data retrieval issues. Solution: Adjusted the data fetching logic to ensure sibling components only retrieve data from the upstream scope. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-20 15:06:18 +08:00
Kevin Hu	95497b4aab	Fix: adapt to old configurations. (#6321 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-20 14:50:59 +08:00
Kevin Hu	5b04b7d972	Fix: rerank with vllm issue. (#6306 ) ### What problem does this PR solve? #6301 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-20 11:52:42 +08:00
liu an	4eb3a8e1cc	Test: Skip unstable 'stop parse documents' test cases (#6310 ) ### What problem does this PR solve? Skip unstable 'stop parse documents' test cases ### Type of change - [x] update test cases	2025-03-20 11:35:19 +08:00
Yongteng Lei	9611185eb4	Feat: add VLM-boosted DocX parser (#6307 ) ### What problem does this PR solve? Add VLM-boosted DocX parser ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-20 11:24:44 +08:00
Yongteng Lei	e4380843c4	Feat: add fallback for PDF figure parser (#6305 ) ### What problem does this PR solve? Add fallback for PDF figure parser ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-20 10:48:38 +08:00
lgphone	046f0bba74	Fix: optimize setting config initialization to resolve Minio initialization error (#6282 ) ### What problem does this PR solve? Optimize setting configuration initialization to resolve Minio initialization error caused by using a specific storage. Reproduction Scenario: Using Aliyun OSS as the backend storage with the STORAGE_IMPL environment variable set to OSS. The service_conf.yaml.template configuration file contains OSS-related configurations, while other storage configurations are commented out. When the service starts, it still attempts to initialize the Minio storage. Since there is no Minio configuration in service_conf.yaml.template, it results in an error due to the missing configuration file. Optimization Measures: Automatically determine the required initialization configuration based on the environment variable. Do not initialize configurations for unused resources. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-20 10:45:40 +08:00
writinwaters	e0c436b616	UI updates (#6290 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-03-20 10:26:16 +08:00
liu an	dbf2ee56c6	Test: Added test cases for Stop Parse Documents HTTP API (#6285 ) ### What problem does this PR solve? cover [stop parse documents](https://ragflow.io/docs/dev/http_api_reference#stop-parsing-documents) endpoints ### Type of change - [x] Add test cases	2025-03-20 09:42:50 +08:00
Yongteng Lei	1d6760dd84	Feat: add VLM-boosted PDF parser (#6278 ) ### What problem does this PR solve? Add VLM-boosted PDF parser if VLM is set. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-20 09:39:32 +08:00
so95	344727f9ba	Feat: add agent share team viewer (#6222 ) ### What problem does this PR solve? Allow team members to view agents. # Canvas editor ![image](https://github.com/user-attachments/assets/042af36d-5fd1-43e2-acf7-05869220a1c1) # List agent ![image](https://github.com/user-attachments/assets/8b9c7376-780b-47ff-8f5c-6c0e7358158d) # Setting ![image](https://github.com/user-attachments/assets/6cb7d12a-7a66-4dd7-9acc-5b53ff79a10a) ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-03-19 19:04:13 +08:00
balibabu	d17ec26c56	Fix: In the Agent's workflow, the input content cannot be wrapped, and \n will not work, otherwise an error will be reported #6241 (#6284 ) ### What problem does this PR solve? Fix: In the Agent's workflow, the input content cannot be wrapped, and \n will not work, otherwise an error will be reported #6241 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-19 18:54:23 +08:00
lei	4236d81cfc	Docs: Update accelerate_doc_indexing.mdx (#6268 ) ### What problem does this PR solve? Corrects a misspelled word. ### Type of change - [x] Documentation Update	2025-03-19 18:04:03 +08:00
Zhichang Yu	bb869aca33	Fix get_unacked_iterator (#6280 ) ### What problem does this PR solve? Fix get_unacked_iterator. Close #6132 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-19 17:46:58 +08:00
zhou	9cad60fa6d	Fix: Add a basic example when the example of content_tagging is empty (#6276 ) ### What problem does this PR solve? When using LLM for auto-tag, if there are no examples, the tag format generated by LLM may be wrong. This will cause Elasticsearch insert errors. Adding basic examples can avoid this problem. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-19 17:30:47 +08:00
Kevin Hu	42e89e4a92	Fix: switch follow interact issue. (#6279 ) ### What problem does this PR solve? #6188 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-19 17:30:12 +08:00
balibabu	8daec9a4c5	Feat: Alter TreeView component #3221 (#6272 ) ### What problem does this PR solve? Feat: Alter TreeView component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-19 15:44:59 +08:00
so95	53ac27c3ff	Feat: support agent version history. (#6130 ) ### What problem does this PR solve? Adds saving of version history. Allows users to view and download agent files from the version revision history. ![image](https://github.com/user-attachments/assets/c300375d-8b97-4230-9fc4-83d148137132) ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-03-19 15:22:53 +08:00
Kevin Hu	e689532e6e	Fix: long api key issue. (#6267 ) ### What problem does this PR solve? #6248 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-19 13:30:40 +08:00
Kevin Hu	c2302abaf1	Fix: remove dup ids for APIs. (#6263 ) ### What problem does this PR solve? #6234 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-19 13:10:59 +08:00
Kevin Hu	8157285a79	Fix: NaN response for retrieval component. (#6265 ) ### What problem does this PR solve? #6247 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-19 13:10:45 +08:00
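
Commit efc4796f01 in the log above describes adding exponential backoff and jitter to API calls to reduce Rate Limit Errors. The log does not show the actual code, so the following is only a minimal sketch of that general technique. The names `RateLimitError`, `backoff_delay` and `call_with_backoff`, and the default values for `base`, `cap` and `max_retries`, are assumptions for illustration and are not taken from the RAGFlow source.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Full-jitter delay: a random fraction of min(cap, base * 2**attempt)."""
    return min(cap, base * (2 ** attempt)) * rng()


def call_with_backoff(fn, max_retries=5, base=1.0, cap=30.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff and jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error to the caller
            # Wait a randomized, exponentially growing interval before retrying.
            sleep(backoff_delay(attempt, base, cap))
```

The jitter spreads retries from concurrent workers over time, so they are less likely to hit the API again in the same instant. The `sleep` parameter is injectable only so the behavior can be exercised without real waiting.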

2627 Commits
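
Commit efdfb39a33 in the log above mentions a `check_duplicate_ids` helper that returns unique IDs together with error messages for duplicated IDs. The log gives only that description, so the sketch below is one possible reading of it: the signature, the `id_type` parameter and the message wording are assumptions, not the project's actual implementation.

```python
from collections import Counter


def check_duplicate_ids(ids, id_type="item"):
    """Return (unique_ids, messages), preserving first-occurrence order.

    The signature and message format here are illustrative assumptions.
    """
    counts = Counter(ids)
    # dict.fromkeys keeps insertion order, so the first occurrence wins.
    unique_ids = list(dict.fromkeys(ids))
    messages = [
        f"Duplicate {id_type} ids: {i}" for i in unique_ids if counts[i] > 1
    ]
    return unique_ids, messages
```

A caller could then proceed with the deletion using `unique_ids` and attach `messages` to the response, which matches the behavior the commit message describes for dataset and document deletion.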