ragflow

AI/ragflow

mirror of https://git.mirrors.martin98.com/https://github.com/infiniflow/ragflow.git synced 2025-08-05 16:20:42 +08:00

Author	SHA1	Message	Date
Marcus Yuan	c61df5dd25	Dynamic Context Window Size for Ollama Chat (#6582 )

# Dynamic Context Window Size for Ollama Chat

## Problem Statement

Previously, the Ollama chat implementation used a fixed context window size of 32768 tokens. This caused two main issues:

1. Performance degradation due to unnecessarily large context windows for small conversations
2. Potential business logic failures when using smaller fixed sizes (e.g., 2048 tokens)

## Solution

Implemented a dynamic context window size calculation that:

1. Uses a base context size of 8192 tokens
2. Applies a 1.2x buffer ratio to the total token count
3. Adds multiples of 8192 tokens based on the buffered token count
4. Implements a smart context size update strategy

## Implementation Details

### Token Counting Logic

```python
def count_tokens(text):
    """Calculate token count for text"""
    # Simple calculation: 1 token per ASCII character
    # 2 tokens for non-ASCII characters (Chinese, Japanese, Korean, etc.)
    total = 0
    for char in text:
        if ord(char) < 128:  # ASCII characters
            total += 1
        else:  # Non-ASCII characters
            total += 2
    return total
```

### Dynamic Context Calculation

```python
def _calculate_dynamic_ctx(self, history):
    """Calculate dynamic context window size"""
    # Calculate total tokens for all messages
    total_tokens = 0
    for message in history:
        content = message.get("content", "")
        content_tokens = count_tokens(content)
        role_tokens = 4  # Role marker token overhead
        total_tokens += content_tokens + role_tokens

    # Apply 1.2x buffer ratio
    total_tokens_with_buffer = int(total_tokens * 1.2)

    # Calculate context size in multiples of 8192
    if total_tokens_with_buffer <= 8192:
        ctx_size = 8192
    else:
        ctx_multiplier = (total_tokens_with_buffer // 8192) + 1
        ctx_size = ctx_multiplier * 8192

    return ctx_size
```

### Integration in Chat Method

```python
def chat(self, system, history, gen_conf):
    if system:
        history.insert(0, {"role": "system", "content": system})
    if "max_tokens" in gen_conf:
        del gen_conf["max_tokens"]
    try:
        # Calculate new context size
        new_ctx_size = self._calculate_dynamic_ctx(history)

        # Prepare options with context size
        options = {
            "num_ctx": new_ctx_size
        }
        # Add other generation options
        if "temperature" in gen_conf:
            options["temperature"] = gen_conf["temperature"]
        if "max_tokens" in gen_conf:
            options["num_predict"] = gen_conf["max_tokens"]
        if "top_p" in gen_conf:
            options["top_p"] = gen_conf["top_p"]
        if "presence_penalty" in gen_conf:
            options["presence_penalty"] = gen_conf["presence_penalty"]
        if "frequency_penalty" in gen_conf:
            options["frequency_penalty"] = gen_conf["frequency_penalty"]

        # Make API call with dynamic context size
        response = self.client.chat(
            model=self.model_name,
            messages=history,
            options=options,
            keep_alive=60
        )
        return response["message"]["content"].strip(), response.get("eval_count", 0) + response.get("prompt_eval_count", 0)
    except Exception as e:
        return "ERROR: " + str(e), 0
```

## Benefits

1. Improved Performance: Uses appropriate context windows based on conversation length
2. Better Resource Utilization: Context window size scales with content
3. Maintained Compatibility: Works with existing business logic
4. Predictable Scaling: Context growth in 8192-token increments
5. Smart Updates: Context size updates are optimized to reduce unnecessary model reloads

## Future Considerations

1. Fine-tune buffer ratio based on usage patterns
2. Add monitoring for context window utilization
3. Consider language-specific token counting optimizations
4. Implement adaptive threshold based on conversation patterns
5. Add metrics for context size update frequency

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-03-28 12:38:27 +08:00
Kevin Hu	1fbc4870f0	Fix: HTTP API delete_chunks issue. (#6621 ) ### What problem does this PR solve? #6611 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-28 12:13:43 +08:00
AdySec	f304492716	Fix: binlog_expire_logs_seconds (#6626 ) This PR updates the MySQL container configuration by setting the parameter --binlog_expire_logs_seconds to 604800 seconds (7 days). This change ensures that MySQL automatically purges binary logs older than 7 days, helping to conserve disk space and maintain precise log management. ### What problem does this PR solve? _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2025-03-28 11:37:53 +08:00
balibabu	f35c226ce7	Feat: Add RadioGroup component #3221 (#6622 ) ### What problem does this PR solve? Feat: Add RadioGroup component #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-28 10:20:49 +08:00
donblack01	0b48a2e0d1	Fix: When Excel is a formula, the parsed result is a formula, but cannot be correctly parsed as a value type (#6613 ) ### What problem does this PR solve? Fix: When Excel is a formula, the parsed result is a formula, but cannot be correctly parsed as a value type ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: tangyu <1@1.com>	2025-03-28 09:33:49 +08:00
liu an	fd614a7aef	Test: Added test cases for Delete Chunks HTTP API (#6612 ) ### What problem does this PR solve? _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] add test cases	2025-03-28 09:33:23 +08:00
Kevin Hu	0758c04941	Refa: token similarity calculations. (#6614 ) ### What problem does this PR solve? #6507 ### Type of change - [x] Performance Improvement	2025-03-28 09:33:08 +08:00
Zhichang Yu	fe0396bbb9	Introduced delete_knowledge_graph (#6605 ) ### What problem does this PR solve? Introduced delete_knowledge_graph ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] Documentation Update	2025-03-27 17:16:48 +08:00
Xc1995	974a467cf6	Fix: The rule of Categorize operator is adjusted. (#6599 ) ### What problem does this PR solve? When using the Categorize operator, if the keywords to categorize appear repeatedly in the input, the operator cannot determine which keyword occurs most frequently; it simply returns every keyword that matches. This PR adjusts the categorize filter accordingly. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) - [x] Refactoring - [x] Performance Improvement	2025-03-27 17:02:21 +08:00
Zhichang Yu	36b62e0fab	EntityResolution batch. Close #6570 (#6602 ) ### What problem does this PR solve? EntityResolution batch ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-27 16:40:36 +08:00
Kevin Hu	d2043ff9f2	Fix: LmStudioChat issue. (#6591 ) ### What problem does this PR solve? #6577 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-27 14:59:15 +08:00
Kevin Hu	ecc9605a32	Fix: team doc deletion issue. (#6589 ) ### What problem does this PR solve? #6557 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-27 13:26:38 +08:00
balibabu	70dc56d26b	Feat: Add logo-with-text-white.svg #3221 (#6588 ) ### What problem does this PR solve? Feat: Add logo-with-text-white.svg #3221 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-27 12:28:17 +08:00
Zanyatta	82ccbd2cba	fix: Remove unnecessary minio initialization (#6544 ) ### What problem does this PR solve? Prevent applications from failing to start due to calling non-existent or incorrect Minio connection configurations when using file storage outside of Minio ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-03-27 09:54:25 +08:00
Zhichang Yu	c4998d0e09	Rename graphrag task lock (#6576 ) ### What problem does this PR solve? Rename graphrag task lock ### Type of change - [x] Refactoring	2025-03-26 23:48:47 +08:00
Fengbo Yuan	5eabfe3912	Update values.yaml image to infiniflow/infinity:v0.6.0-dev3 issue#5882 (#6568 ) related issue #5882 ### What problem does this PR solve? update helm infinity image version from v0.5.0 image to infiniflow/infinity:v0.6.0-dev3 to solve issue #5882 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2025-03-26 21:15:26 +08:00
Yongteng Lei	df3890827d	Refa: change LLM chat output from full to delta (incremental) (#6534 ) ### What problem does this PR solve? Change LLM chat output from full to delta (incremental) ### Type of change - [x] Refactoring	2025-03-26 19:33:14 +08:00
liu an	6599db1e99	Test: Update test cases for PR #6405 #6504 #6538 (#6565 ) ### What problem does this PR solve? PR #6405 #6504 #6538 ### Type of change - [x] update test cases	2025-03-26 19:23:13 +08:00
writinwaters	b7d7ad536a	AI search vs. chat (#6569 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-03-26 18:46:34 +08:00
andy	24d8ff7425	Fix:flow DB Assistant module translate to zh (#6562 ) ### What problem does this PR solve? Fix:flow DB Assistant module translate to zh ### Type of change - [ ] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [x] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2025-03-26 17:32:05 +08:00
Chenzy	735d9dd949	Feat: add "tools" to llm_factories.json (#6552 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Chenzy <chenzy901@gmail.com>	2025-03-26 17:31:18 +08:00
zstar	cc5f4a5efa	Fix: python_api_reference.md update dataset bug (#6527 ) ### What problem does this PR solve? There is a small bug in the update dataset example in this document. The return type of `rag_object.list_datasets` is a list, so the first item should be taken as the `ragflow_sdk.modules.dataset.DataSet` object before calling update. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-26 17:30:09 +08:00
liu an	93c26ae1ef	Test: Added test cases for Update Chunk HTTP API (#6556 ) ### What problem does this PR solve? cover [update chunk](https://ragflow.io/docs/v0.17.2/http_api_reference#update-chunk) endpoints ### Type of change - [x] add test cases	2025-03-26 16:47:47 +08:00
Kevin Hu	cc8029a732	Fix: uploading in chat box issue. (#6547 ) ### What problem does this PR solve? #6228 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-26 15:37:48 +08:00
Zhichang Yu	6bf26e2a81	Optimize graphrag again (#6513 ) ### What problem does this PR solve? Removed set_entity and set_relation to avoid accessing doc engine during graph computation. Introduced GraphChange to avoid writing unchanged chunks. ### Type of change - [x] Performance Improvement	2025-03-26 15:34:42 +08:00
Kevin Hu	7a677cb095	Fix: image_id is None. (#6538 ) ### What problem does this PR solve? #6499 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-26 12:04:21 +08:00
Kevin Hu	12ad746ee6	Fix: Bedrock model invocation error. (#6533 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-26 11:27:12 +08:00
Kevin Hu	163e71d06f	Fix: Hunyuan model adding error. (#6531 ) ### What problem does this PR solve? #6523 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-26 10:33:33 +08:00
Kevin Hu	c8c91fd827	Fix: link to KB from filemanager. (#6530 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-26 09:41:14 +08:00
writinwaters	d17970ebd0	0321 chunkmethods (#6520 ) ### What problem does this PR solve? #6061 ### Type of change - [x] Documentation Update	2025-03-26 09:03:18 +08:00
Kevin Hu	bf483fdf02	Fix: describe parameter error. (#6519 ) ### What problem does this PR solve? #6228 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-26 09:02:48 +08:00
Kevin Hu	b2b7ed8927	Fix: abnormal chunk id (#6506 ) ### What problem does this PR solve? #6500 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-25 19:03:29 +08:00
liu an	0a79dfd5cf	Test: Added test cases for List Chunks HTTP API (#6514 ) ### What problem does this PR solve? cover [list chunks](https://ragflow.io/docs/v0.17.2/http_api_reference#list-chunks) endpoints ### Type of change - [x] update test cases	2025-03-25 17:28:58 +08:00
Stephen Hu	1d73baf3d8	Feat: improve '/mv' '/list' API performance (#6502 ) ### What problem does this PR solve? 1. for /mv API use get by ids to avoid O(n) DB IO 2. for /list remove one useless call ### Type of change - [x] Performance Improvement	2025-03-25 16:30:25 +08:00
Kevin Hu	f3ae4a3bae	Fix: img_id error. (#6504 ) ### What problem does this PR solve? #6499 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-25 15:57:03 +08:00
liwenju0	814a210f5d	Fix: failed to acquire lock exception with retry mechanism for postgres and mysql (#6483 ) Added the with_retry decorator in db_models.py to add a retry mechanism for database operations. Applied the retry mechanism to the lock and unlock methods of the PostgresDatabaseLock and MysqlDatabaseLock classes to enhance the reliability of lock operations. ### What problem does this PR solve? resolve failed to acquire lock exception with retry mechanism ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: wenju.li <wenju.li@deepctr.cn>	2025-03-25 15:09:56 +08:00
Kevin Hu	60c3a253ad	Fix: api-key issue for xinference. (#6490 ) ### What problem does this PR solve? #2792 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-25 15:01:13 +08:00
Kevin Hu	384b6549a6	Fix: remove doc status checking while creating an assistant. (#6486 ) ### What problem does this PR solve? #6461 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-25 11:13:22 +08:00
科幻大脑	b2ec39c59d	Fix: Resolve FlowSetting not reading Title from .ts files (#6469 ) ### What problem does this PR solve? Fix: Resolve FlowSetting not reading Title from .ts files ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-25 11:07:29 +08:00
Kevin Hu	095fc84cf2	Fix: claude max tokens. (#6484 ) ### What problem does this PR solve? #6458 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-25 10:41:55 +08:00
Yongteng Lei	542cf16292	Feat: add project_id and project_name to Langfuse API (#6481 ) ### What problem does this PR solve? Enhance Langfuse API: add project_id and project_name ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-25 10:36:34 +08:00
liu an	27989eb9a5	Test: Add list chunk checkpoint for the add chunk API (#6482 ) ### What problem does this PR solve? Add list chunk checkpoint for the add chunk API ### Type of change - [x] update test cases	2025-03-25 10:36:21 +08:00
Graf2242	05997e8215	Remove thinking block from keyword node's result (#6474 ) ### What problem does this PR solve? Currently, if a thinking model (deepseek-r1:32b on an Ollama server, in my case) is used in the "Keyword" node, the result contains the entire <think> block, so the node returns more than just keywords. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [ ] New Feature (non-breaking change which adds functionality) - [ ] Documentation Update - [ ] Refactoring - [ ] Performance Improvement - [ ] Other (please describe):	2025-03-25 10:22:41 +08:00
Stephen Hu	5d9afce12d	Feat: improve the performance for '/upload' API (#6479 ) ### What problem does this PR solve? improve the logic to fetch parent folder, remove the useless DB IO logic ### Type of change - [x] Performance Improvement	2025-03-25 10:22:19 +08:00
Yongteng Lei	ee6a0bd9db	Refa: enhancement: enhance the prompt of related_question API (#6463 ) ### What problem does this PR solve? Enhance the prompt of `related_question` API. ### Type of change - [x] Enhancement - [x] Documentation Update	2025-03-25 10:00:10 +08:00
liu an	b6f3242c6c	Test: Update test cases to reduce execution time (#6470 ) ### What problem does this PR solve? _Briefly describe what this PR aims to solve. Include background context that will help reviewers understand the purpose of the PR._ ### Type of change - [x] update test cases	2025-03-25 09:17:05 +08:00
utopia2077	390086c6ab	Fix: split process bug in graphrag extract (#6423 ) ### What problem does this PR solve? 1. Missing completion delimiter. 2. Missing bracket processing. 3. The doc_ids returned by update_graph is a set, but the insert operation in extract_community needs a list. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-24 21:41:20 +08:00
writinwaters	a40c5aea83	Miscellaneous UI updates (#6471 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-03-24 19:36:47 +08:00
Stephen Hu	f691b4ddd2	Feat: Improve "/convert" API's performance (#6465 ) ### What problem does this PR solve? For batch requests, use get_by_ids to fetch all files first, replacing the O(n) IO logic. ### Type of change - [x] Performance Improvement	2025-03-24 19:08:22 +08:00
balibabu	3c57a9986c	Feat: Add LangfuseCard component. #6155 (#6468 ) ### What problem does this PR solve? Feat: Add LangfuseCard component. #6155 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-24 19:07:55 +08:00
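
As a worked illustration of the context sizing described in commit c61df5dd25, the following is a minimal standalone sketch of the same arithmetic. The constant names and the free function `calculate_dynamic_ctx` are assumptions made for illustration (the commit message defines a method named `_calculate_dynamic_ctx`); the character-based token estimate is the heuristic from the commit message, not a real tokenizer.

```python
BASE_CTX = 8192      # base context size stated in the commit message
BUFFER_RATIO = 1.2   # buffer ratio stated in the commit message
ROLE_OVERHEAD = 4    # per-message role overhead used in the commit message


def count_tokens(text):
    """Rough estimate: 1 per ASCII character, 2 per non-ASCII character."""
    return sum(1 if ord(ch) < 128 else 2 for ch in text)


def calculate_dynamic_ctx(history):
    """Return a context size that is a multiple of BASE_CTX (hypothetical helper)."""
    total = sum(
        count_tokens(message.get("content", "")) + ROLE_OVERHEAD
        for message in history
    )
    buffered = int(total * BUFFER_RATIO)
    if buffered <= BASE_CTX:
        return BASE_CTX
    return (buffered // BASE_CTX + 1) * BASE_CTX


short_history = [{"role": "user", "content": "hello"}]
long_history = [{"role": "user", "content": "a" * 10000}]

print(calculate_dynamic_ctx(short_history))  # 8192: (5 + 4) * 1.2 is far below the base
print(calculate_dynamic_ctx(long_history))   # 16384: int(10004 * 1.2) = 12004 needs two increments
```

One property of the formula as written: when the buffered count lands exactly on a multiple of 8192 above the base (for example 16384), the `// 8192 + 1` step still adds an increment, so the result is 24576 rather than 16384.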

2689 Commits