ragflow

mirror of https://git.mirrors.martin98.com/https://github.com/infiniflow/ragflow.git synced 2025-07-05 23:25:09 +08:00

Author	SHA1	Message	Date
Yongteng Lei	a008b38cf5	Fix: local variable referenced before assignment (#6909 ) ### What problem does this PR solve? Fix: local variable referenced before assignment. #6803 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-04-09 20:29:12 +08:00
Yongteng Lei	dc2c74b249	Feat: add primitive support for function calls (#6840 ) ### What problem does this PR solve? This PR introduces primitive support for function calls, enabling the system to handle basic function call capabilities. However, this feature is currently experimental and not yet enabled for general use, as it is only supported by a subset of models, namely, Qwen and OpenAI models. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-04-08 16:09:03 +08:00
Zhichang Yu	e7a2a4b7ff	Log llm response on exception (#6750 ) ### What problem does this PR solve? Log llm response on exception ### Type of change - [x] Refactoring	2025-04-02 17:10:57 +08:00
Alex Chen	46b5e32cd7	Feat: support vision llm for gpustack (#6636 ) ### What problem does this PR solve? https://github.com/infiniflow/ragflow/issues/6138 This PR adds vision LLM support for GPUStack and changes the URL path from `/v1-openai` to `/v1`. ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-03-31 15:33:52 +08:00
Marcus Yuan	c61df5dd25	Dynamic Context Window Size for Ollama Chat (#6582 )

# Dynamic Context Window Size for Ollama Chat

## Problem Statement

Previously, the Ollama chat implementation used a fixed context window size of 32768 tokens. This caused two main issues:

1. Performance degradation due to unnecessarily large context windows for small conversations
2. Potential business logic failures when using smaller fixed sizes (e.g., 2048 tokens)

## Solution

Implemented a dynamic context window size calculation that:

1. Uses a base context size of 8192 tokens
2. Applies a 1.2x buffer ratio to the total token count
3. Adds multiples of 8192 tokens based on the buffered token count
4. Implements a smart context size update strategy

## Implementation Details

### Token Counting Logic

```python
def count_tokens(text):
    """Calculate token count for text"""
    # Simple calculation: 1 token per ASCII character
    # 2 tokens for non-ASCII characters (Chinese, Japanese, Korean, etc.)
    total = 0
    for char in text:
        if ord(char) < 128:  # ASCII characters
            total += 1
        else:  # Non-ASCII characters
            total += 2
    return total
```

### Dynamic Context Calculation

```python
def _calculate_dynamic_ctx(self, history):
    """Calculate dynamic context window size"""
    # Calculate total tokens for all messages
    total_tokens = 0
    for message in history:
        content = message.get("content", "")
        content_tokens = count_tokens(content)
        role_tokens = 4  # Role marker token overhead
        total_tokens += content_tokens + role_tokens

    # Apply 1.2x buffer ratio
    total_tokens_with_buffer = int(total_tokens * 1.2)

    # Calculate context size in multiples of 8192
    if total_tokens_with_buffer <= 8192:
        ctx_size = 8192
    else:
        ctx_multiplier = (total_tokens_with_buffer // 8192) + 1
        ctx_size = ctx_multiplier * 8192

    return ctx_size
```

### Integration in Chat Method

```python
def chat(self, system, history, gen_conf):
    if system:
        history.insert(0, {"role": "system", "content": system})
    if "max_tokens" in gen_conf:
        del gen_conf["max_tokens"]
    try:
        # Calculate new context size
        new_ctx_size = self._calculate_dynamic_ctx(history)

        # Prepare options with context size
        options = {
            "num_ctx": new_ctx_size
        }
        # Add other generation options
        if "temperature" in gen_conf:
            options["temperature"] = gen_conf["temperature"]
        if "max_tokens" in gen_conf:
            options["num_predict"] = gen_conf["max_tokens"]
        if "top_p" in gen_conf:
            options["top_p"] = gen_conf["top_p"]
        if "presence_penalty" in gen_conf:
            options["presence_penalty"] = gen_conf["presence_penalty"]
        if "frequency_penalty" in gen_conf:
            options["frequency_penalty"] = gen_conf["frequency_penalty"]

        # Make API call with dynamic context size
        response = self.client.chat(
            model=self.model_name,
            messages=history,
            options=options,
            keep_alive=60
        )
        return response["message"]["content"].strip(), response.get("eval_count", 0) + response.get("prompt_eval_count", 0)
    except Exception as e:
        return "ERROR: " + str(e), 0
```

## Benefits

1. Improved Performance: Uses appropriate context windows based on conversation length
2. Better Resource Utilization: Context window size scales with content
3. Maintained Compatibility: Works with existing business logic
4. Predictable Scaling: Context growth in 8192-token increments
5. Smart Updates: Context size updates are optimized to reduce unnecessary model reloads

## Future Considerations

1. Fine-tune buffer ratio based on usage patterns
2. Add monitoring for context window utilization
3. Consider language-specific token counting optimizations
4. Implement adaptive threshold based on conversation patterns
5. Add metrics for context size update frequency

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-03-28 12:38:27 +08:00
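
To make the sizing rule in commit c61df5dd25 concrete, here is a minimal standalone sketch of the same arithmetic. The free function `dynamic_ctx` and its keyword parameters are hypothetical names introduced for illustration; the commit itself describes a method, `_calculate_dynamic_ctx`, on the chat class.

```python
def count_tokens(text):
    """Rough estimate from the commit message: 1 per ASCII char, 2 otherwise."""
    return sum(1 if ord(ch) < 128 else 2 for ch in text)


def dynamic_ctx(history, base=8192, buffer_ratio=1.2, role_overhead=4):
    """Hypothetical standalone version of the sizing rule (not the project's API)."""
    total = sum(count_tokens(m.get("content", "")) + role_overhead for m in history)
    buffered = int(total * buffer_ratio)
    if buffered <= base:
        return base
    # Round up to the next multiple of `base`, as in the commit message.
    return (buffered // base + 1) * base


short_chat = [{"role": "user", "content": "hello"}]
long_chat = [{"role": "user", "content": "a" * 10000}]

print(dynamic_ctx(short_chat))  # (5 + 4) * 1.2 -> 10, below the base, so 8192
print(dynamic_ctx(long_chat))   # (10000 + 4) * 1.2 -> 12004, so 2 * 8192 = 16384
```

Note that with this rule a buffered count that is already an exact multiple of 8192 still moves up one more step, because the multiplier is always the integer quotient plus one.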
Kevin Hu	d2043ff9f2	Fix: LmStudioChat issue. (#6591 ) ### What problem does this PR solve? #6577 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-27 14:59:15 +08:00
Yongteng Lei	df3890827d	Refa: change LLM chat output from full to delta (incremental) (#6534 ) ### What problem does this PR solve? Change LLM chat output from full to delta (incremental) ### Type of change - [x] Refactoring	2025-03-26 19:33:14 +08:00
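
The difference between "full" and "delta" streaming output mentioned in df3890827d can be illustrated with a small sketch. It assumes a stream that yields the whole accumulated answer on every step; `to_deltas` is a hypothetical helper written for this illustration, not a function from the repository.

```python
def to_deltas(cumulative_chunks):
    """Turn a stream of growing full answers into incremental pieces."""
    seen = ""
    for full in cumulative_chunks:
        # Emit only the text that was not part of the previous chunk.
        yield full[len(seen):]
        seen = full


full_stream = ["He", "Hello", "Hello!"]
print(list(to_deltas(full_stream)))  # ['He', 'llo', '!']
```

A consumer of delta output concatenates the pieces itself, so each message carries only new text instead of the whole answer so far.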
Kevin Hu	12ad746ee6	Fix: Bedrock model invocation error. (#6533 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-26 11:27:12 +08:00
Kevin Hu	095fc84cf2	Fix: claude max tokens. (#6484 ) ### What problem does this PR solve? #6458 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-25 10:41:55 +08:00
Kevin Hu	85eb3775d6	Refa: update Anthropic models. (#6445 ) ### What problem does this PR solve? #6421 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-24 12:34:57 +08:00
fansir	efc4796f01	Fix ratelimit errors during document parsing (#6413 ) ### What problem does this PR solve? When an online large-model API was used to extract knowledge graphs from a knowledge base, frequent Rate Limit Errors were triggered, causing document parsing to fail. This commit fixes the issue by adding exponential backoff and jitter to the API call, which reduces the frequency of Rate Limit Errors. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-22 23:07:03 +08:00
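
Exponential backoff with jitter, as named in efc4796f01, can be sketched as follows. This is a generic illustration under stated assumptions, not the code from the PR: `RateLimitError`, `call_with_backoff`, and all parameter names and default values are hypothetical.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit exception."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Retry fn() on RateLimitError, waiting a random time that grows per attempt."""
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries, surface the error to the caller
            # The cap doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # "Full jitter": sleep a uniform random time in [0, cap].
            sleep(random.uniform(0, cap))
```

The random component spreads retries from concurrent workers over time, so they do not all hit the API again at the same instant.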
Kevin Hu	a2a4bfe3e3	Fix: change ollama default num_ctx. (#6395 ) ### What problem does this PR solve? #6163 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-21 16:22:03 +08:00
Kevin Hu	e9a6675c40	Fix: enable ollama api-key. (#6205 ) ### What problem does this PR solve? #6189 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-18 13:37:34 +08:00
Kevin Hu	7e4d693054	Fix: in case response.choices[0].message.content is None. (#6190 ) ### What problem does this PR solve? #6164 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-03-18 10:00:27 +08:00
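
The case handled by 7e4d693054, where `response.choices[0].message.content` is `None`, can be guarded as in the sketch below. The helper `extract_content` is hypothetical, and `SimpleNamespace` objects stand in for a real client response so that the example runs without any SDK installed.

```python
from types import SimpleNamespace


def extract_content(response):
    """Return the message text, or an empty string when content is None."""
    content = response.choices[0].message.content
    return (content or "").strip()


# Stand-in response object with the same attribute path as in the commit title.
empty = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content=None))])
normal = SimpleNamespace(choices=[SimpleNamespace(message=SimpleNamespace(content=" hi "))])

print(repr(extract_content(empty)))   # ''
print(repr(extract_content(normal)))  # 'hi'
```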
writinwaters	9c8060f619	0.17.1 release notes (#6021 ) ### What problem does this PR solve? ### Type of change - [x] Documentation Update	2025-03-13 14:43:24 +08:00
Kevin Hu	3571270191	Refa: refine the context window size warning. (#5993 ) ### What problem does this PR solve? ### Type of change - [x] Refactoring	2025-03-12 19:40:54 +08:00
kuro5989	6e13922bdc	Feat: Add qwq model support to Tongyi-Qianwen factory (#5981 ) ### What problem does this PR solve? add qwq model support to Tongyi-Qianwen factory https://github.com/infiniflow/ragflow/issues/5869 ### Type of change - [x] New Feature (non-breaking change which adds functionality) ![image](https://github.com/user-attachments/assets/49f5c6a0-ecaf-41dd-a23a-2009f854d62c) ![image](https://github.com/user-attachments/assets/93ffa303-920e-4942-8188-bcd6b7209204) ![1741774779438](https://github.com/user-attachments/assets/25f2fd1d-8640-4df0-9a08-78ee9daaa8fe) ![image](https://github.com/user-attachments/assets/4763cf6c-1f76-43c4-80ee-74dfd666a184) Co-authored-by: zhaozhicheng <zhicheng.zhao@fastonetech.com>	2025-03-12 18:54:15 +08:00
Kevin Hu	251ba7f058	Refa: remove max tokens since no one needs it. (#5690 ) ### What problem does this PR solve? #5646 #5640 ### Type of change - [x] Refactoring	2025-03-06 11:29:40 +08:00
Kevin Hu	955801db2e	Resolve super class invocation error. (#5337 ) ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-02-25 17:42:29 +08:00
Kevin Hu	daddfc9e1b	Remove dup gb2312, solve corrupt error. (#5326 ) ### What problem does this PR solve? #5252 #5325 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-02-25 12:22:37 +08:00
Kevin Hu	df3d0f61bd	Fix base url missing for deepseek from Tongyi. (#5294 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-02-24 15:43:32 +08:00
Kevin Hu	ec96426c00	Tongyi adapts deepseek. (#5285 ) ### What problem does this PR solve? ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-02-24 14:04:25 +08:00
Omar Leonardo Sanchez Granados	4f2816c01c	Add support to boto3 default connection (#5246 ) ### What problem does this PR solve? This pull request includes changes to the initialization logic of the `ChatModel` and `EmbeddingModel` classes to enhance the handling of AWS credentials. Use cases: - Use env variables for credentials instead of managing them on the DB - Easy connection when deploying on an AWS machine ### Type of change - [X] New Feature (non-breaking change which adds functionality)	2025-02-24 11:01:14 +08:00
yrk111222	7ce675030b	Support downloading models from ModelScope Community. (#5073 ) This PR supports downloading models from ModelScope. The main modifications are as follows: - New Feature (non-breaking change which adds functionality) - Documentation Update --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2025-02-24 10:12:20 +08:00
Kevin Hu	1a755e75c5	Remove v1 (#5220 ) ### What problem does this PR solve? #5201 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-02-21 15:15:38 +08:00
saikidev	d2929e432e	Feat: add LLM provider PPIO (#5013 ) ### What problem does this PR solve? Add a LLM provider: PPIO ### Type of change - [x] New Feature (non-breaking change which adds functionality) - [x] Documentation Update	2025-02-17 12:03:26 +08:00
Kevin Hu	b08bb56f6c	Display thinking for deepseek r1 (#4904 ) ### What problem does this PR solve? #4903 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-02-12 15:43:13 +08:00
Kevin Hu	2aa0cdde8f	Fix Gemini chat issue. (#4757 ) ### What problem does this PR solve? #4753 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-02-07 12:00:19 +08:00
Kyle	036f37a627	fix: err object has no attribute 'iter_lines' (#4686 ) ### What problem does this PR solve? ERROR: 'Stream' object has no attribute 'iter_lines' with reference to Claude/Anthropic chat streams ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: Kyle Olmstead <k.olmstead@offensive-security.com>	2025-02-01 22:39:30 +08:00
Kevin Hu	4776fa5e4e	Refactor for total_tokens. (#4652 ) ### What problem does this PR solve? #4567 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2025-01-26 13:54:26 +08:00
Kevin Hu	dd0ebbea35	Light GraphRAG (#4585 ) ### What problem does this PR solve? #4543 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2025-01-22 19:43:14 +08:00
Alex Chen	7944aacafa	Feat: add gpustack model provider (#4469 ) ### What problem does this PR solve? Add GPUStack as a new model provider. [GPUStack](https://github.com/gpustack/gpustack) is an open-source GPU cluster manager for running LLMs. Currently, locally deployed models in GPUStack cannot integrate well with RAGFlow. GPUStack provides both OpenAI compatible APIs (Models / Chat Completions / Embeddings / Speech2Text / TTS) and other APIs like Rerank. We would like to use GPUStack as a model provider in ragflow. [GPUStack Docs](https://docs.gpustack.ai/latest/quickstart/) Related issue: https://github.com/infiniflow/ragflow/issues/4064. ### Type of change - [x] New Feature (non-breaking change which adds functionality) ### Testing Instructions 1. Install GPUStack and deploy the `llama-3.2-1b-instruct` llm, `bge-m3` text embedding model, `bge-reranker-v2-m3` rerank model, `faster-whisper-medium` Speech-to-Text model, `cosyvoice-300m-sft` in GPUStack. 2. Add provider in ragflow settings. 3. Testing in ragflow.	2025-01-15 14:15:58 +08:00
Yingfeng	50f209204e	Synchronize with enterprise version (#4325 ) ### Type of change - [x] Refactoring	2025-01-02 13:44:44 +08:00
Zhichang Yu	0d68a6cd1b	Fix errors detected by Ruff (#3918 ) ### What problem does this PR solve? Fix errors detected by Ruff ### Type of change - [x] Refactoring	2024-12-08 14:21:12 +08:00
Kevin Hu	593ffc4067	Fix HuggingFace model error. (#3870 ) ### What problem does this PR solve? #3865 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-12-05 13:28:42 +08:00
Jin Hai	6657ca7cde	Change default error message to English (#3838 ) ### What problem does this PR solve? As title ### Type of change - [x] Refactoring --------- Signed-off-by: Jin Hai <haijin.chn@gmail.com>	2024-12-04 09:34:49 +08:00
Zhichang Yu	d94386e00a	Pass top_p to ollama (#3744 ) ### What problem does this PR solve? Pass top_p to ollama. Close #1769 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-11-29 14:52:27 +08:00
Kevin Hu	0891a393d7	Let ThreadPool exit gracefully. (#3653 ) ### What problem does this PR solve? #3646 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-11-26 16:31:07 +08:00
Kevin Hu	81c7b6afc5	Make spark model more robust to model name (#3514 ) ### What problem does this PR solve? ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-11-20 20:53:44 +08:00
shijiefengjun	632b23486f	Fix the value issue of anthropic (#3351 ) ### What problem does this PR solve? This pull request fixes the issue mentioned in https://github.com/infiniflow/ragflow/issues/3263. 1. The response should be parsed as a dict so that the following lookup no longer fails: ans = response["content"][0]["text"] 2. The API model ```claude-instant-1.2``` has been retired (see [model-deprecations](https://docs.anthropic.com/en/docs/resources/model-deprecations)) and triggers errors in the code, so it was removed from the conf/llm_factories.json file and the latest API model ```claude-3-5-sonnet-20241022``` was added. ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: chenhaodong <chenhaodong@ctrlvideo.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-11-13 16:13:52 +08:00
Kevin Hu	34d1daac67	fix: Anthropic param error (#3327 ) ### What problem does this PR solve? #3263 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-11-11 11:54:14 +08:00
Kevin Hu	7e0148c058	fix local variable ans (#3077 ) ### What problem does this PR solve? #3064 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-10-29 10:42:45 +08:00
Kevin Hu	f86826b7a0	refactor error message of qwen (#3074 ) ### What problem does this PR solve? #3055 ### Type of change - [x] Refactoring	2024-10-29 10:08:08 +08:00
Kevin Hu	9457d20ef1	make gemini robust (#3012 ) ### What problem does this PR solve? #3003 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-10-25 10:50:44 +08:00
Yinquan WANG	445dce4363	[Bug]: unnecessary auto-increment calculations in the tokens statistics of the chat model (#2969 ) ### What problem does this PR solve? The details are given in https://github.com/infiniflow/ragflow/issues/2968 ### Type of change - [X] Bug Fix (non-breaking change which fixes an issue) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-10-22 16:26:04 +08:00
Yinquan WANG	5aa9d7787e	[Bug]: When using the OpenAI chat model, raises ERROR: 'CompletionUsage' object has no attribute 'get' #2948 (#2949 ) ### What problem does this PR solve? The details of this PR are given at https://github.com/infiniflow/ragflow/issues/2948 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-10-22 11:40:05 +08:00
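
The error quoted in 5aa9d7787e arises when a usage object that exposes attributes is treated like a dict. A minimal sketch of a tolerant reader follows; `total_tokens_of` is a hypothetical helper, and `SimpleNamespace` stands in for the SDK's usage object, so this is an illustration rather than the fix applied in the PR.

```python
from types import SimpleNamespace


def total_tokens_of(usage):
    """Read total_tokens from either a dict or an attribute-style object."""
    if usage is None:
        return 0
    if isinstance(usage, dict):
        return usage.get("total_tokens", 0)
    # Attribute-style objects have no .get(), so use getattr instead.
    return getattr(usage, "total_tokens", 0)


print(total_tokens_of({"total_tokens": 12}))                # 12
print(total_tokens_of(SimpleNamespace(total_tokens=34)))    # 34
```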
Kevin Hu	b2524eec49	fix sequence2txt error and usage total token issue (#2961 ) ### What problem does this PR solve? #1363 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue)	2024-10-22 11:38:37 +08:00
chongchuanbing	ac26d09a59	Feature/feat1017 (#2872 ) ### What problem does this PR solve? 1. fix: mind map display error in the knowledge graph, caused by the ```@antv/g6``` version change 2. feat: concurrent threads configuration support in graph extractor 3. fix: used tokens update failed for tenant 4. feat: timeout configuration support for llm 5. fix: regex error in graph extractor 6. feat: qwen rerank (```gte-rerank```) support 7. fix: timeout handling in the knowledge graph index process. Chat now uses stream output, and this is configurable. 8. feat: ```qwen-long``` model configuration ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: chongchuanbing <chongchuanbing@gmail.com> Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-10-21 12:11:08 +08:00
JobSmithManipulation	3f065c75da	support chat model in huggingface (#2802 ) ### What problem does this PR solve? #2794 ### Type of change - [x] New Feature (non-breaking change which adds functionality)	2024-10-11 14:45:48 +08:00
JobSmithManipulation	18f80743eb	support api-version and change default-model in adding azure-openai and openai (#2799 ) ### What problem does this PR solve? #2701 #2712 #2749 ### Type of change - [x] Bug Fix (non-breaking change which fixes an issue) - [x] New Feature (non-breaking change which adds functionality) --------- Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>	2024-10-11 11:26:42 +08:00

144 Commits