# Dynamic Context Window Size for Ollama Chat
## Problem Statement
Previously, the Ollama chat implementation used a fixed context window
size of 32768 tokens. This caused two main issues:
1. Performance degradation due to unnecessarily large context windows
for small conversations
2. Potential business logic failures when using smaller fixed sizes
(e.g., 2048 tokens)
## Solution
Implemented a dynamic context window size calculation that:
1. Uses a base context size of 8192 tokens
2. Applies a 1.2x buffer ratio to the total token count
3. Adds multiples of 8192 tokens based on the buffered token count
4. Implements a smart context size update strategy
## Implementation Details
### Token Counting Logic
```python
def count_tokens(text):
"""Calculate token count for text"""
# Simple calculation: 1 token per ASCII character
# 2 tokens for non-ASCII characters (Chinese, Japanese, Korean, etc.)
total = 0
for char in text:
if ord(char) < 128: # ASCII characters
total += 1
else: # Non-ASCII characters
total += 2
return total
```
### Dynamic Context Calculation
```python
def _calculate_dynamic_ctx(self, history):
"""Calculate dynamic context window size"""
# Calculate total tokens for all messages
total_tokens = 0
for message in history:
content = message.get("content", "")
content_tokens = count_tokens(content)
role_tokens = 4 # Role marker token overhead
total_tokens += content_tokens + role_tokens
# Apply 1.2x buffer ratio
total_tokens_with_buffer = int(total_tokens * 1.2)
# Calculate context size in multiples of 8192
if total_tokens_with_buffer <= 8192:
ctx_size = 8192
else:
ctx_multiplier = (total_tokens_with_buffer // 8192) + 1
ctx_size = ctx_multiplier * 8192
return ctx_size
```
### Integration in Chat Method
```python
def chat(self, system, history, gen_conf):
if system:
history.insert(0, {"role": "system", "content": system})
if "max_tokens" in gen_conf:
del gen_conf["max_tokens"]
try:
# Calculate new context size
new_ctx_size = self._calculate_dynamic_ctx(history)
# Prepare options with context size
options = {
"num_ctx": new_ctx_size
}
# Add other generation options
if "temperature" in gen_conf:
options["temperature"] = gen_conf["temperature"]
if "max_tokens" in gen_conf:
options["num_predict"] = gen_conf["max_tokens"]
if "top_p" in gen_conf:
options["top_p"] = gen_conf["top_p"]
if "presence_penalty" in gen_conf:
options["presence_penalty"] = gen_conf["presence_penalty"]
if "frequency_penalty" in gen_conf:
options["frequency_penalty"] = gen_conf["frequency_penalty"]
# Make API call with dynamic context size
response = self.client.chat(
model=self.model_name,
messages=history,
options=options,
keep_alive=60
)
return response["message"]["content"].strip(), response.get("eval_count", 0) + response.get("prompt_eval_count", 0)
except Exception as e:
return "**ERROR**: " + str(e), 0
```
## Benefits
1. **Improved Performance**: Uses appropriate context windows based on
conversation length
2. **Better Resource Utilization**: Context window size scales with
content
3. **Maintained Compatibility**: Works with existing business logic
4. **Predictable Scaling**: Context growth in 8192-token increments
5. **Smart Updates**: Context size updates are optimized to reduce
unnecessary model reloads
## Future Considerations
1. Fine-tune buffer ratio based on usage patterns
2. Add monitoring for context window utilization
3. Consider language-specific token counting optimizations
4. Implement adaptive threshold based on conversation patterns
5. Add metrics for context size update frequency
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
This PR updates the MySQL container configuration by setting the
parameter --binlog_expire_logs_seconds to 604800 seconds (7 days). This
change ensures that MySQL automatically purges binary logs older than 7
days, helping to conserve disk space and maintain precise log
management.
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Feat: Add RadioGroup component #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Fix: When Excel is a formula, the parsed result is a formula, but cannot
be correctly parsed as a value type
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: tangyu <1@1.com>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] add test cases
### What problem does this PR solve?
Introduced delete_knowledge_graph
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] Documentation Update
### What problem does this PR solve?
When I use the categorization operator, I find that if the keyword I
want to Categorize appears repeatedly in the input, then I cannot judge
the word that appears most frequently. Instead, I simply get the word
that matches and return all the ones that have made the following
changes to the categorize filter.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Add logo-with-text-white.svg #3221
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Prevent applications from failing to start due to calling non-existent
or incorrect Minio connection configurations when using file storage
outside of Minio
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
related issue #5882
### What problem does this PR solve?
update helm infinity image version from v0.5.0
image to infiniflow/infinity:v0.6.0-dev3
to solve issue #5882
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix:flow DB Assistant module translate to zh
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: Chenzy <chenzy901@gmail.com>
### What problem does this PR solve?
There is a small bug in the update dataset of this document. The return
type of rag_oobject.list_datasets is a list type, and the first item
should be taken as' ragflow_stdk.modules.dataset ' DataSet`, Adapt to
the update.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Removed set_entity and set_relation to avoid accessing doc engine during
graph computation.
Introduced GraphChange to avoid writing unchanged chunks.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
1. for /mv API use get by ids to avoid O(n) DB IO
2. for /list remove one useless call
### Type of change
- [x] Performance Improvement
Added the with_retry decorator in db_models.py to add a retry mechanism
for database operations. Applied the retry mechanism to the lock and
unlock methods of the PostgresDatabaseLock and MysqlDatabaseLock classes
to enhance the reliability of lock operations.
### What problem does this PR solve?
resolve failed to acquire lock exception with retry mechanism
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: wenju.li <wenju.li@deepctr.cn>
### What problem does this PR solve?
Fix: Resolve FlowSetting not reading Title from .ts files
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Enhance Langfuse API: add project_id and project_name
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
For now, if you use thinking model (deepseek-r1:32b with ollama server
in my case) in "Keyword" node, result contains all <think> block and so
node return not only keywords
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
improve the logic to fetch parent folder, remove the useless DB IO logic
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] update test cases
### What problem does this PR solve?
1. miss completion delimiter.
2. miss bracket process.
3. doc_ids return by update_graph is a set, and insert operation in
extract_community need a list.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
for batch requests based on get_by_ids to fetch all files first replace
the O(n) IO logic.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
Feat: Add LangfuseCard component. #6155
### Type of change
- [x] New Feature (non-breaking change which adds functionality)