2689 Commits

Author SHA1 Message Date
Marcus Yuan
c61df5dd25
Dynamic Context Window Size for Ollama Chat (#6582)
# Dynamic Context Window Size for Ollama Chat

## Problem Statement
Previously, the Ollama chat implementation used a fixed context window
size of 32768 tokens. This caused two main issues:
1. Performance degradation due to unnecessarily large context windows
for small conversations
2. Potential business logic failures when using smaller fixed sizes
(e.g., 2048 tokens)

## Solution
Implemented a dynamic context window size calculation that:
1. Uses a base context size of 8192 tokens
2. Applies a 1.2x buffer ratio to the total token count
3. Adds multiples of 8192 tokens based on the buffered token count
4. Implements a smart context size update strategy

## Implementation Details

### Token Counting Logic
```python
def count_tokens(text):
    """Calculate token count for text"""
    # Simple calculation: 1 token per ASCII character
    # 2 tokens for non-ASCII characters (Chinese, Japanese, Korean, etc.)
    total = 0
    for char in text:
        if ord(char) < 128:  # ASCII characters
            total += 1
        else:  # Non-ASCII characters
            total += 2
    return total
```
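
As a quick check of the heuristic above, the following sketch repeats `count_tokens` in compact form so it runs on its own and applies it to a few strings. The sample strings are illustrative and not from the PR, and the counts are rough estimates rather than real tokenizer output.

```python
def count_tokens(text):
    """Estimate tokens: 1 per ASCII character, 2 per non-ASCII character."""
    return sum(1 if ord(char) < 128 else 2 for char in text)


# Illustrative inputs (not from the PR)
ascii_count = count_tokens("hello")    # 5 ASCII characters -> 5
cjk_count = count_tokens("你好")        # 2 non-ASCII characters -> 4
mixed_count = count_tokens("hi 你好")   # 3 ASCII + 2 non-ASCII -> 7
print(ascii_count, cjk_count, mixed_count)
```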

### Dynamic Context Calculation
```python
def _calculate_dynamic_ctx(self, history):
    """Calculate dynamic context window size"""
    # Calculate total tokens for all messages
    total_tokens = 0
    for message in history:
        content = message.get("content", "")
        content_tokens = count_tokens(content)
        role_tokens = 4  # Role marker token overhead
        total_tokens += content_tokens + role_tokens

    # Apply 1.2x buffer ratio
    total_tokens_with_buffer = int(total_tokens * 1.2)
    
    # Calculate context size in multiples of 8192
    if total_tokens_with_buffer <= 8192:
        ctx_size = 8192
    else:
        ctx_multiplier = (total_tokens_with_buffer // 8192) + 1
        ctx_size = ctx_multiplier * 8192
    
    return ctx_size
```
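
To make the rounding concrete, here is a standalone sketch of the same arithmetic applied to a raw token total. The helper name `ctx_size_for` and the sample totals are assumptions for illustration only.

```python
BASE_CTX = 8192      # base context size from the PR
BUFFER_RATIO = 1.2   # buffer ratio from the PR


def ctx_size_for(total_tokens):
    """Hypothetical helper mirroring _calculate_dynamic_ctx for a raw token total."""
    buffered = int(total_tokens * BUFFER_RATIO)
    if buffered <= BASE_CTX:
        return BASE_CTX
    return (buffered // BASE_CTX + 1) * BASE_CTX


small = ctx_size_for(104)      # buffered 124 -> stays at the 8192 base
medium = ctx_size_for(10_004)  # buffered 12004 -> 2 * 8192 = 16384
exact = ctx_size_for(13_654)   # buffered exactly 16384 -> 3 * 8192 = 24576
print(small, medium, exact)
```

Note that a buffered count that is an exact multiple of 8192 above the base still moves up one more step, as the last case shows.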

### Integration in Chat Method
```python
def chat(self, system, history, gen_conf):
    if system:
        history.insert(0, {"role": "system", "content": system})
    if "max_tokens" in gen_conf:
        del gen_conf["max_tokens"]
    try:
        # Calculate new context size
        new_ctx_size = self._calculate_dynamic_ctx(history)
        
        # Prepare options with context size
        options = {
            "num_ctx": new_ctx_size
        }
        # Add other generation options
        if "temperature" in gen_conf:
            options["temperature"] = gen_conf["temperature"]
        # NOTE: "max_tokens" is deleted from gen_conf above, so this branch never runs
        if "max_tokens" in gen_conf:
            options["num_predict"] = gen_conf["max_tokens"]
        if "top_p" in gen_conf:
            options["top_p"] = gen_conf["top_p"]
        if "presence_penalty" in gen_conf:
            options["presence_penalty"] = gen_conf["presence_penalty"]
        if "frequency_penalty" in gen_conf:
            options["frequency_penalty"] = gen_conf["frequency_penalty"]
            
        # Make API call with dynamic context size
        response = self.client.chat(
            model=self.model_name,
            messages=history,
            options=options,
            keep_alive=60
        )
        return response["message"]["content"].strip(), response.get("eval_count", 0) + response.get("prompt_eval_count", 0)
    except Exception as e:
        return "**ERROR**: " + str(e), 0
```
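
The option handling in `chat` could also be written as a key mapping. This is a sketch under assumptions, not the code from the PR: `build_options` and `OPTION_KEYS` are hypothetical names, the mapping covers only the keys shown above, and unlike the method above (which deletes `max_tokens` first) it forwards `max_tokens` as `num_predict`.

```python
# Hypothetical mapping from gen_conf keys to Ollama option names
OPTION_KEYS = {
    "temperature": "temperature",
    "max_tokens": "num_predict",
    "top_p": "top_p",
    "presence_penalty": "presence_penalty",
    "frequency_penalty": "frequency_penalty",
}


def build_options(gen_conf, num_ctx):
    """Build the options dict without mutating gen_conf."""
    options = {"num_ctx": num_ctx}
    for conf_key, option_key in OPTION_KEYS.items():
        if conf_key in gen_conf:
            options[option_key] = gen_conf[conf_key]
    return options


options = build_options({"temperature": 0.1, "max_tokens": 256}, num_ctx=8192)
print(options)
```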

## Benefits
1. **Improved Performance**: Uses appropriate context windows based on
conversation length
2. **Better Resource Utilization**: Context window size scales with
content
3. **Maintained Compatibility**: Works with existing business logic
4. **Predictable Scaling**: Context growth in 8192-token increments
5. **Smart Updates**: Context size updates are optimized to reduce
unnecessary model reloads

## Future Considerations
1. Fine-tune buffer ratio based on usage patterns
2. Add monitoring for context window utilization
3. Consider language-specific token counting optimizations
4. Implement adaptive threshold based on conversation patterns
5. Add metrics for context size update frequency

---------

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-28 12:38:27 +08:00
Kevin Hu
1fbc4870f0
Fix: HTTP API delete_chunks issue. (#6621)
### What problem does this PR solve?

#6611

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-28 12:13:43 +08:00
AdySec
f304492716
Fix: binlog_expire_logs_seconds (#6626)
This PR updates the MySQL container configuration by setting the
parameter --binlog_expire_logs_seconds to 604800 seconds (7 days). This
change ensures that MySQL automatically purges binary logs older than 7
days, helping to conserve disk space and maintain precise log
management.
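
As a sanity check on the value, 7 days converts to 604800 seconds. The variable name below simply mirrors the MySQL parameter; the snippet is illustrative and not part of the container configuration.

```python
from datetime import timedelta

# 7 days expressed in seconds, the unit binlog_expire_logs_seconds expects
binlog_expire_logs_seconds = int(timedelta(days=7).total_seconds())
print(binlog_expire_logs_seconds)  # 604800
```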

### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-28 11:37:53 +08:00
balibabu
f35c226ce7
Feat: Add RadioGroup component #3221 (#6622)
### What problem does this PR solve?

Feat: Add RadioGroup component #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-03-28 10:20:49 +08:00
donblack01
0b48a2e0d1
Fix: When Excel is a formula, the parsed result is a formula, but cannot be correctly parsed as a value type (#6613)
### What problem does this PR solve?

Fix: When an Excel cell contains a formula, the parsed result is the
formula itself rather than the computed value.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: tangyu <1@1.com>
2025-03-28 09:33:49 +08:00
liu an
fd614a7aef
Test: Added test cases for Delete Chunks HTTP API (#6612)
### What problem does this PR solve?


### Type of change

- [x] add test cases
2025-03-28 09:33:23 +08:00
Kevin Hu
0758c04941
Refa: token similarity calculations. (#6614)
### What problem does this PR solve?

#6507

### Type of change

- [x] Performance Improvement
2025-03-28 09:33:08 +08:00
Zhichang Yu
fe0396bbb9
Introduced delete_knowledge_graph (#6605)
### What problem does this PR solve?

Introduced delete_knowledge_graph

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] Documentation Update
2025-03-27 17:16:48 +08:00
Xc1995
974a467cf6
Fix: The rule of Categorize operator is adjusted. (#6599)
### What problem does this PR solve?

When using the Categorize operator, I found that if the keywords to
categorize appear repeatedly in the input, the operator cannot determine
which one appears most frequently. Instead, it simply returns whatever
matches. This PR changes the categorize filter to address that.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
- [x] Performance Improvement
2025-03-27 17:02:21 +08:00
Zhichang Yu
36b62e0fab
EntityResolution batch. Close #6570 (#6602)
### What problem does this PR solve?

EntityResolution batch

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-27 16:40:36 +08:00
Kevin Hu
d2043ff9f2
Fix: LmStudioChat issue. (#6591)
### What problem does this PR solve?

#6577

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-27 14:59:15 +08:00
Kevin Hu
ecc9605a32
Fix: team doc deletion issue. (#6589)
### What problem does this PR solve?

#6557

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-27 13:26:38 +08:00
balibabu
70dc56d26b
Feat: Add logo-with-text-white.svg #3221 (#6588)
### What problem does this PR solve?

Feat: Add logo-with-text-white.svg #3221

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-03-27 12:28:17 +08:00
Zanyatta
82ccbd2cba
Fix: Remove unnecessary minio initialization (#6544)
### What problem does this PR solve?

Prevents the application from failing to start because of a missing or
incorrect MinIO connection configuration when a file storage backend
other than MinIO is used.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
2025-03-27 09:54:25 +08:00
Zhichang Yu
c4998d0e09
Rename graphrag task lock (#6576)
### What problem does this PR solve?

Rename graphrag task lock

### Type of change

- [x] Refactoring
2025-03-26 23:48:47 +08:00
Fengbo Yuan
5eabfe3912
Update values.yaml image to infiniflow/infinity:v0.6.0-dev3 issue#5882 (#6568)
related issue #5882

### What problem does this PR solve?

Updates the Helm Infinity image from v0.5.0 to
infiniflow/infinity:v0.6.0-dev3 to resolve issue #5882.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-26 21:15:26 +08:00
Yongteng Lei
df3890827d
Refa: change LLM chat output from full to delta (incremental) (#6534)
### What problem does this PR solve?

Change LLM chat output from full to delta (incremental)
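
The difference between full and delta output can be illustrated with a small generator that turns cumulative snapshots into increments. The name `to_deltas` and the sample snapshots are assumptions for illustration, not the code changed in this PR.

```python
def to_deltas(snapshots):
    """Yield only the newly added text from a stream of cumulative snapshots."""
    previous = ""
    for snapshot in snapshots:
        if snapshot.startswith(previous):
            yield snapshot[len(previous):]
        else:
            # Snapshot is not an extension of the previous one; emit it whole
            yield snapshot
        previous = snapshot


full_outputs = ["Hel", "Hello", "Hello, world"]
deltas = list(to_deltas(full_outputs))
print(deltas)  # ['Hel', 'lo', ', world']
```

A consumer that concatenates the deltas recovers the final full output.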

### Type of change

- [x] Refactoring
2025-03-26 19:33:14 +08:00
liu an
6599db1e99
Test: Update test cases for PR #6405 #6504 #6538 (#6565)
### What problem does this PR solve?

PR #6405 #6504 #6538

### Type of change

- [x] update test cases
2025-03-26 19:23:13 +08:00
writinwaters
b7d7ad536a
AI search vs. chat (#6569)
### What problem does this PR solve?


### Type of change

- [x] Documentation Update
2025-03-26 18:46:34 +08:00
andy
24d8ff7425
Fix:flow DB Assistant module translate to zh (#6562)
### What problem does this PR solve?

Fix: Translate the flow DB Assistant module to Chinese (zh)

### Type of change

- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-26 17:32:05 +08:00
Chenzy
735d9dd949
Feat: add "tools" to llm_factories.json (#6552)
### What problem does this PR solve?



### Type of change

- [x] New Feature (non-breaking change which adds functionality)

---------

Co-authored-by: Chenzy <chenzy901@gmail.com>
2025-03-26 17:31:18 +08:00
zstar
cc5f4a5efa
Fix: python_api_reference.md update dataset bug (#6527)
### What problem does this PR solve?

The update dataset example in this document has a small bug. The return
type of `rag_object.list_datasets` is a list, so the first item, a
`ragflow_sdk.modules.dataset.DataSet`, should be taken before calling
update.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 17:30:09 +08:00
liu an
93c26ae1ef
Test: Added test cases for Update Chunk HTTP API (#6556)
### What problem does this PR solve?

cover [update
chunk](https://ragflow.io/docs/v0.17.2/http_api_reference#update-chunk)
endpoints

### Type of change

- [x] add test cases
2025-03-26 16:47:47 +08:00
Kevin Hu
cc8029a732
Fix: uploading in chat box issue. (#6547)
### What problem does this PR solve?

#6228

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 15:37:48 +08:00
Zhichang Yu
6bf26e2a81
Optimize graphrag again (#6513)
### What problem does this PR solve?

Removed set_entity and set_relation to avoid accessing doc engine during
graph computation.
Introduced GraphChange to avoid writing unchanged chunks.

### Type of change

- [x] Performance Improvement
2025-03-26 15:34:42 +08:00
Kevin Hu
7a677cb095
Fix: image_id is None. (#6538)
### What problem does this PR solve?

#6499

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 12:04:21 +08:00
Kevin Hu
12ad746ee6
Fix: Bedrock model invocation error. (#6533)
### What problem does this PR solve?


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 11:27:12 +08:00
Kevin Hu
163e71d06f
Fix: Hunyuan model adding error. (#6531)
### What problem does this PR solve?

#6523
### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 10:33:33 +08:00
Kevin Hu
c8c91fd827
Fix: link to KB from filemanager. (#6530)
### What problem does this PR solve?



### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 09:41:14 +08:00
writinwaters
d17970ebd0
0321 chunkmethods (#6520)
### What problem does this PR solve?

#6061 

### Type of change


- [x] Documentation Update
2025-03-26 09:03:18 +08:00
Kevin Hu
bf483fdf02
Fix: describe parameter error. (#6519)
### What problem does this PR solve?
#6228

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-26 09:02:48 +08:00
Kevin Hu
b2b7ed8927
Fix: abnormal chunk id (#6506)
### What problem does this PR solve?

#6500

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 19:03:29 +08:00
liu an
0a79dfd5cf
Test: Added test cases for List Chunks HTTP API (#6514)
### What problem does this PR solve?

cover [list
chunks](https://ragflow.io/docs/v0.17.2/http_api_reference#list-chunks)
endpoints

### Type of change

- [x] update test cases
2025-03-25 17:28:58 +08:00
Stephen Hu
1d73baf3d8
Feat: improve '/mv' '/list' API performance (#6502)
### What problem does this PR solve?

1. For the /mv API, use get by ids to avoid O(n) DB I/O.

2. For the /list API, remove one unnecessary call.
### Type of change

- [x] Performance Improvement
2025-03-25 16:30:25 +08:00
Kevin Hu
f3ae4a3bae
Fix: img_id error. (#6504)
### What problem does this PR solve?

#6499

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 15:57:03 +08:00
liwenju0
814a210f5d
Fix: failed to acquire lock exception with retry mechanism for postgres and mysql (#6483)
Added the with_retry decorator in db_models.py to add a retry mechanism
for database operations. Applied the retry mechanism to the lock and
unlock methods of the PostgresDatabaseLock and MysqlDatabaseLock classes
to enhance the reliability of lock operations.
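
A retry decorator of this kind might look like the sketch below. Only the name `with_retry` comes from this commit message; the parameters, the exponential backoff, and the `acquire_lock` stand-in are assumptions for illustration and may differ from db_models.py.

```python
import functools
import time


def with_retry(max_retries=3, retry_delay=0.1):
    """Retry the wrapped function on any exception, up to max_retries attempts."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for attempt in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except Exception as error:
                    last_error = error
                    if attempt < max_retries - 1:
                        # Back off a little longer after each failed attempt
                        time.sleep(retry_delay * (2 ** attempt))
            raise last_error
        return wrapper
    return decorator


calls = {"count": 0}


@with_retry(max_retries=3, retry_delay=0.01)
def acquire_lock():
    """Stand-in for a lock call that fails twice before succeeding."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("failed to acquire lock")
    return True


result = acquire_lock()
print(result, calls["count"])
```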

### What problem does this PR solve?
Resolves the "failed to acquire lock" exception by adding a retry mechanism.

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)

---------

Co-authored-by: wenju.li <wenju.li@deepctr.cn>
2025-03-25 15:09:56 +08:00
Kevin Hu
60c3a253ad
Fix: api-key issue for xinference. (#6490)
### What problem does this PR solve?

#2792

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 15:01:13 +08:00
Kevin Hu
384b6549a6
Fix: remove doc status checking while creating an assistant. (#6486)
### What problem does this PR solve?

#6461

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 11:13:22 +08:00
科幻大脑
b2ec39c59d
Fix: Resolve FlowSetting not reading Title from .ts files (#6469)
### What problem does this PR solve?

Fix: Resolve FlowSetting not reading Title from .ts files

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 11:07:29 +08:00
Kevin Hu
095fc84cf2
Fix: claude max tokens. (#6484)
### What problem does this PR solve?

#6458

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-25 10:41:55 +08:00
Yongteng Lei
542cf16292
Feat: add project_id and project_name to Langfuse API (#6481)
### What problem does this PR solve?

Enhance Langfuse API: add project_id and project_name

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
2025-03-25 10:36:34 +08:00
liu an
27989eb9a5
Test: Add list chunk checkpoint for the add chunk API (#6482)
### What problem does this PR solve?

Add list chunk checkpoint for the add chunk API

### Type of change

- [x] update test cases
2025-03-25 10:36:21 +08:00
Graf2242
05997e8215
Remove thinking block from keyword node's result (#6474)
### What problem does this PR solve?

Currently, if a thinking model (deepseek-r1:32b on an Ollama server in
my case) is used in the "Keyword" node, the result contains the entire
<think> block, so the node returns more than just the keywords.
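
Stripping the block can be sketched with a regular expression. This is a minimal illustration, not the change made in the PR: `strip_think` and the sample output are hypothetical.

```python
import re

# Matches a <think>...</think> block, including newlines inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think(text):
    """Remove <think> blocks and surrounding whitespace from model output."""
    return THINK_BLOCK.sub("", text).strip()


raw = "<think>\nThe user asks about RAG.\n</think>\n\nretrieval, generation"
keywords = strip_think(raw)
print(keywords)  # retrieval, generation
```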

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
2025-03-25 10:22:41 +08:00
Stephen Hu
5d9afce12d
Feat: improve the performance for '/upload' API (#6479)
### What problem does this PR solve?
Improves the logic for fetching the parent folder and removes unnecessary DB I/O.

### Type of change

- [x] Performance Improvement
2025-03-25 10:22:19 +08:00
Yongteng Lei
ee6a0bd9db
Refa: enhancement: enhance the prompt of related_question API (#6463)
### What problem does this PR solve?

Enhance the prompt of `related_question` API.

### Type of change

- [x] Enhancement
- [x] Documentation Update
2025-03-25 10:00:10 +08:00
liu an
b6f3242c6c
Test: Update test cases to reduce execution time (#6470)
### What problem does this PR solve?


### Type of change

- [x] update test cases
2025-03-25 09:17:05 +08:00
utopia2077
390086c6ab
Fix: split process bug in graphrag extract (#6423)
### What problem does this PR solve?

1. Missing completion delimiter.
2. Missing bracket processing.
3. The doc_ids returned by update_graph are a set, but the insert
operation in extract_community needs a list.


### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
2025-03-24 21:41:20 +08:00
writinwaters
a40c5aea83
Miscellaneous UI updates (#6471)
### What problem does this PR solve?



### Type of change


- [x] Documentation Update
2025-03-24 19:36:47 +08:00
Stephen Hu
f691b4ddd2
Feat: Improve "/convert" API's performance (#6465)
### What problem does this PR solve?

For batch requests, fetch all files up front with get_by_ids, replacing
the O(n) I/O logic.
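
The idea of replacing one lookup per file with a single batched lookup can be sketched with the standard library's sqlite3. The table layout and this `get_by_ids` signature are assumptions for illustration; the project's own data layer may look quite different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO file VALUES (?, ?)",
    [("f1", "a.pdf"), ("f2", "b.docx"), ("f3", "c.txt")],
)


def get_by_ids(connection, ids):
    """Fetch all rows for the given ids with a single query."""
    if not ids:
        return {}
    placeholders = ", ".join("?" for _ in ids)
    rows = connection.execute(
        f"SELECT id, name FROM file WHERE id IN ({placeholders})", list(ids)
    ).fetchall()
    return {row_id: name for row_id, name in rows}


files = get_by_ids(conn, ["f1", "f3"])
print(files)
```

One query for the whole batch replaces one round trip per id, which is where the O(n) I/O saving comes from.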

### Type of change


- [x] Performance Improvement
2025-03-24 19:08:22 +08:00
balibabu
3c57a9986c
Feat: Add LangfuseCard component. #6155 (#6468)
### What problem does this PR solve?

Feat: Add LangfuseCard component. #6155

### Type of change


- [x] New Feature (non-breaking change which adds functionality)
2025-03-24 19:07:55 +08:00