### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Store error type in Infinity
2. position list value read from Infinity isn't correct.
Fix issue: #3729
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
Don't log exception if object doesn't exist. Close#1483
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Test cases about dataset
### Type of change
- [x] Other (please describe): test cases
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
Detect shape error of embedding. Close#2997
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
**Passing cv_mdl.describe() is not an RGB converted image**
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Edit chunk shall update instead of insert it. Close#3679
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Always open text file for write with UTF-8. Close#932
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Fix enable/disable bug #3628
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
Fix a bug in VolcEngine #3553
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
### What problem does this PR solve?
Fix the bug causing garbled text #3613
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
### What problem does this PR solve?
Handle infinity empty response. Close#3623
Show version in docker build log
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
When calling the Qwen rerank model, if the model does not return
correctly, an exception should be raised to notify the user, rather than
simply returning a value of 0, as this would be confusing to the user.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Added TRACE_MALLOC_DELTA and TRACE_MALLOC_FULL to debug task_executor.py
heap. Relates to #3518
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Check tika.parser return result. Close#3229
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
Co-authored-by: Yingfeng <yingfeng.zhang@gmail.com>
### What problem does this PR solve?
The beartype configuration of
main(64f50992e0fc4dce73e79f8b951a02e31cb2d638) is:
```
from beartype import BeartypeConf
from beartype.claw import beartype_all # <-- you didn't sign up for this
beartype_all(conf=BeartypeConf(violation_type=UserWarning)) # <-- emit warnings from all code
```
ragflow_server failed at a third-party package:
```
(ragflow-py3.10) zhichyu@iris:~/github.com/infiniflow/ragflow$ rm -rf logs/* && bash docker/launch_backend_service.sh
Starting task_executor.py for task 0 (Attempt 1)
Starting ragflow_server.py (Attempt 1)
Traceback (most recent call last):
File "/home/zhichyu/github.com/infiniflow/ragflow/api/ragflow_server.py", line 22, in <module>
from api.utils.log_utils import initRootLogger
File "/home/zhichyu/github.com/infiniflow/ragflow/api/utils/__init__.py", line 25, in <module>
import requests
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/requests/__init__.py", line 43, in <module>
import urllib3
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/__init__.py", line 15, in <module>
from ._base_connection import _TYPE_BODY
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/_base_connection.py", line 5, in <module>
from .util.connection import _TYPE_SOCKET_OPTIONS
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/__init__.py", line 4, in <module>
from .connection import is_connection_dropped
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 7, in <module>
from .timeout import _DEFAULT_TIMEOUT, _TYPE_TIMEOUT
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/timeout.py", line 20, in <module>
_DEFAULT_TIMEOUT: Final[_TYPE_DEFAULT] = _TYPE_DEFAULT.token
NameError: name 'Final' is not defined
Traceback (most recent call last):
File "/home/zhichyu/github.com/infiniflow/ragflow/rag/svr/task_executor.py", line 22, in <module>
from api.utils.log_utils import initRootLogger
File "/home/zhichyu/github.com/infiniflow/ragflow/api/utils/__init__.py", line 25, in <module>
import requests
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/requests/__init__.py", line 43, in <module>
import urllib3
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/__init__.py", line 15, in <module>
from ._base_connection import _TYPE_BODY
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/_base_connection.py", line 5, in <module>
from .util.connection import _TYPE_SOCKET_OPTIONS
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/__init__.py", line 4, in <module>
from .connection import is_connection_dropped
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 7, in <module>
from .timeout import _DEFAULT_TIMEOUT, _TYPE_TIMEOUT
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/timeout.py", line 20, in <module>
_DEFAULT_TIMEOUT: Final[_TYPE_DEFAULT] = _TYPE_DEFAULT.token
NameError: name 'Final' is not defined
```
This third-package is out of our control. I have to remove beartype
entirely.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Warning instead of exception on type mismatch.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
1. Remove unused code
2. Fix type mismatch, in nlp search and infinity search interface
3. Fix chunk list, get all chunks of this user.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
_Choosing Laws Chunk Method results in an error when parsing a document.
The error is caused by a missing import in the `laws.py` file._
```
Traceback (most recent call last):
File "/ragflow/rag/svr/task_executor.py", line 445, in handle_task
do_handle_task(task)
File "/ragflow/rag/svr/task_executor.py", line 384, in do_handle_task
cks = build(r)
^^^^^^^^
File "/ragflow/rag/svr/task_executor.py", line 196, in build
cks = chunker.chunk(row["name"], binary=binary, from_page=row["from_page"],
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/rag/app/laws.py", line 161, in chunk
for txt, poss in pdf_parser(filename if not binary else binary,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/ragflow/rag/app/laws.py", line 124, in __call__
logging.debug("layouts:".format(
^^^^^^^
NameError: name 'logging' is not defined. Did you forget to import 'logging'
```
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
Co-authored-by: Michal Masrna <m.marna1@gmail.com>
### What problem does this PR solve?
Changed requirement to python 3.10.
Changed image base to Ubuntu 22.04 since it contains python 3.10.
### Type of change
- [x] Refactoring
### What problem does this PR solve?
Add api for sessions and add max_tokens for tenant_llm
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Co-authored-by: liuhua <10215101452@stu.ecun.edu.cn>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
1. Fix error in viewing document chunk

<img width="1677" alt="Pasted Graphic"
src="https://github.com/user-attachments/assets/acd84cde-f38c-4190-b135-5e5139ae2613">
Viewing document chunk details in a BeartypeCallHintParamViolation
error.
Traceback (most recent call last):
File "ragflow/.venv/lib/python3.12/site-packages/flask/app.py", line
880, in full_dispatch_request
rv = self.dispatch_request()
^^^^^^^^^^^^^^^^^^^^^^^
File "ragflow/.venv/lib/python3.12/site-packages/flask/app.py", line
865, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
# type: ignore[no-any-return]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "ragflow/.venv/lib/python3.12/site-packages/flask_login/utils.py",
line 290, in decorated_view
return current_app.ensure_sync(func)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "ragflow/api/apps/chunk_app.py", line 311, in knowledge_graph
sres = settings.retrievaler.search(req, search.index_name(tenant_id),
kb_ids)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<@beartype(rag.nlp.search.Dealer.search) at 0x3381fd800>", line
39, in search
beartype.roar.BeartypeCallHintParamViolation: Method
rag.nlp.search.Dealer.search() parameter
idx_names='ragflow_0e1e67f431d711ef98fc00155d29195d' violates type hint
list[str], as str 'ragflow_0e1e67f431d711ef98fc00155d29195d' not
instance of list.
2024-11-19 11:30:29,817 ERROR 91013 Method
rag.nlp.search.Dealer.search() parameter
idx_names='ragflow_0e1e67f431d711ef98fc00155d29195d' violates type hint
list[str], as str 'ragflow_0e1e67f431d711ef98fc00155d29195d' not
instance of list.
Traceback (most recent call last):
File "ragflow/api/apps/chunk_app.py", line 60, in list_chunk
sres = settings.retrievaler.search(query, search.index_name(tenant_id),
kb_ids, highlight=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "<@beartype(rag.nlp.search.Dealer.search) at 0x3381fd800>", line
39, in search
beartype.roar.BeartypeCallHintParamViolation: Method
rag.nlp.search.Dealer.search() parameter
idx_names='ragflow_0e1e67f431d711ef98fc00155d29195d' violates type hint
list[str], as str 'ragflow_0e1e67f431d711ef98fc00155d29195d' not
instance of list.
because in nlp/search.py,the idx_names is only list

<img width="1098" alt="Pasted Graphic 2"
src="https://github.com/user-attachments/assets/4998cb1e-94bc-470b-b2f4-41ecb5b08f8a">
but the DocStoreConnection.search method accept list or str
<img width="1175" alt="Pasted Graphic 3"
src="https://github.com/user-attachments/assets/ee918b4a-87a5-42c9-a6d2-d0db0884b875">

and his implements also list and str
es_conn.py

<img width="1121" alt="Pasted Graphic 4"
src="https://github.com/user-attachments/assets/3e6dc030-0a0d-416c-8fd4-0b4cfd576f8c">
infinity_conn.py

<img width="1221" alt="Pasted Graphic 5"
src="https://github.com/user-attachments/assets/44edac2b-6b81-45b0-a3fc-cb1c63219015">
2. Fix cannot star task_executor server with Unresolved reference
'Mapping'
<img width="1283" alt="Pasted Graphic 6"
src="https://github.com/user-attachments/assets/421f17b8-d0a5-46d3-bc4d-d05dc9dfc934">
### Type of change
- [X] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Fix logs. Use dict.pop instead of del.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Introduced [beartype](https://github.com/beartype/beartype) for runtime
type-checking.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
Hi there!
LocalAI added support of rerank models
https://localai.io/features/reranker/
I've implemented LocalAIRerank class (typically copied it from
OpenAI_APIRerank class).
Also, LocalAI model response with 500 error code if len of "documents"
is less than 2 in similarity check.
So I've added the second "document" on RERANK model connection check in
`api/apps/llm_app.py`.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
1. Module init won't connect database any more.
2. Config in settings need to be used with settings.CONFIG_NAME
### Type of change
- [x] Refactoring
Signed-off-by: jinhai <haijin.chn@gmail.com>