### What problem does this PR solve?
Removed set_entity and set_relation to avoid accessing doc engine during
graph computation.
Introduced GraphChange to avoid writing unchanged chunks.
### Type of change
- [x] Performance Improvement
### What problem does this PR solve?
This patch add signal for ctrl + c that can exit the code friendly
cause code base use thread daemon can not exit friendly for being
started.
how to reproduce
1. docker-compose -f docker/docker-compose-base.yml up
2. other window `bash docker/launch_backend_service.sh`
3. stop 1 first
4. try to stop 2 then two thread can not exit which must use `kill pid`
This patch fix it
and should fix most the related issues in the `issues`
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
The beartype configuration of
main(64f50992e0fc4dce73e79f8b951a02e31cb2d638) is:
```
from beartype import BeartypeConf
from beartype.claw import beartype_all # <-- you didn't sign up for this
beartype_all(conf=BeartypeConf(violation_type=UserWarning)) # <-- emit warnings from all code
```
ragflow_server failed at a third-party package:
```
(ragflow-py3.10) zhichyu@iris:~/github.com/infiniflow/ragflow$ rm -rf logs/* && bash docker/launch_backend_service.sh
Starting task_executor.py for task 0 (Attempt 1)
Starting ragflow_server.py (Attempt 1)
Traceback (most recent call last):
File "/home/zhichyu/github.com/infiniflow/ragflow/api/ragflow_server.py", line 22, in <module>
from api.utils.log_utils import initRootLogger
File "/home/zhichyu/github.com/infiniflow/ragflow/api/utils/__init__.py", line 25, in <module>
import requests
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/requests/__init__.py", line 43, in <module>
import urllib3
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/__init__.py", line 15, in <module>
from ._base_connection import _TYPE_BODY
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/_base_connection.py", line 5, in <module>
from .util.connection import _TYPE_SOCKET_OPTIONS
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/__init__.py", line 4, in <module>
from .connection import is_connection_dropped
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 7, in <module>
from .timeout import _DEFAULT_TIMEOUT, _TYPE_TIMEOUT
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/timeout.py", line 20, in <module>
_DEFAULT_TIMEOUT: Final[_TYPE_DEFAULT] = _TYPE_DEFAULT.token
NameError: name 'Final' is not defined
Traceback (most recent call last):
File "/home/zhichyu/github.com/infiniflow/ragflow/rag/svr/task_executor.py", line 22, in <module>
from api.utils.log_utils import initRootLogger
File "/home/zhichyu/github.com/infiniflow/ragflow/api/utils/__init__.py", line 25, in <module>
import requests
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/requests/__init__.py", line 43, in <module>
import urllib3
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/__init__.py", line 15, in <module>
from ._base_connection import _TYPE_BODY
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/_base_connection.py", line 5, in <module>
from .util.connection import _TYPE_SOCKET_OPTIONS
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/__init__.py", line 4, in <module>
from .connection import is_connection_dropped
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/connection.py", line 7, in <module>
from .timeout import _DEFAULT_TIMEOUT, _TYPE_TIMEOUT
File "/home/zhichyu/github.com/infiniflow/ragflow/.venv/lib/python3.10/site-packages/urllib3/util/timeout.py", line 20, in <module>
_DEFAULT_TIMEOUT: Final[_TYPE_DEFAULT] = _TYPE_DEFAULT.token
NameError: name 'Final' is not defined
```
This third-package is out of our control. I have to remove beartype
entirely.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Warning instead of exception on type mismatch.
### Type of change
- [x] Bug Fix (non-breaking change which fixes an issue)
### What problem does this PR solve?
Introduced [beartype](https://github.com/beartype/beartype) for runtime
type-checking.
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
### What problem does this PR solve?
1. Module init won't connect database any more.
2. Config in settings need to be used with settings.CONFIG_NAME
### Type of change
- [x] Refactoring
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
Rework task executor heartbeat, and print in console.
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [x] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._
### Type of change
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
### What problem does this PR solve?
Use consistent log file names, introduced initLogger
### Type of change
- [ ] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Documentation Update
- [x] Refactoring
- [ ] Performance Improvement
- [ ] Other (please describe):
### What problem does this PR solve?
Print version when RAGFlow server startup
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>
### What problem does this PR solve?
1. Validate the Python version should >= 3.11
2. Download nltk data
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
- [x] Refactoring
---------
Signed-off-by: jinhai <haijin.chn@gmail.com>
Co-authored-by: jinhai <haijin.chn@gmail.com>
Co-authored-by: Zhichang Yu <yuzhichang@gmail.com>
### What problem does this PR solve?
**Added openapi specification for API routes. This creates swagger UI
similar to FastAPI to better use the API.**
Using python package `flasgger`
### Type of change
- [x] New Feature (non-breaking change which adds functionality)
Not all routes are included since this is a work in progress.
Docs can be accessed on: `{host}:{port}/apidocs`
### What problem does this PR solve?
Move the build image and launch from source back to README.
### Type of change
- [x] Documentation Update
---------
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Related source file is in Windows/DOS format, they are format to Unix
format.
### Type of change
- [x] Refactoring
Signed-off-by: Jin Hai <haijin.chn@gmail.com>
### What problem does this PR solve?
Optimize task broker and executor for reduce memory usage and deployment
complexity.
### Type of change
- [x] Performance Improvement
- [x] Refactoring
### Change Log
- Enhance redis utils for message queue(use stream)
- Modify task broker logic via message queue (1.get parse event from
message queue 2.use ThreadPoolExecutor async executor )
- Modify the table column name of document and task (process_duation ->
process_duration maybe just a spelling mistake)
- Reformat some code style(just what i see)
- Add requirement_dev.txt for developer
- Add redis container on docker compose
---------
Co-authored-by: Kevin Hu <kevinhu.sh@gmail.com>