Let tasks continue dispatching when meeting unexpected doc formats (#199)

### What problem does this PR solve?

_Briefly describe what this PR aims to solve. Include background context
that will help reviewers understand the purpose of the PR._

Issue link: [#198](https://github.com/infiniflow/ragflow/issues/198)

### Type of change

- [x] Bug Fix (non-breaking change which fixes an issue)
- [ ] New Feature (non-breaking change which adds functionality)
- [ ] Breaking Change (fix or feature that could cause existing
functionality not to work as expected)
- [ ] Documentation Update
- [ ] Refactoring
- [ ] Performance Improvement
- [ ] Test cases
- [ ] Python SDK impacted, Need to update PyPI
- [ ] Other (please describe):
KevinHuSh 2024-04-02 11:39:01 +08:00 committed by GitHub
parent 36f2d7b797
commit 572e5b1ff1
GPG Key ID: B5690EEEBB952194

@@ -73,7 +73,7 @@ def dispatch():
                 for t in tsks:
                     TaskService.delete_by_id(t.id)
         except Exception as e:
-            cron_logger.error("delete task exception:" + str(e))
+            cron_logger.exception(e)
         def new_task():
             nonlocal r
@@ -83,6 +83,7 @@ def dispatch():
             }
         tsks = []
+        try:
             if r["type"] == FileType.PDF.value:
                 do_layout = r["parser_config"].get("layout_recognize", True)
                 pages = PdfParser.total_page_number(
@@ -121,6 +122,9 @@ def dispatch():
             bulk_insert_into_db(Task, tsks, True)
             set_dispatching(r["id"])
+        except Exception as e:
+            cron_logger.exception(e)
         tmf.write(str(r["update_time"]) + "\n")
     tmf.close()
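
The pattern in this diff can be illustrated in isolation: the per-document work is wrapped in `try`/`except`, so a document with an unexpected format is logged with its traceback and the loop moves on to the next one. The sketch below is a minimal standalone approximation, not the project's code. `split_into_tasks`, the dict-shaped documents, and the 10-page chunk size are all hypothetical stand-ins for the real parsers and `TaskService` calls.

```python
import logging

logger = logging.getLogger("dispatch_sketch")


def split_into_tasks(doc):
    """Hypothetical stand-in for the per-format task splitting."""
    if doc["type"] == "pdf":
        # One task per 10-page range (chunk size chosen for illustration only).
        return [{"doc_id": doc["id"], "from_page": p, "to_page": p + 10}
                for p in range(0, doc["pages"], 10)]
    if doc["type"] == "txt":
        return [{"doc_id": doc["id"]}]
    raise ValueError(f"unexpected doc format: {doc['type']!r}")


def dispatch(docs):
    """Create tasks for every document; one failure does not stop the rest."""
    dispatched, failed = [], []
    for doc in docs:
        try:
            dispatched.extend(split_into_tasks(doc))
        except Exception as e:
            # logger.exception logs at ERROR level and appends the traceback,
            # which a plain logger.error(str(e)) call would drop.
            logger.exception(e)
            failed.append(doc["id"])
    return dispatched, failed
```

Called with a 25-page PDF, a document of unknown type, and a text file, this sketch returns four tasks and records only the unknown document as failed. In the diff itself, `tmf.write(...)` appears to run after the `except` block, so the timestamp would be recorded even for a document whose dispatch raised.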