feat: support markdown files (#483)

Parse Markdown files as plain text.

### What problem does this PR solve?

Adds support for Markdown (`.md`) files. They are now handled by the same branch of `chunk()` that parses `.txt` files.
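As a rough illustration of the extension check this change relies on, the sketch below applies the same regular expression used in the updated `chunk()` branch to a few sample filenames. The helper name `is_plain_text_file` and the sample filenames are hypothetical and exist only for this example; they are not part of the codebase.

```python
import re

# Same pattern as the updated branch in chunk(); the constant and helper
# names below are hypothetical, introduced only for this illustration.
TEXT_FILE_PATTERN = re.compile(r"\.(txt|md)$", re.IGNORECASE)


def is_plain_text_file(filename: str) -> bool:
    """Return True if the filename ends in .txt or .md, ignoring case."""
    return TEXT_FILE_PATTERN.search(filename) is not None


if __name__ == "__main__":
    # Print whether each sample filename would take the plain-text branch.
    for name in ["notes.txt", "README.md", "GUIDE.MD", "report.mdx", "data.xlsx"]:
        print(name, is_plain_text_file(name))
```

Because the pattern is anchored with `$`, only the final extension counts, so a name such as `report.mdx` does not match. Files using the longer `.markdown` extension would not match this pattern either.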

### Type of change

- [x] New Feature (non-breaking change which adds functionality)
Shaun 2024-04-22 14:43:36 +08:00 committed by GitHub
parent b8e58fe27a
commit 11949f9f2e

@@ -137,7 +137,7 @@ def chunk(filename, binary=None, from_page=0, to_page=100000,
         excel_parser = ExcelParser()
         sections = [(excel_parser.html(binary), "")]
-    elif re.search(r"\.txt$", filename, re.IGNORECASE):
+    elif re.search(r"\.(txt|md)$", filename, re.IGNORECASE):
         callback(0.1, "Start to parse.")
         txt = ""
         if binary: