From 35f13e882ec7944c873a5298a57f362338209fc1 Mon Sep 17 00:00:00 2001
From: Jin Hai
Supported file formats are XLSX and CSV/TXT.
- Here're some prerequisites and tips:
+ Here are some prerequisites and tips:
This approach chunks files using the 'naive'/'General' method. It splits a document into segments and then combines adjacent segments until the token count exceeds the threshold specified by 'Chunk token number', at which point a chunk is created.
The chunks are then fed to the LLM to extract entities and relationships for a knowledge graph and a mind map.
Ensure that you set the Entity types.
`,
- tag: `Knowlege base using 'Tag' as a chunking method is supposed to be used by other knowledge bases to add tags to their chunks, queries to which will also be with tags too.
-Knowlege base using 'Tag' as a chunking method is NOT supposed to be involved in RAG procedure.
+ tag: `Knowledge base using 'Tag' as a chunking method is supposed to be used by other knowledge bases to add tags to their chunks, queries to which will also be with tags too.
+Knowledge base using 'Tag' as a chunking method is NOT supposed to be involved in RAG procedure.
The chunks in this knowledge base are examples of tags, which demonstrate the entire tag set and the relevance between chunk and tags.
This chunk method supports XLSX and CSV/TXT file formats.
If a file is in XLSX format, it should contain two columns without headers: one for content and the other for tags, with the content column preceding the tags column. Multiple sheets are acceptable, provided the columns are properly structured.
If a file is in CSV/TXT format, it must be UTF-8 encoded with TAB as the delimiter to separate content and tags.
-In tags column, there're English comma between tags.
+In tags column, there are English comma between tags.
Lines of texts that fail to follow the above rules will be ignored, and each pair will be considered a distinct chunk.
`,
useRaptor: 'Use RAPTOR to enhance retrieval',
@@ -359,7 +359,7 @@ The above is the content you need to summarize.`,
This auto-tag feature enhances retrieval by adding another layer of domain-specific knowledge to the existing dataset.
Difference between auto-tag and auto-keyword: