{/**
 * @typedef Props
 * @property {string} apiBaseUrl
 */}

import { CodeGroup } from '@/app/components/develop/code.tsx'
import { Row, Col, Properties, Property, Heading, SubProperty, PropertyInstruction, Paragraph } from '@/app/components/develop/md.tsx'

# Knowledge API
high_quality
High quality: embeds content with an embedding model and builds a vector database index
- economy
Economy: builds an inverted index (keyword table index)
text_model
Text documents are directly embedded; `economy` mode defaults to using this form
- hierarchical_model
Parent-child mode
- qa_model
Q&A Mode: Generates Q&A pairs for segmented documents and then embeds the questions
English, Chinese
mode
(string) Cleaning and segmentation mode: automatic / custom / hierarchical
- rules
(object) Custom rules (in automatic mode, this field is empty)
- pre_processing_rules
(array[object]) Preprocessing rules
- id
(string) Unique identifier for the preprocessing rule
- Allowed values
- remove_extra_spaces
Replace consecutive spaces, newlines, and tabs
- remove_urls_emails
Delete URLs and email addresses
- enabled
(bool) Whether this rule is enabled. If no document ID is passed in, this is the default value.
- segmentation
(object) Segmentation rules
- separator
Custom segment delimiter; currently only one delimiter can be set. The default is \n
- max_tokens
Maximum length in tokens; defaults to 1000
- parent_mode
Retrieval mode of parent chunks: full-doc (full-text retrieval) / paragraph (paragraph retrieval)
- subchunk_segmentation
(object) Child chunk rules
- separator
Segmentation identifier. Currently, only one delimiter is allowed. The default is ***
- max_tokens
Maximum length in tokens; must be shorter than the parent chunk length
- chunk_overlap
Define the overlap between adjacent chunks (optional)
search_method
(string) Search method
- hybrid_search
Hybrid search
- semantic_search
Semantic search
- full_text_search
Full-text search
- reranking_enable
(bool) Whether to enable reranking
- reranking_model
(object) Rerank model configuration
- reranking_provider_name
(string) Rerank model provider
- reranking_model_name
(string) Rerank model name
- top_k
(int) Number of results to return
- score_threshold_enabled
(bool) Whether to enable score threshold
- score_threshold
(float) Score threshold
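The processing-rule fields above can be assembled into a single object. The following is a minimal sketch, not the API's own implementation: the helper name `build_process_rule` is hypothetical, and the nesting of `parent_mode` and `subchunk_segmentation` under `rules` is an assumption, because the flattened list does not preserve indentation.

```python
def build_process_rule(mode, max_tokens=1000, separator="\n",
                       parent_mode=None, child_max_tokens=None,
                       child_separator="***"):
    """Assemble a process_rule dict (hypothetical helper; nesting is assumed)."""
    if mode not in ("automatic", "custom", "hierarchical"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "automatic":
        # In automatic mode the custom rules are left empty.
        return {"mode": mode, "rules": {}}

    rules = {
        "pre_processing_rules": [
            {"id": "remove_extra_spaces", "enabled": True},
            {"id": "remove_urls_emails", "enabled": False},
        ],
        "segmentation": {"separator": separator, "max_tokens": max_tokens},
    }
    if mode == "hierarchical":
        if parent_mode not in ("full-doc", "paragraph"):
            raise ValueError("parent_mode must be 'full-doc' or 'paragraph'")
        # Child chunks must be shorter than the parent chunk.
        if child_max_tokens is None or child_max_tokens >= max_tokens:
            raise ValueError("child max_tokens must be shorter than the parent chunk")
        rules["parent_mode"] = parent_mode
        rules["subchunk_segmentation"] = {
            "separator": child_separator,
            "max_tokens": child_max_tokens,
        }
    return {"mode": mode, "rules": rules}
```

Validating the child length locally surfaces the parent/child constraint before a request is sent.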
original_document_id
Source document ID (optional)
- Used to re-upload the document or to modify its cleaning and segmentation configuration. Any missing information is copied from the source document
- The source document cannot be an archived document
- When original_document_id is passed in, the request updates that document. process_rule is optional; if omitted, the segmentation method of the source document is used by default
- When original_document_id is not passed in, the request creates a new document, and process_rule is required
- indexing_technique
Index mode
- high_quality
High quality: embeds content with an embedding model and builds a vector database index
- economy
Economy: builds an inverted index (keyword table index)
- doc_form
Format of indexed content
- text_model
Text documents are directly embedded; `economy` mode defaults to using this form
- hierarchical_model
Parent-child mode
- qa_model
Q&A Mode: Generates Q&A pairs for segmented documents and then embeds the questions
- doc_language
In Q&A mode, specify the language of the document, for example: English, Chinese
- process_rule
Processing rules
- mode
(string) Cleaning and segmentation mode: automatic / custom / hierarchical
- rules
(object) Custom rules (in automatic mode, this field is empty)
- pre_processing_rules
(array[object]) Preprocessing rules
- id
(string) Unique identifier for the preprocessing rule
- Allowed values
- remove_extra_spaces
Replace consecutive spaces, newlines, and tabs
- remove_urls_emails
Delete URLs and email addresses
- enabled
(bool) Whether this rule is enabled. If no document ID is passed in, this is the default value.
- segmentation
(object) Segmentation rules
- separator
Custom segment delimiter; currently only one delimiter can be set. The default is \n
- max_tokens
Maximum length in tokens; defaults to 1000
- parent_mode
Retrieval mode of parent chunks: full-doc (full-text retrieval) / paragraph (paragraph retrieval)
- subchunk_segmentation
(object) Child chunk rules
- separator
Segmentation identifier. Currently, only one delimiter is allowed. The default is ***
- max_tokens
Maximum length in tokens; must be shorter than the parent chunk length
- chunk_overlap
Define the overlap between adjacent chunks (optional)
search_method
(string) Search method
- hybrid_search
Hybrid search
- semantic_search
Semantic search
- full_text_search
Full-text search
- reranking_enable
(bool) Whether to enable reranking
- reranking_model
(object) Rerank model configuration
- reranking_provider_name
(string) Rerank model provider
- reranking_model_name
(string) Rerank model name
- top_k
(int) Number of results to return
- score_threshold_enabled
(bool) Whether to enable score threshold
- score_threshold
(float) Score threshold
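The create-versus-update rule for `original_document_id` and `process_rule` can be checked before a request is built. This is a minimal sketch under stated assumptions: `build_document_data` is a hypothetical helper, and serializing the fields as one JSON string is an assumption about how the payload is transmitted.

```python
import json

INDEXING_TECHNIQUES = ("high_quality", "economy")
DOC_FORMS = ("text_model", "hierarchical_model", "qa_model")


def build_document_data(indexing_technique, process_rule=None,
                        original_document_id=None, doc_form="text_model",
                        doc_language=None):
    """Return a JSON string of document settings (hypothetical helper)."""
    if indexing_technique not in INDEXING_TECHNIQUES:
        raise ValueError(f"unknown indexing_technique: {indexing_technique!r}")
    if doc_form not in DOC_FORMS:
        raise ValueError(f"unknown doc_form: {doc_form!r}")
    # Without original_document_id a new document is created,
    # and process_rule is required in that case.
    if original_document_id is None and process_rule is None:
        raise ValueError("process_rule is required when creating a new document")

    data = {"indexing_technique": indexing_technique, "doc_form": doc_form}
    if original_document_id is not None:
        data["original_document_id"] = original_document_id
    if process_rule is not None:
        data["process_rule"] = process_rule
    # doc_language only applies in Q&A mode.
    if doc_form == "qa_model" and doc_language is not None:
        data["doc_language"] = doc_language
    return json.dumps(data)
```

When `original_document_id` is given and `process_rule` is omitted, the sketch leaves the key out so that the source document's segmentation method applies.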
high_quality
High quality
- economy
Economy
only_me
Only me
- all_team_members
All team members
- partial_members
Partial members
vendor
Vendor
- external
External knowledge
search_method
(string) Search method
- hybrid_search
Hybrid search
- semantic_search
Semantic search
- full_text_search
Full-text search
- reranking_enable
(bool) Whether to enable reranking
- reranking_model
(object) Rerank model configuration
- reranking_provider_name
(string) Rerank model provider
- reranking_model_name
(string) Rerank model name
- top_k
(int) Number of results to return
- score_threshold_enabled
(bool) Whether to enable score threshold
- score_threshold
(float) Score threshold
high_quality
High quality
- economy
Economy
only_me
Only me
- all_team_members
All team members
- partial_members
Partial members
search_method
(text) Search method: One of the following four keywords is required
- keyword_search
Keyword search
- semantic_search
Semantic search
- full_text_search
Full-text search
- hybrid_search
Hybrid search
- reranking_enable
(bool) Whether to enable reranking (optional); only relevant when the search method is semantic_search or hybrid_search
- reranking_model
(object) Rerank model configuration, required if reranking is enabled
- reranking_provider_name
(string) Rerank model provider
- reranking_model_name
(string) Rerank model name
- weights
(float) Semantic search weight setting in hybrid search mode
- top_k
(integer) Number of results to return (optional)
- score_threshold_enabled
(bool) Whether to enable score threshold
- score_threshold
(float) Score threshold
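The retrieval settings above carry two constraints: the search method must be one of four keywords, and a rerank model configuration is needed once reranking is enabled. A minimal sketch of that validation follows. The helper `build_retrieval_model` is hypothetical, and the key name `reranking_model` for the rerank configuration follows the dataset-creation parameter list and should be treated as an assumption.

```python
SEARCH_METHODS = {"keyword_search", "semantic_search",
                  "full_text_search", "hybrid_search"}


def build_retrieval_model(search_method, reranking_enable=False,
                          provider=None, model=None, weights=None,
                          top_k=None, score_threshold=None):
    """Assemble a retrieval_model dict (hypothetical helper)."""
    if search_method not in SEARCH_METHODS:
        raise ValueError(f"unknown search_method: {search_method!r}")
    if reranking_enable and not (provider and model):
        raise ValueError("rerank provider and model are required when reranking is enabled")

    result = {
        "search_method": search_method,
        "reranking_enable": reranking_enable,
        # The threshold flag is derived from whether a threshold was given.
        "score_threshold_enabled": score_threshold is not None,
    }
    if reranking_enable:
        result["reranking_model"] = {  # key name is an assumption
            "reranking_provider_name": provider,
            "reranking_model_name": model,
        }
    if search_method == "hybrid_search" and weights is not None:
        result["weights"] = weights  # semantic weight, hybrid search only
    if top_k is not None:
        result["top_k"] = top_k
    if score_threshold is not None:
        result["score_threshold"] = score_threshold
    return result
```

For example, `build_retrieval_model("keyword_search", top_k=5)` yields a dict with reranking and the score threshold both disabled.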
mode
(string) Cleaning and segmentation mode: automatic / custom / hierarchical
- rules
(object) Custom rules (in automatic mode, this field is empty)
- pre_processing_rules
(array[object]) Preprocessing rules
- id
(string) Unique identifier for the preprocessing rule
- Allowed values
- remove_extra_spaces
Replace consecutive spaces, newlines, and tabs
- remove_urls_emails
Delete URLs and email addresses
- enabled
(bool) Whether this rule is enabled. If no document ID is passed in, this is the default value.
- segmentation
(object) Segmentation rules
- separator
Custom segment delimiter; currently only one delimiter can be set. The default is \n
- max_tokens
Maximum length in tokens; defaults to 1000
- parent_mode
Retrieval mode of parent chunks: full-doc (full-text retrieval) / paragraph (paragraph retrieval)
- subchunk_segmentation
(object) Child chunk rules
- separator
Segmentation identifier. Currently, only one delimiter is allowed. The default is ***
- max_tokens
Maximum length in tokens; must be shorter than the parent chunk length
- chunk_overlap
Define the overlap between adjacent chunks (optional)
mode
(string) Cleaning and segmentation mode: automatic / custom / hierarchical
- rules
(object) Custom rules (in automatic mode, this field is empty)
- pre_processing_rules
(array[object]) Preprocessing rules
- id
(string) Unique identifier for the preprocessing rule
- Allowed values
- remove_extra_spaces
Replace consecutive spaces, newlines, and tabs
- remove_urls_emails
Delete URLs and email addresses
- enabled
(bool) Whether this rule is enabled. If no document ID is passed in, this is the default value.
- segmentation
(object) Segmentation rules
- separator
Custom segment delimiter; currently only one delimiter can be set. The default is \n
- max_tokens
Maximum length in tokens; defaults to 1000
- parent_mode
Retrieval mode of parent chunks: full-doc (full-text retrieval) / paragraph (paragraph retrieval)
- subchunk_segmentation
(object) Child chunk rules
- separator
Segmentation identifier. Currently, only one delimiter is allowed. The default is ***
- max_tokens
Maximum length in tokens; must be shorter than the parent chunk length
- chunk_overlap
Define the overlap between adjacent chunks (optional)
content
(text) Text content / question content, required
- answer
(text) Answer content; pass a value if the knowledge base is in Q&A mode (optional)
- keywords
(list) Keywords (optional)
content
(text) Text content / question content, required
- answer
(text) Answer content, passed if the knowledge is in Q&A mode (optional)
- keywords
(list) Keywords (optional)
- enabled
(bool) Whether the chunk is enabled: false / true (optional)
- regenerate_child_chunks
(bool) Whether to regenerate child chunks (optional)
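The chunk-update fields above are mostly optional, so a request body only needs the keys that are actually set. A minimal sketch, assuming a hypothetical helper `build_segment_update`; how this object is wrapped inside the request body is not shown in the list above and is left out here.

```python
def build_segment_update(content, answer=None, keywords=None,
                         enabled=None, regenerate_child_chunks=None):
    """Assemble the fields for a chunk update (hypothetical helper)."""
    if not content:
        raise ValueError("content is required")
    segment = {"content": content}
    optional = {
        "answer": answer,  # only meaningful in Q&A mode
        "keywords": list(keywords) if keywords is not None else None,
        "enabled": enabled,
        "regenerate_child_chunks": regenerate_child_chunks,
    }
    # Include only the optional fields the caller provided.
    segment.update({k: v for k, v in optional.items() if v is not None})
    return segment
```

Omitting unset keys keeps `enabled=False` distinguishable from "not provided", since only `None` values are dropped.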
search_method
(text) Search method: One of the following four keywords is required
- keyword_search
Keyword search
- semantic_search
Semantic search
- full_text_search
Full-text search
- hybrid_search
Hybrid search
- reranking_enable
(bool) Whether to enable reranking (optional); only relevant when the search method is semantic_search or hybrid_search
- reranking_model
(object) Rerank model configuration, required if reranking is enabled
- reranking_provider_name
(string) Rerank model provider
- reranking_model_name
(string) Rerank model name
- weights
(float) Semantic search weight setting in hybrid search mode
- top_k
(integer) Number of results to return (optional)
- score_threshold_enabled
(bool) Whether to enable score threshold
- score_threshold
(float) Score threshold
type
(string) Metadata type, required
- name
(string) Metadata name, required
name
(string) Metadata name, required
document_id
(string) Document ID
- metadata_list
(list) Metadata list
- id
(string) Metadata ID
- value
(string) Metadata value
- name
(string) Metadata name
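The per-document metadata structure above pairs a `document_id` with a `metadata_list` of `id`, `value`, and `name` entries. A minimal sketch of building one such entry, assuming a hypothetical helper `build_document_metadata` and illustrative identifiers:

```python
def build_document_metadata(document_id, entries):
    """Build one document's metadata assignment (hypothetical helper).

    entries is an iterable of (metadata_id, name, value) tuples.
    """
    if not document_id:
        raise ValueError("document_id is required")
    metadata_list = [
        # Values are described as strings, so they are converted here.
        {"id": metadata_id, "value": str(value), "name": name}
        for metadata_id, name, value in entries
    ]
    return {"document_id": document_id, "metadata_list": metadata_list}
```

For instance, `build_document_metadata("doc-1", [("m-1", "author", "Ada")])` produces a single-entry `metadata_list` for that document.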
| code | status | message |
|---|---|---|
| no_file_uploaded | 400 | Please upload your file. |
| too_many_files | 400 | Only one file is allowed. |
| file_too_large | 413 | File size exceeded. |
| unsupported_file_type | 415 | File type not allowed. |
| high_quality_dataset_only | 400 | Current operation only supports 'high-quality' datasets. |
| dataset_not_initialized | 400 | The dataset is still being initialized or indexing. Please wait a moment. |
| archived_document_immutable | 403 | The archived document is not editable. |
| dataset_name_duplicate | 409 | The dataset name already exists. Please modify your dataset name. |
| invalid_action | 400 | Invalid action. |
| document_already_finished | 400 | The document has been processed. Please refresh the page or go to the document details. |
| document_indexing | 400 | The document is being processed and cannot be edited. |
| invalid_metadata | 400 | The metadata content is incorrect. Please check and verify. |
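Client code may find it convenient to keep the table above as a lookup. The sketch below transcribes the codes, statuses, and messages from the table; the helper names `lookup_error` and `errors_with_status` are hypothetical.

```python
ERRORS = {
    "no_file_uploaded": (400, "Please upload your file."),
    "too_many_files": (400, "Only one file is allowed."),
    "file_too_large": (413, "File size exceeded."),
    "unsupported_file_type": (415, "File type not allowed."),
    "high_quality_dataset_only": (400, "Current operation only supports 'high-quality' datasets."),
    "dataset_not_initialized": (400, "The dataset is still being initialized or indexing. Please wait a moment."),
    "archived_document_immutable": (403, "The archived document is not editable."),
    "dataset_name_duplicate": (409, "The dataset name already exists. Please modify your dataset name."),
    "invalid_action": (400, "Invalid action."),
    "document_already_finished": (400, "The document has been processed. Please refresh the page or go to the document details."),
    "document_indexing": (400, "The document is being processed and cannot be edited."),
    "invalid_metadata": (400, "The metadata content is incorrect. Please check and verify."),
}


def lookup_error(code):
    """Return (status, message) for a known error code, or None."""
    return ERRORS.get(code)


def errors_with_status(status):
    """Return the sorted error codes that share an HTTP status."""
    return sorted(code for code, (s, _) in ERRORS.items() if s == status)
```

Because most codes share status 400, branching on the `code` string rather than the HTTP status gives finer-grained handling.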