Updated HTTP API reference and Python API reference based on test results (#3090)

### What problem does this PR solve?

Aligns the HTTP API reference and the Python API reference with behavior observed during testing.

### Type of change


- [x] Documentation Update
writinwaters 2024-10-29 19:56:46 +08:00 committed by GitHub
parent d868c283c4
commit f4cb939317
2 changed files with 13 additions and 12 deletions


@@ -94,8 +94,10 @@ curl --request POST \
The configuration settings for the dataset parser, a JSON object containing the following attributes:
- `"chunk_token_count"`: Defaults to `128`.
- `"layout_recognize"`: Defaults to `true`.
+- `"html4excel"`: Indicates whether to convert Excel documents into HTML format. Defaults to `false`.
- `"delimiter"`: Defaults to `"\n!?。;!?"`.
-- `"task_page_size"`: Defaults to `12`.
+- `"task_page_size"`: Defaults to `12`. For PDF only.
- `"raptor"`: Raptor-specific settings. Defaults to: `{"use_raptor": false}`.
### Response
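For orientation, the documented defaults above can be collected into a single payload; a minimal sketch (a local dict literal only, not an API call):

```python
# Assembles the documented defaults for `parser_config` as a plain dict.
# Values mirror the reference above; nothing is sent to a server here.
default_parser_config = {
    "chunk_token_count": 128,          # default chunk size, in tokens
    "layout_recognize": True,          # layout recognition enabled by default
    "html4excel": False,               # Excel-to-HTML conversion off by default
    "delimiter": "\n!?。;!?",          # default delimiter set for chunking
    "task_page_size": 12,              # applies to PDF parsing only
    "raptor": {"use_raptor": False},   # RAPTOR disabled by default
}

print(sorted(default_parser_config))
```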
@@ -177,7 +179,7 @@ curl --request DELETE \
#### Request parameters
- `"ids"`: (*Body parameter*), `list[string]`
The IDs of the datasets to delete. If it is not specified, all datasets will be deleted.
### Response
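Because omitting `"ids"` deletes every dataset, a small helper can make that choice explicit; `make_delete_body` is a hypothetical convenience sketched here, not part of the API:

```python
def make_delete_body(ids=None):
    """Build the JSON body for the delete-datasets request.

    Omitting "ids" tells the server to delete ALL datasets, so a caller
    must pass ids=None deliberately to get that behavior.
    """
    if ids is None:
        return {}                  # no "ids" key: delete everything
    return {"ids": list(ids)}      # delete only the listed datasets

print(make_delete_body(["dataset_1", "dataset_2"]))
```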
@@ -241,7 +243,7 @@ curl --request PUT \
- `"embedding_model"`: (*Body parameter*), `string`
The updated embedding model name.
- Ensure that `"chunk_count"` is `0` before updating `"embedding_model"`.
- `"chunk_method"`: (*Body parameter*), `enum<string>`
The chunking method for the dataset. Available options:
- `"naive"`: General
- `"manual"`: Manual
@@ -510,12 +512,12 @@ curl --request PUT \
- `"one"`: One
- `"knowledge_graph"`: Knowledge Graph
- `"email"`: Email
- `"parser_config"`: (*Body parameter*), `object`
The parsing configuration for the document:
- `"chunk_token_count"`: Defaults to `128`.
- `"layout_recognize"`: Defaults to `true`.
- `"delimiter"`: Defaults to `"\n!?。;!?"`.
-- `"task_page_size"`: Defaults to `12`.
+- `"task_page_size"`: Defaults to `12`. For PDF only.
### Response
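Since `chunk_method` is a closed enum, a client-side guard can catch typos before a request is built. A sketch, using only the option names visible in this excerpt (the full reference lists more):

```python
# Chunking methods visible in the excerpt above; the full reference
# lists additional options, so treat this set as partial.
KNOWN_CHUNK_METHODS = {"naive", "manual", "one", "knowledge_graph", "email"}

def check_chunk_method(method: str) -> str:
    """Fail fast on obvious typos before building a request."""
    if method not in KNOWN_CHUNK_METHODS:
        raise ValueError(f"unrecognized chunk_method: {method!r}")
    return method

print(check_chunk_method("naive"))
```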
@@ -718,7 +720,7 @@ curl --request DELETE \
- `dataset_id`: (*Path parameter*)
The associated dataset ID.
- `"ids"`: (*Body parameter*), `list[string]`
The IDs of the documents to delete. If it is not specified, all documents in the specified dataset will be deleted.
### Response
@@ -1169,7 +1171,7 @@ Failure:
## Retrieve chunks
-**GET** `/api/v1/retrieval`
+**POST** `/api/v1/retrieval`
Retrieves chunks from specified datasets.


@@ -1253,7 +1253,7 @@ Asks a question to start an AI-powered conversation.
#### question: `str` *Required*
-The question to start an AI chat.
+The question to start an AI-powered conversation.
#### stream: `bool`
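With `stream=True`, an answer typically arrives incrementally rather than in one piece; the consumption pattern can be sketched with a stand-in generator (`fake_ask` is hypothetical and is not the SDK's method):

```python
def fake_ask(question: str, stream: bool = False):
    """Stand-in for a chat call, NOT the real SDK method.

    When stream=True, partial answers are yielded as they grow;
    otherwise only the final answer is yielded.
    """
    parts = ["Hel", "Hello", "Hello, how can I help?"]
    if stream:
        yield from parts      # caller sees the answer build up
    else:
        yield parts[-1]       # caller gets one complete answer

final = list(fake_ask("What is RAGFlow?", stream=True))[-1]
print(final)
```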
@@ -1286,7 +1286,7 @@ A list of `Chunk` objects representing references to the message, each containing
- `content` `str`
The content of the chunk.
- `image_id` `str`
-The ID of the snapshot of the chunk.
+The ID of the snapshot of the chunk. Applicable only when the source of the chunk is an image, PPT, PPTX, or PDF file.
- `document_id` `str`
The ID of the referenced document.
- `document_name` `str`
@@ -1295,14 +1295,13 @@ A list of `Chunk` objects representing references to the message, each containing
The location information of the chunk within the referenced document.
- `dataset_id` `str`
The ID of the dataset to which the referenced document belongs.
-- `similarity` `float`
-A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity.
+- `similarity` `float`
+A composite similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity. It is the weighted sum of `vector_similarity` and `term_similarity`.
- `vector_similarity` `float`
A vector similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between vector embeddings.
- `term_similarity` `float`
A keyword similarity score of the chunk ranging from `0` to `1`, with a higher value indicating greater similarity between keywords.
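The updated wording above says `similarity` is the weighted sum of `vector_similarity` and `term_similarity`; that combination can be sketched as follows (the 0.3 vector weight is an illustrative assumption, not a documented value):

```python
def composite_similarity(vector_similarity: float, term_similarity: float,
                         vector_weight: float = 0.3) -> float:
    """Weighted sum of the two per-chunk scores; stays in [0, 1]
    whenever both inputs do and the weight is in [0, 1]."""
    return (vector_weight * vector_similarity
            + (1.0 - vector_weight) * term_similarity)

score = composite_similarity(0.8, 0.6)
print(round(score, 4))
```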
### Examples
```python