DRAFT: Miscellaneous updates to HTTP API. Tried to finish off Python API reference but failed. (#2909)

### What problem does this PR solve?



### Type of change


- [x] Documentation Update
This commit is contained in:
writinwaters 2024-10-21 09:47:59 +08:00 committed by GitHub
parent 1d6bcf5aa2
commit 5aec1e3e17
2 changed files with 622 additions and 474 deletions


@ -380,7 +380,7 @@ The ID of the document to retrieve. Defaults to `None`.
#### keywords: `str` #### keywords: `str`
The keywords to match document titles. Defaults to `None`. The keywords used to match document titles. Defaults to `None`.
#### offset: `int` #### offset: `int`
@ -419,7 +419,7 @@ A `Document` object contains the following attributes:
- `created_by`: `str` The creator of the document. Defaults to `""`. - `created_by`: `str` The creator of the document. Defaults to `""`.
- `size`: `int` The document size in bytes. Defaults to `0`. - `size`: `int` The document size in bytes. Defaults to `0`.
- `token_count`: `int` The number of tokens in the document. Defaults to `0`. - `token_count`: `int` The number of tokens in the document. Defaults to `0`.
- `chunk_count`: `int` The number of chunks that the document is split into. Defaults to `0`. - `chunk_count`: `int` The number of chunks in the document. Defaults to `0`.
- `progress`: `float` The current processing progress as a percentage. Defaults to `0.0`. - `progress`: `float` The current processing progress as a percentage. Defaults to `0.0`.
- `progress_msg`: `str` A message indicating the current progress status. Defaults to `""`. - `progress_msg`: `str` A message indicating the current progress status. Defaults to `""`.
- `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`. - `process_begin_at`: `datetime` The start time of document processing. Defaults to `None`.
@ -432,8 +432,8 @@ A `Document` object contains the following attributes:
```python ```python
from ragflow import RAGFlow from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag.create_dataset(name="kb_1") dataset = rag_object.create_dataset(name="kb_1")
filename1 = "~/ragflow.txt" filename1 = "~/ragflow.txt"
blob = open(filename1 , "rb").read() blob = open(filename1 , "rb").read()
@ -482,6 +482,8 @@ dataset.delete_documents(ids=["id_1","id_2"])
DataSet.async_parse_documents(document_ids:list[str]) -> None DataSet.async_parse_documents(document_ids:list[str]) -> None
``` ```
Parses documents in the current dataset.
### Parameters ### Parameters
#### document_ids: `list[str]`, *Required* #### document_ids: `list[str]`, *Required*
@ -490,7 +492,7 @@ The IDs of the documents to parse.
### Returns ### Returns
- Success: No value is returned. - Success: No value is returned.????????????????????
- Failure: `Exception` - Failure: `Exception`
### Examples ### Examples
@ -520,6 +522,8 @@ print("Async bulk parsing initiated.")
DataSet.async_cancel_parse_documents(document_ids:list[str])-> None DataSet.async_cancel_parse_documents(document_ids:list[str])-> None
``` ```
Stops parsing specified documents.
### Parameters ### Parameters
#### document_ids: `list[str]`, *Required* #### document_ids: `list[str]`, *Required*
@ -554,27 +558,80 @@ print("Async bulk parsing cancelled.")
--- ---
## Add chunk
```python
Document.add_chunk(content:str) -> Chunk ?????????????????????
```
Adds a chunk to the current document.
### Parameters
#### content: `str`, *Required*
The text content of the chunk.
#### important_keywords: `list[str]` ??????????????????????
The key terms or phrases to tag with the chunk.
### Returns
- Success: A `Chunk` object.
- Failure: `Exception`.
A `Chunk` object contains the following attributes:
- `id`: `str`
- `content`: `str` Content of the chunk.
- `important_keywords`: `list[str]` A list of key terms or phrases to tag with the chunk.
- `create_time`: `str` The time when the chunk was created (added to the document).
- `create_timestamp`: `float` The timestamp representing the creation time of the chunk, expressed in seconds since January 1, 1970.
- `knowledgebase_id`: `str` The ID of the associated dataset.
- `document_name`: `str` The name of the associated document.
- `document_id`: `str` The ID of the associated document.
- `available`: `int`???? The chunk's availability status in the dataset. Value options:
- `0`: Unavailable
- `1`: Available
### Examples
```python
from ragflow import RAGFlow
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag_object.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
```
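When a document needs several hand-written chunks, the call shown above can be wrapped in a small loop. The following is only a sketch: `add_chunks` and `FakeDocument` are hypothetical names that are not part of the RAGFlow SDK, and the stand-in class exists solely so the snippet runs without a server. Against a real `Document`, the only thing assumed is the `add_chunk(content=...)` call from the signature above.

```python
from typing import Iterable, List


def add_chunks(doc, texts: Iterable[str]) -> List[object]:
    """Add each non-empty text to `doc` as a chunk and return what was created."""
    created = []
    for text in texts:
        text = text.strip()
        if not text:
            continue  # skip blank entries rather than sending empty content
        created.append(doc.add_chunk(content=text))
    return created


class FakeDocument:
    """Hypothetical stand-in for a RAGFlow `Document`; it only records content."""

    def __init__(self):
        self.contents = []

    def add_chunk(self, content: str):
        self.contents.append(content)
        return {"content": content}


fake_doc = FakeDocument()
chunks = add_chunks(fake_doc, ["RAGFlow is a RAG engine.", "  ", "It parses documents."])
print(len(chunks))
```

Skipping blank strings is a design choice of this sketch, not documented SDK behavior.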
---
## List chunks ## List chunks
```python ```python
Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk] Document.list_chunks(keywords: str = None, offset: int = 0, limit: int = -1, id : str = None) -> list[Chunk]
``` ```
Retrieves a list of document chunks. Retrieves a list of chunks from the current document.
### Parameters ### Parameters
#### keywords: `str` #### keywords: `str`
List chunks whose name has the given keywords. Defaults to `None`. The keywords used to match chunk content. Defaults to `None`.
#### offset: `int` #### offset: `int`
The starting index for the chunks to retrieve. Defaults to `1` The starting index for the chunks to retrieve. Defaults to `1`??????
#### limit #### limit
The maximum number of chunks to retrieve. Default: `30` The maximum number of chunks to retrieve. Default: `30`?????????
#### id #### id
@ -598,42 +655,9 @@ for chunk in doc.list_chunks(keywords="rag", offset=0, limit=12):
print(chunk) print(chunk)
``` ```
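For documents with many chunks, `offset` and `limit` can be combined to fetch results in batches. The sketch below assumes `offset` is a zero-based count of chunks to skip, as the signature default suggests; the draft itself marks these defaults as unresolved, so treat that as an assumption. `iter_chunks` and `FakeDocument` are hypothetical names used only for illustration.

```python
from typing import Iterator


def iter_chunks(doc, keywords=None, batch_size: int = 30) -> Iterator[object]:
    """Yield all matching chunks, requesting `batch_size` chunks per call."""
    offset = 0
    while True:
        batch = doc.list_chunks(keywords=keywords, offset=offset, limit=batch_size)
        yield from batch
        if len(batch) < batch_size:
            break  # a short (or empty) batch means there is nothing left
        offset += len(batch)


class FakeDocument:
    """Hypothetical stand-in for a RAGFlow `Document`; chunks are plain strings."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def list_chunks(self, keywords=None, offset=0, limit=-1, id=None):
        matched = [c for c in self._chunks if keywords is None or keywords in c]
        end = None if limit < 0 else offset + limit
        return matched[offset:end]


fake_doc = FakeDocument([f"rag chunk {i}" for i in range(7)] + ["other text"])
collected = list(iter_chunks(fake_doc, keywords="rag", batch_size=3))
print(len(collected))
```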
## Add chunk
```python
Document.add_chunk(content:str) -> Chunk
```
### Parameters
#### content: *Required*
The text content of the chunk.
#### important_keywords :`list[str]`
List the key terms or phrases that are significant or central to the chunk's content.
### Returns
chunk
### Examples
```python
from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
dataset = rag.list_datasets(id="123")
dataset = dataset[0]
doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx")
```
--- ---
## Delete chunk ## Delete chunks
```python ```python
Document.delete_chunks(chunk_ids: list[str]) Document.delete_chunks(chunk_ids: list[str])
@ -657,10 +681,10 @@ The IDs of the chunks to delete. Defaults to `None`. If not specified, all chunk
```python ```python
from ragflow import RAGFlow from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
ds = rag.list_datasets(id="123") dataset = rag_object.list_datasets(id="123")
ds = ds[0] dataset = dataset[0]
doc = ds.list_documents(id="wdfxb5t547d") doc = dataset.list_documents(id="wdfxb5t547d")
doc = doc[0] doc = doc[0]
chunk = doc.add_chunk(content="xxxxxxx") chunk = doc.add_chunk(content="xxxxxxx")
doc.delete_chunks(["id_1","id_2"]) doc.delete_chunks(["id_1","id_2"])
@ -683,7 +707,7 @@ Updates content or configurations for the current chunk.
A dictionary representing the attributes to update, with the following keys: A dictionary representing the attributes to update, with the following keys:
- `"content"`: `str` Content of the chunk. - `"content"`: `str` Content of the chunk.
- `"important_keywords"`: `list[str]` A list of key terms to attach to the chunk. - `"important_keywords"`: `list[str]` A list of key terms or phrases to tag with the chunk.
- `"available"`: `int` The chunk's availability status in the dataset. Value options: - `"available"`: `int` The chunk's availability status in the dataset. Value options:
- `0`: Unavailable - `0`: Unavailable
- `1`: Available - `1`: Available
@ -715,13 +739,15 @@ chunk.update({"content":"sdfx..."})
RAGFlow.retrieve(question:str="", datasets:list[str]=None, documents:list[str]=None, offset:int=1, limit:int=30, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024,rerank_id:str=None,keyword:bool=False,highlight:bool=False) -> list[Chunk] RAGFlow.retrieve(question:str="", datasets:list[str]=None, documents:list[str]=None, offset:int=1, limit:int=30, similarity_threshold:float=0.2, vector_similarity_weight:float=0.3, top_k:int=1024,rerank_id:str=None,keyword:bool=False,highlight:bool=False) -> list[Chunk]
``` ```
???????
### Parameters ### Parameters
#### question: `str` *Required* #### question: `str` *Required*
The user query or query keywords. Defaults to `""`. The user query or query keywords. Defaults to `""`.
#### datasets: `list[str]`, *Required* #### datasets: `list[str]`, *Required*?????
The datasets to search from. The datasets to search from.
@ -735,7 +761,7 @@ The starting index for the documents to retrieve. Defaults to `0`??????.
#### limit: `int` #### limit: `int`
The maximum number of chunks to retrieve. Defaults to `6`. The maximum number of chunks to retrieve. Defaults to `6`.???????????????
#### similarity_threshold: `float` #### similarity_threshold: `float`
@ -749,18 +775,18 @@ The weight of vector cosine similarity. Defaults to `0.3`. If x represents the v
The number of chunks engaged in vector cosine computation. Defaults to `1024`. The number of chunks engaged in vector cosine computation. Defaults to `1024`.
#### rerank_id #### rerank_id: `str`
The ID of the rerank model. Defaults to `None`. The ID of the rerank model. Defaults to `None`.
#### keyword #### keyword: `bool`
Indicates whether keyword-based matching is enabled: Indicates whether keyword-based matching is enabled:
- `True`: Enabled. - `True`: Enabled.
- `False`: Disabled. - `False`: Disabled (default).
#### highlight:`bool` #### highlight: `bool`
Specifies whether to enable highlighting of matched terms in the results (`True`) or not (`False`). Specifies whether to enable highlighting of matched terms in the results (`True`) or not (`False`).
@ -775,16 +801,16 @@ Specifying whether to enable highlighting of matched terms in the results (True)
from ragflow import RAGFlow from ragflow import RAGFlow
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
ds = rag_object.list_datasets(name="ragflow") dataset = rag_object.list_datasets(name="ragflow")
ds = ds[0] dataset = dataset[0]
name = 'ragflow_test.txt' name = 'ragflow_test.txt'
path = './test_data/ragflow_test.txt' path = './test_data/ragflow_test.txt'
rag_object.create_document(ds, name=name, blob=open(path, "rb").read()) rag_object.create_document(dataset, name=name, blob=open(path, "rb").read())
doc = ds.list_documents(name=name) doc = dataset.list_documents(name=name)
doc = doc[0] doc = doc[0]
ds.async_parse_documents([doc.id]) dataset.async_parse_documents([doc.id])
for c in rag_object.retrieve(question="What's ragflow?", for c in rag_object.retrieve(question="What's ragflow?",
datasets=[ds.id], documents=[doc.id], datasets=[dataset.id], documents=[doc.id],
offset=1, limit=30, similarity_threshold=0.2, offset=1, limit=30, similarity_threshold=0.2,
vector_similarity_weight=0.3, vector_similarity_weight=0.3,
top_k=1024 top_k=1024
@ -818,11 +844,11 @@ Creates a chat assistant.
The following shows the attributes of a `Chat` object: The following shows the attributes of a `Chat` object:
#### name: *Required* #### name: `str`, *Required*????????
The name of the chat assistant. Defaults to `"assistant"`. The name of the chat assistant. Defaults to `"assistant"`.
#### avatar #### avatar: `str`
Base64 encoding of the avatar. Defaults to `""`. Base64 encoding of the avatar. Defaults to `""`.
@ -830,7 +856,7 @@ Base64 encoding of the avatar. Defaults to `""`.
The IDs of the associated datasets. Defaults to `[""]`. The IDs of the associated datasets. Defaults to `[""]`.
#### llm #### llm: `Chat.LLM`
The llm of the created chat. Defaults to `None`. When the value is `None`, a dictionary with the following values will be generated as the default. The llm of the created chat. Defaults to `None`. When the value is `None`, a dictionary with the following values will be generated as the default.
@ -849,7 +875,7 @@ An `LLM` object contains the following attributes:
- `max_token`, `int` - `max_token`, `int`
This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`. This sets the maximum length of the model's output, measured in the number of tokens (words or pieces of words). Defaults to `512`.
#### Prompt #### prompt: `Chat.Prompt`
Instructions for the LLM to follow. A `Prompt` object contains the following attributes: Instructions for the LLM to follow. A `Prompt` object contains the following attributes:
@ -876,17 +902,17 @@ Instructions for the LLM to follow. A `Prompt` object contains the following at
```python ```python
from ragflow import RAGFlow from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
kbs = rag.list_datasets(name="kb_1") datasets = rag_object.list_datasets(name="kb_1")
list_kb=[] dataset_ids = []
for kb in kbs: for dataset in datasets:
list_kb.append(kb.id) dataset_ids.append(dataset.id)
assi = rag.create_chat("Miss R", knowledgebases=list_kb) assistant = rag_object.create_chat("Miss R", knowledgebases=dataset_ids)
``` ```
--- ---
## Update chat ## Update chat assistant
```python ```python
Chat.update(update_message: dict) Chat.update(update_message: dict)
@ -934,15 +960,15 @@ A dictionary representing the attributes to update, with the following keys:
```python ```python
from ragflow import RAGFlow from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
knowledge_base = rag.list_datasets(name="kb_1") datasets = rag_object.list_datasets(name="kb_1")
assistant = rag.create_chat("Miss R", knowledgebases=knowledge_base) assistant = rag_object.create_chat("Miss R", knowledgebases=datasets)
assistant.update({"name": "Stefan", "llm": {"temperature": 0.8}, "prompt": {"top_n": 8}}) assistant.update({"name": "Stefan", "llm": {"temperature": 0.8}, "prompt": {"top_n": 8}})
``` ```
--- ---
## Delete chats ## Delete chat assistants
```python ```python
RAGFlow.delete_chats(ids: list[str] = None) RAGFlow.delete_chats(ids: list[str] = None)
@ -966,13 +992,13 @@ The IDs of the chat assistants to delete. Defaults to `None`. If not specified,
```python ```python
from ragflow import RAGFlow from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
rag.delete_chats(ids=["id_1","id_2"]) rag_object.delete_chats(ids=["id_1","id_2"])
``` ```
--- ---
## List chats ## List chat assistants
```python ```python
RAGFlow.list_chats( RAGFlow.list_chats(
@ -989,29 +1015,32 @@ Retrieves a list of chat assistants.
### Parameters ### Parameters
#### page #### page: `int`
Specifies the page on which the chat assistants will be displayed. Defaults to `1`. Specifies the page on which the chat assistants will be displayed. Defaults to `1`.
#### page_size #### page_size: `int`
The number of chat assistants on each page. Defaults to `1024`. The number of chat assistants on each page. Defaults to `1024`.
#### order_by #### orderby: `str`
The attribute by which the results are sorted. Defaults to `"create_time"`. The attribute by which the results are sorted. Available options:
#### desc - `"create_time"` (default)
- `"update_time"`
#### desc: `bool`
Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `True`. Indicates whether the retrieved chat assistants should be sorted in descending order. Defaults to `True`.
#### id: `string` #### id: `str`
The ID of the chat to retrieve. Defaults to `None`. The ID of the chat assistant to retrieve. Defaults to `None`.
#### name: `string` #### name: `str`
The name of the chat to retrieve. Defaults to `None`. The name of the chat assistant to retrieve. Defaults to `None`.
### Returns ### Returns
@ -1031,7 +1060,7 @@ for assistant in rag_object.list_chats():
--- ---
:::tip API GROUPING :::tip API GROUPING
Chat-session APIs Chat Session APIs
::: :::
--- ---
@ -1046,7 +1075,7 @@ Creates a chat session.
### Parameters ### Parameters
#### name #### name: `str`
The name of the chat session to create. The name of the chat session to create.
@ -1064,12 +1093,14 @@ The name of the chat session to create.
```python ```python
from ragflow import RAGFlow from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag.list_chats(name="Miss R") assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0] assistant = assistant[0]
session = assistant.create_session() session = assistant.create_session()
``` ```
---
## Update session ## Update session
```python ```python
@ -1096,8 +1127,8 @@ A dictionary representing the attributes to update, with only one key:
```python ```python
from ragflow import RAGFlow from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag.list_chats(name="Miss R") assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0] assistant = assistant[0]
session = assistant.create_session("session_name") session = assistant.create_session("session_name")
session.update({"name": "updated_name"}) session.update({"name": "updated_name"})
@ -1105,6 +1136,101 @@ session.update({"name": "updated_name"})
--- ---
## List sessions
```python
Chat.list_sessions(
page: int = 1,
page_size: int = 1024,
orderby: str = "create_time",
desc: bool = True,
id: str = None,
name: str = None
) -> list[Session]
```
Lists sessions associated with the current chat assistant.
### Parameters
#### page: `int`
Specifies the page on which the sessions will be displayed. Defaults to `1`.
#### page_size: `int`
The number of sessions on each page. Defaults to `1024`.
#### orderby: `str`
The field by which sessions should be sorted. Available options:
- `"create_time"` (default)
- `"update_time"`
#### desc: `bool`
Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.
#### id: `str`
The ID of the chat session to retrieve. Defaults to `None`.
#### name: `str`
The name of the chat session to retrieve. Defaults to `None`.
### Returns
- Success: A list of `Session` objects associated with the current chat assistant.
- Failure: `Exception`.
### Examples
```python
from ragflow import RAGFlow
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
for session in assistant.list_sessions():
print(session)
```
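When an assistant has more sessions than fit on one page, `page` and `page_size` can be stepped through in a loop. This is a minimal sketch, assuming pages are 1-based as the default `page=1` implies. `iter_sessions`, `VALID_ORDERBY`, and `FakeChat` are hypothetical names that are not part of the SDK; the stand-in class only mimics the parameters listed above so the snippet runs offline.

```python
from typing import Iterator

VALID_ORDERBY = ("create_time", "update_time")


def iter_sessions(assistant, page_size: int = 100, orderby: str = "create_time",
                  desc: bool = True) -> Iterator[object]:
    """Yield every session of `assistant`, fetching one page at a time."""
    if orderby not in VALID_ORDERBY:
        raise ValueError(f"orderby must be one of {VALID_ORDERBY}")
    page = 1
    while True:
        sessions = assistant.list_sessions(
            page=page, page_size=page_size, orderby=orderby, desc=desc
        )
        yield from sessions
        if len(sessions) < page_size:
            break  # a short (or empty) page means there is nothing left
        page += 1


class FakeChat:
    """Hypothetical stand-in for a RAGFlow `Chat`; sessions are plain dicts."""

    def __init__(self, sessions):
        self._sessions = list(sessions)

    def list_sessions(self, page=1, page_size=1024, orderby="create_time",
                      desc=True, id=None, name=None):
        ordered = sorted(self._sessions, key=lambda s: s[orderby], reverse=desc)
        start = (page - 1) * page_size
        return ordered[start:start + page_size]


fake_chat = FakeChat(
    {"name": f"s{i}", "create_time": i, "update_time": -i} for i in range(5)
)
names = [s["name"] for s in iter_sessions(fake_chat, page_size=2)]
print(names)
```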
---
## Delete sessions
```python
Chat.delete_sessions(ids:list[str] = None)
```
Deletes sessions by ID.
### Parameters
#### ids: `list[str]`
The IDs of the sessions to delete. Defaults to `None`. If not specified, all sessions associated with the current chat assistant will be deleted.
### Returns
- Success: No value is returned.
- Failure: `Exception`
### Examples
```python
from ragflow import RAGFlow
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
assistant.delete_sessions(ids=["id_1","id_2"])
```
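Because omitting `ids` deletes every session of the assistant, callers may want a guard against accidentally passing an empty list. The sketch below shows one way to do that; `delete_sessions_strict` and `FakeChat` are hypothetical names used for illustration, and only the `delete_sessions(ids=...)` call comes from the signature above.

```python
from typing import Iterable, List


def delete_sessions_strict(assistant, ids: Iterable[str]) -> List[str]:
    """Delete the given sessions, refusing to run when no usable ID is supplied."""
    cleaned = [session_id for session_id in (ids or []) if session_id]
    if not cleaned:
        # Passing nothing through would fall back to "delete all sessions".
        raise ValueError("no session IDs given; refusing to delete every session")
    assistant.delete_sessions(ids=cleaned)
    return cleaned


class FakeChat:
    """Hypothetical stand-in for a RAGFlow `Chat`; it tracks session IDs in a set."""

    def __init__(self, session_ids):
        self.session_ids = set(session_ids)

    def delete_sessions(self, ids=None):
        if ids is None:
            self.session_ids.clear()
        else:
            self.session_ids -= set(ids)


fake_chat = FakeChat(["id_1", "id_2", "id_3"])
deleted = delete_sessions_strict(fake_chat, ["id_1", "", "id_2"])
print(deleted, sorted(fake_chat.session_ids))
```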
---
## Chat ## Chat
```python ```python
@ -1115,11 +1241,11 @@ Asks a question to start a conversation.
### Parameters ### Parameters
#### question *Required* #### question: `str` *Required*
The question to start an AI chat. Defaults to `None`. The question to start an AI chat.
#### stream #### stream: `bool`
Indicates whether to output responses in a streaming way: Indicates whether to output responses in a streaming way:
@ -1172,8 +1298,8 @@ A list of `Chunk` objects representing references to the message, each containin
```python ```python
from ragflow import RAGFlow from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380") rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag.list_chats(name="Miss R") assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0] assistant = assistant[0]
session = assistant.create_session() session = assistant.create_session()
@ -1189,98 +1315,3 @@ while True:
print(answer.content[len(cont):], end='', flush=True) print(answer.content[len(cont):], end='', flush=True)
cont = answer.content cont = answer.content
``` ```
---
## List sessions
```python
Chat.list_sessions(
page: int = 1,
page_size: int = 1024,
orderby: str = "create_time",
desc: bool = True,
id: str = None,
name: str = None
) -> list[Session]
```
Lists sessions associated with the current chat assistant.
### Parameters
#### page
Specifies the page on which the sessions will be displayed. Defaults to `1`.
#### page_size
The number of sessions on each page. Defaults to `1024`.
#### orderby
The field by which sessions should be sorted. Available options:
- `"create_time"` (default)
- `"update_time"`
#### desc
Indicates whether the retrieved sessions should be sorted in descending order. Defaults to `True`.
#### id
The ID of the chat session to retrieve. Defaults to `None`.
#### name
The name of the chat to retrieve. Defaults to `None`.
### Returns
- Success: A list of `Session` objects associated with the current chat assistant.
- Failure: `Exception`.
### Examples
```python
from ragflow import RAGFlow
rag_object = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag_object.list_chats(name="Miss R")
assistant = assistant[0]
for session in assistant.list_sessions():
print(session)
```
---
## Delete sessions
```python
Chat.delete_sessions(ids:list[str] = None)
```
Deletes sessions by ID.
### Parameters
#### ids: `list[str]`
The IDs of the sessions to delete. Defaults to `None`. If not specified, all sessions associated with the current chat assistant will be deleted.
### Returns
- Success: No value is returned.
- Failure: `Exception`
### Examples
```python
from ragflow import RAGFlow
rag = RAGFlow(api_key="<YOUR_API_KEY>", base_url="http://<YOUR_BASE_URL>:9380")
assistant = rag.list_chats(name="Miss R")
assistant = assistant[0]
assistant.delete_sessions(ids=["id_1","id_2"])
```