mirror of
https://git.mirrors.martin98.com/https://github.com/infiniflow/ragflow.git
synced 2025-04-29 15:26:04 +08:00

### What problem does this PR solve? Updated UI descriptions for delimiters and recommended chunk size ### Type of change - [x] Documentation Update
47 lines
2.9 KiB
Plaintext
---
sidebar_position: 2
slug: /accelerate_question_answering
---

# Accelerate answering
import APITable from '@site/src/components/APITable';

A checklist to speed up question answering.

---

Some settings can add significant time to each response. If question answering is often slow, work through this checklist:

- In the **Prompt Engine** tab of your **Chat Configuration** dialogue, disabling **Multi-turn optimization** will reduce the time required to get an answer from the LLM.
- In the **Prompt Engine** tab of your **Chat Configuration** dialogue, leaving the **Rerank model** field empty will significantly decrease retrieval time.
- When using a rerank model, ensure you have a GPU for acceleration; otherwise, the reranking process will be *prohibitively* slow.

:::tip NOTE
Rerank models are essential in certain scenarios. There is always a trade-off between speed and result quality; weigh the pros and cons for your specific case.
:::

- In the **Assistant Setting** tab of your **Chat Configuration** dialogue, disabling **Keyword analysis** will reduce the time to receive an answer from the LLM.
- When chatting with your chat assistant, click the light bulb icon above the *current* dialogue and scroll down the popup window to view the time taken for each task:

```mdx-code-block
<APITable>
```

| Item name         | Description                                                                                   |
| ----------------- | --------------------------------------------------------------------------------------------- |
| Total             | Total time spent on this conversation round, including chunk retrieval and answer generation. |
| Check LLM         | Time to validate the specified LLM.                                                           |
| Create retriever  | Time to create a chunk retriever.                                                             |
| Bind embedding    | Time to initialize an embedding model instance.                                               |
| Bind LLM          | Time to initialize an LLM instance.                                                           |
| Tune question     | Time to optimize the user query using the context of the multi-turn conversation.             |
| Bind reranker     | Time to initialize a reranker model instance for chunk retrieval.                             |
| Generate keywords | Time to extract keywords from the user query.                                                 |
| Retrieval         | Time to retrieve the chunks.                                                                  |
| Generate answer   | Time to generate the answer.                                                                  |

```mdx-code-block
</APITable>
```
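To connect the timing breakdown back to the checklist, the following is a minimal sketch in Python. It assumes you have copied the per-task timings from the popup into a dictionary by hand; the function names, the `SETTING_HINTS` mapping, and the sample numbers are all hypothetical and are not part of RAGFlow or its API. The mapping reflects one reasonable reading of the table: **Tune question** relates to **Multi-turn optimization**, **Bind reranker** and **Retrieval** relate to the **Rerank model**, and **Generate keywords** relates to **Keyword analysis**.

```python
# Hypothetical helper: not part of RAGFlow. Timings are entered by hand
# from the light bulb popup, in seconds.

# Assumed mapping from timing items to the checklist settings on this page.
SETTING_HINTS = {
    "Tune question": "Disable Multi-turn optimization (Prompt Engine tab).",
    "Bind reranker": "Leave the Rerank model field empty, or use a GPU.",
    "Retrieval": "Leave the Rerank model field empty, or use a GPU.",
    "Generate keywords": "Disable Keyword analysis (Assistant Setting tab).",
}


def slowest_stages(timings, top_n=3):
    """Return the top_n (name, seconds) pairs, excluding the 'Total' row."""
    stages = {name: secs for name, secs in timings.items() if name != "Total"}
    ranked = sorted(stages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


def suggest(timings, top_n=3):
    """Attach a checklist hint (or None) to each of the slowest stages."""
    return [
        (name, secs, SETTING_HINTS.get(name))
        for name, secs in slowest_stages(timings, top_n)
    ]


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    sample = {
        "Total": 10.0,
        "Retrieval": 6.0,
        "Tune question": 2.5,
        "Generate answer": 1.0,
        "Bind LLM": 0.1,
    }
    for name, secs, hint in suggest(sample, top_n=2):
        print(f"{name}: {secs:.1f}s -> {hint or 'no checklist item applies'}")
```

Stages without an entry in the mapping, such as **Generate answer**, return no hint, because nothing in the checklist above targets them directly.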