From 3cffadc7a2ceb2cd33da329c36e05c275189e19f Mon Sep 17 00:00:00 2001
From: writinwaters <93570324+writinwaters@users.noreply.github.com>
Date: Tue, 18 Feb 2025 19:29:40 +0800
Subject: [PATCH] Added an FAQ (#5092)

### What problem does this PR solve?

### Type of change

- [x] Documentation Update
---
 docs/guides/accelerate_question_answering.mdx |  2 +-
 .../construct_knowledge_graph.md              | 12 ++++----
 docs/references/faq.md                        | 29 ++++++++++++++++++
 docs/references/http_api_reference.md         | 30 +++++++++++--------
 docs/references/python_api_reference.md       |  2 +-
 docs/release_notes.md                         |  2 +-
 web/src/locales/en.ts                         | 10 +++----
 7 files changed, 61 insertions(+), 26 deletions(-)

diff --git a/docs/guides/accelerate_question_answering.mdx b/docs/guides/accelerate_question_answering.mdx
index 8edbd60ae..5daf604bb 100644
--- a/docs/guides/accelerate_question_answering.mdx
+++ b/docs/guides/accelerate_question_answering.mdx
@@ -16,7 +16,7 @@ Please note that some of your settings may consume a significant amount of time.
 
 - Use GPU to reduce embedding time.
 - On the configuration page of your knowledge base, switch off **Use RAPTOR to enhance retrieval**.
-- The **Knowledge Graph** chunk method (GraphRAG) is time-consuming.
+- Extracting a knowledge graph (GraphRAG) is time-consuming.
 - Disable **Auto-keyword** and **Auto-question** on the configuration page of yor knowledge base, as both depend on the LLM.
 
 ## 2. Accelerate question answering
diff --git a/docs/guides/configure_knowledge_base/construct_knowledge_graph.md b/docs/guides/configure_knowledge_base/construct_knowledge_graph.md
index a7ba93b31..f720a9e44 100644
--- a/docs/guides/configure_knowledge_base/construct_knowledge_graph.md
+++ b/docs/guides/configure_knowledge_base/construct_knowledge_graph.md
@@ -44,23 +44,23 @@ The method to use to construct knowledge graph:
 
 ### Entity resolution
 
-Whether to enable entity resolution. You can think of this as an entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more accurate graph.
+Whether to enable entity resolution. You can think of this as an entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more effective graph.
 
-- (Default) Disable entity resolution. This option consumes fewer tokens.
-- Enable entity resolution.
+- (Default) Disable entity resolution.
+- Enable entity resolution. This option consumes more tokens.
 
 ### Community report generation
 
 In a knowledge graph, a community is a cluster of entities linked by relationships. You can have the LLM generate an abstract for each community, known as a community report. See [here](https://www.microsoft.com/en-us/research/blog/graphrag-improving-global-search-via-dynamic-community-selection/) for more information. This indicates whether to generate community reports:
 
-- Generate community reports.
-- (Default) Do not generate community reports. This options consumes fewer tokens.
+- Generate community reports. This option consumes more tokens.
+- (Default) Do not generate community reports.
 
 ## Procedure
 
 1. On the **Configuration** page of your knowledge base, switch on **Extract knowledge graph** or adjust its settings as needed, and click **Save** to confirm your changes.
-   - *The default GraphRAG configurations for your knowlege base are now set and files uploaded from this point onward will automatically use these settings during parsing.*
+   - *The default knowledge graph configurations for your knowledge base are now set, and files uploaded from this point onward will automatically use these settings during parsing.*
    - *Files parsed before this update will retain their original knowledge graph settings.*
 2. The knowledge graph of your knowlege base does *not* automatically update *until* a newly uploaded file is parsed.
 
diff --git a/docs/references/faq.md b/docs/references/faq.md
index 219666b36..3a7ae11c2 100644
--- a/docs/references/faq.md
+++ b/docs/references/faq.md
@@ -22,6 +22,35 @@ The "garbage in garbage out" status quo remains unchanged despite the fact that
 
 ---
 
+### Where can I find the version of RAGFlow? How do I interpret it?
+
+You can find the RAGFlow version number on the **System** page of the UI:
+
+![Image](https://github.com/user-attachments/assets/20cf7213-2537-4e18-a88c-4dadf6228c6b)
+
+If you build RAGFlow from source, the version number also appears in the system log:
+
+```
+        ____   ___    ______ ______ __
+       / __ \ /   |  / ____// ____// /____  _      __
+      / /_/ // /| | / / __ / /_   / // __ \| | /| / /
+     / _, _// ___ |/ /_/ // __/  / // /_/ /| |/ |/ /
+    /_/ |_|/_/  |_|\____//_/    /_/ \____/ |__/|__/
+
+2025-02-18 10:10:43,835 INFO 1445658 RAGFlow version: v0.16.0-50-g6daae7f2 full
+```
+
+Where:
+
+- `v0.16.0`: The officially published release.
+- `50`: The number of git commits since the official release.
+- `g6daae7f2`: `g` is a prefix, and `6daae7f2` is the abbreviated ID of the current commit.
+- `full`/`slim`: The RAGFlow edition.
+  - `full`: The full RAGFlow edition.
+  - `slim`: The RAGFlow edition without embedding models and Python packages.
+
+---
+
 ### Why does it take longer for RAGFlow to parse a document than LangChain?
 
 We put painstaking effort into document pre-processing tasks like layout analysis, table structure recognition, and OCR (Optical Character Recognition) using our vision models. This contributes to the additional time required.
diff --git a/docs/references/http_api_reference.md b/docs/references/http_api_reference.md
index 36aad7bb9..35cade49b 100644
--- a/docs/references/http_api_reference.md
+++ b/docs/references/http_api_reference.md
@@ -2178,10 +2178,12 @@ Creates a session with an agent.
 - Body:
   - the required parameters:`str`
   - other parameters:
-    The parameters in the begin component.
+    The parameters set in the **Begin** component.
 
 ##### Request example
-If `begin` component in the agent doesn't have required parameters:
+
+If the **Begin** component in your agent does not have required parameters:
+
 ```bash
 curl --request POST \
      --url http://{address}/api/v1/agents/{agent_id}/sessions \
@@ -2190,7 +2192,9 @@ curl --request POST \
      --data '{
      }'
 ```
-If `begin` component in the agent has required parameters:
+
+If the **Begin** component in your agent has required parameters:
+
 ```bash
 curl --request POST \
      --url http://{address}/api/v1/agents/{agent_id}/sessions \
@@ -2201,7 +2205,9 @@ curl --request POST \
      "file":"Who are you"
      }'
 ```
-If `begin` component in the agent has required file parameters:
+
+If the **Begin** component in your agent has required file parameters:
+
 ```bash
 curl --request POST \
      --url http://{address}/api/v1/agents/{agent_id}/sessions?user_id={user_id} \
@@ -2215,7 +2221,7 @@ curl --request POST \
 
 - `agent_id`: (*Path parameter*) The ID of the associated agent.
 - `user_id`: (*Filter parameter*), string
-  The optional user-defined ID for parsing docs(especially images) when creating session while uploading files.
+  The optional user-defined ID for parsing docs (especially images) when creating a session while uploading files.
 
 #### Response
@@ -2367,7 +2373,7 @@ Asks a specified agent a question to start an AI-powered conversation.
   - `"user_id"`: `string`(optional)
   - other parameters: `string`
 ##### Request example
-If the `begin` component doesn't have parameters, the following code will create a session.
+If the **Begin** component doesn't have parameters, the following code will create a session.
 ```bash
 curl --request POST \
      --url http://{address}/api/v1/agents/{agent_id}/completions \
@@ -2377,7 +2383,7 @@ curl --request POST \
      {
      }'
 ```
-If the `begin` component have parameters, the following code will create a session.
+If the **Begin** component has parameters, the following code will create a session.
 ```bash
 curl --request POST \
      --url http://{address}/api/v1/agents/{agent_id}/completions \
@@ -2403,7 +2409,6 @@ curl --request POST \
      }'
 ```
-
 ##### Request Parameters
 
 - `agent_id`: (*Path parameter*), `string`
@@ -2419,9 +2424,10 @@ curl --request POST \
 - `"user_id"`: (*Body parameter*), `string`
   The optional user-defined ID. Valid *only* when no `session_id` is provided.
 - Other parameters: (*Body Parameter*)
-  The parameters in the begin component.
+  Parameters specified in the **Begin** component.
+
 #### Response
-success without `session_id` provided and with no parameters in the `begin` component:
+Success without `session_id` provided and with no parameters in the **Begin** component:
 ```json
 data:{
     "code": 0,
@@ -2439,7 +2445,7 @@ data:{
     "data": true
 }
 ```
-Success without `session_id` provided and with parameters in the `begin` component:
+Success without `session_id` provided and with parameters in the **Begin** component:
 
 ```json
 data:{
@@ -2475,7 +2481,7 @@ data:{
 }
 data:
 ```
-Success with parameters in the `begin` component:
+Success with parameters in the **Begin** component:
 ```json
 data:{
     "code": 0,
diff --git a/docs/references/python_api_reference.md b/docs/references/python_api_reference.md
index dc24f1cba..f28b39a96 100644
--- a/docs/references/python_api_reference.md
+++ b/docs/references/python_api_reference.md
@@ -1461,7 +1461,7 @@ In streaming mode, not all responses include a reference, as this depends on the
 
 ##### question: `str`
 
-The question to start an AI-powered conversation. If the `begin` component takes parameters, a question is not required.
+The question to start an AI-powered conversation. If the **Begin** component takes parameters, a question is not required.
 
 ##### stream: `bool`
 
diff --git a/docs/release_notes.md b/docs/release_notes.md
index 44a0986c4..87903322f 100644
--- a/docs/release_notes.md
+++ b/docs/release_notes.md
@@ -14,7 +14,7 @@ Released on February 6, 2025.
 
 ### New features
 
 - Supports DeepSeek R1 and DeepSeek V3.
-- GraphRAG refactor: Knowledge graph is dynamically built on an entire knowledge base (dataset) rather than on an individual file, and automatically updated when files are added or removed.
+- GraphRAG refactor: Knowledge graph is dynamically built on an entire knowledge base (dataset) rather than on an individual file, and automatically updated when files are added or removed. See [here](https://ragflow.io/docs/dev/construct_knowledge_graph).
 - Adds an **Iteration** agent component and a **Research report generator** agent template.
 - New UI language: Portuguese.
 - Allows setting metadata for a specific file in a knowledge base to support AI-powered chats.
diff --git a/web/src/locales/en.ts b/web/src/locales/en.ts
index 9eb8c59da..7e2d7b28a 100644
--- a/web/src/locales/en.ts
+++ b/web/src/locales/en.ts
@@ -369,15 +369,15 @@ This procedure will improve precision of retrieval by adding more information to
     addTag: 'Add tag',
     useGraphRag: 'Extract knowledge graph',
     useGraphRagTip:
-      'After files being chunked, all the chunks will be used for knowlege graph generation which helps inference of multi-hop and complex problems a lot.',
+      'Construct a knowledge graph over extracted file chunks to enhance multi-hop question answering.',
     graphRagMethod: 'Method',
-    graphRagMethodTip: `Light: the entity and relation extraction prompt is from GitHub - HKUDS/LightRAG: "LightRAG: Simple and Fast Retrieval-Augmented Generation"
- General: the entity and relation extraction prompt is from GitHub - microsoft/graphrag: A modular graph-based Retrieval-Augmented Generation (RAG) system`,
+    graphRagMethodTip: `Light: (Default) Use prompts provided by github.com/HKUDS/LightRAG to extract entities and relationships. This option consumes fewer tokens, less memory, and fewer computational resources.
+ General: Use prompts provided by github.com/microsoft/graphrag to extract entities and relationships.`,
     resolution: 'Entity resolution',
-    resolutionTip: `The resolution procedure would merge entities with the same meaning together which allows the graph conciser and more accurate. Entities as following should be merged: President Trump, Donald Trump, Donald J. Trump, Donald John Trump`,
+    resolutionTip: `An entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more accurate graph`,
     community: 'Community reports generation',
     communityTip:
-      'Chunks are clustered into hierarchical communities with entities and relationships connecting each segment up through higher levels of abstraction. We then use an LLM to generate a summary of each community, known as a community report. More: https://www.microsoft.com/en-us/research/blog/graphrag-improving-global-search-via-dynamic-community-selection/',
+      'In a knowledge graph, a community is a cluster of entities linked by relationships. You can have the LLM generate an abstract for each community, known as a community report. See here for more information: https://www.microsoft.com/en-us/research/blog/graphrag-improving-global-search-via-dynamic-community-selection/',
   },
   chunk: {
     chunk: 'Chunk',