Added a graphrag guide (#4978)

### What problem does this PR solve?



### Type of change


- [x] Documentation Update
writinwaters 2025-02-18 13:42:06 +08:00 committed by GitHub
parent d6ba4bd255
commit 3e0bc9e36b
8 changed files with 86 additions and 9 deletions

View File

@@ -173,7 +173,7 @@ releases! 🌟
3. Start up the server using the pre-built Docker images:
> The command below downloads the `v0.16.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download an RAGFlow edition different from `v0.16.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0` for the full edition `v0.16.0`.
> The command below downloads the `v0.16.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.16.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0` for the full edition `v0.16.0`.
```bash
$ cd ragflow
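# To run an edition other than v0.16.0-slim, first update RAGFLOW_IMAGE in docker/.env,
# e.g. RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0 for the full edition.
# Then start the server (compose file path assumed; adjust to your checkout):
$ docker compose -f docker/docker-compose.yml up -d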

View File

@@ -1,5 +1,5 @@
# The RAGFlow team does not actively maintain docker-compose-gpu.yml, so use it at your own risk.
# However, you are welcome to file a pull request to improve it.
# Pull requests to improve it are welcome.
include:
- ./docker-compose-base.yml

View File

@@ -15,7 +15,7 @@ Please note that some of your settings may consume a significant amount of time.
## 1. Accelerate document indexing
- Use GPU to reduce embedding time.
- On the configuration page of your knowledge base, toggle off **Use RAPTOR to enhance retrieval**.
- On the configuration page of your knowledge base, switch off **Use RAPTOR to enhance retrieval**.
- The **Knowledge Graph** chunk method (GraphRAG) is time-consuming.
- Disable **Auto-keyword** and **Auto-question** on the configuration page of your knowledge base, as both depend on the LLM.

View File

@@ -0,0 +1,76 @@
---
sidebar_position: 2
slug: /construct_knowledge_graph
---
# Construct knowledge graph
Generate a knowledge graph for your knowledge base.
---
To enhance multi-hop question-answering, RAGFlow adds a knowledge graph construction step between data extraction and indexing, as illustrated below. This step creates additional chunks from existing ones generated by your specified chunk method.
![Image](https://github.com/user-attachments/assets/edf0528d-cb46-46fc-aef4-edb98996949b)
As of v0.16.0, RAGFlow supports constructing a knowledge graph on a knowledge base, allowing you to build a *unified* graph across multiple files within that knowledge base. The generated graph updates automatically whenever a newly uploaded file is parsed.
:::danger WARNING
Constructing a knowledge graph requires significant memory, computational resources, and tokens.
:::
## Scenarios
Knowledge graphs are especially useful for multi-hop question-answering involving *nested* logic. They outperform traditional extraction approaches when you are performing question answering on books or works with complex entities and relationships.
## Prerequisites
The system's default chat model is used to generate the knowledge graph. Before proceeding, ensure that you have a chat model properly configured:
![Image](https://github.com/user-attachments/assets/6bc34279-68c3-4d99-8d20-b7bd1dafc1c1)
## Configurations
### Entity types (*Required*)
The types of entities to extract from your knowledge base. The default types are: **organization**, **person**, **event**, and **category**. Add or remove types to suit your specific knowledge base.
### Method
The method to use to construct the knowledge graph:
- **General**: Use prompts provided by [GraphRAG](https://github.com/microsoft/graphrag) to extract entities and relationships.
- **Light**: (Default) Use prompts provided by [LightRAG](https://github.com/HKUDS/LightRAG) to extract entities and relationships. This option consumes fewer tokens, less memory, and fewer computational resources.
### Entity resolution
Whether to enable entity resolution. You can think of this as an entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more accurate graph.
- (Default) Disable entity resolution. This option consumes fewer tokens.
- Enable entity resolution.
### Community report generation
In a knowledge graph, a community is a cluster of entities linked by relationships. You can have the LLM generate an abstract for each community, known as a community report. See [here](https://www.microsoft.com/en-us/research/blog/graphrag-improving-global-search-via-dynamic-community-selection/) for more information. This indicates whether to generate community reports:
- Generate community reports.
- (Default) Do not generate community reports. This option consumes fewer tokens.
## Procedure
1. On the **Configuration** page of your knowledge base, switch on **Extract knowledge graph** or adjust its settings as needed, and click **Save** to confirm your changes.
- *The default GraphRAG configurations for your knowledge base are now set, and files uploaded from this point onward will automatically use these settings during parsing.*
- *Files parsed before this update will retain their original knowledge graph settings.*
2. Parse a newly uploaded file. *The knowledge graph of your knowledge base does not update until a newly uploaded file is parsed.*
_A **Knowledge Graph** entry appears under **Configuration** once a knowledge graph is created._
3. Click **Knowledge Graph** to view the details of the generated graph.
## Frequently asked questions
### Can I have different knowledge graph settings for different files in my knowledge base?
Yes, you can. Different files can use different knowledge graph settings, but only one graph is generated per knowledge base: the smaller graphs of your files are *combined* into one big, unified graph at the end of the graph extraction process.

View File

@@ -1,5 +1,5 @@
---
sidebar_position: 0
sidebar_position: 1
slug: /set_metada
---
@@ -9,13 +9,13 @@ Add metadata to an uploaded file
---
On the **Dataset** page of your knowledge base, you can add metadata to any uploaded file. This approach enables you to 'tag' additional information like URL, author, and date, to an existing file or dataset. In an AI-powered chat, such information will be sent to the LLM with the retrieved chunks for content generation.
On the **Dataset** page of your knowledge base, you can add metadata to any uploaded file. This approach enables you to 'tag' additional information like URL, author, date, and more to an existing file or dataset. In an AI-powered chat, such information will be sent to the LLM with the retrieved chunks for content generation.
For example, if you have a dataset of HTML files and want the LLM to cite the source URL when responding to your query, add a `"url"` parameter to each file's metadata.
![Image](https://github.com/user-attachments/assets/78cb5035-e96c-43f9-82d7-8fef1b68c843)
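For example, a metadata entry for such an HTML file might look like the following (the `url` key comes from the example above; the other fields and all values are illustrative placeholders):
```json
{
  "url": "https://example.com/articles/source-page.html",
  "author": "Jane Doe",
  "date": "2025-02-06"
}
```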
:::note TIP
:::tip NOTE
Ensure that your metadata is in JSON format; otherwise, your updates will not be applied.
:::

View File

@@ -185,7 +185,7 @@ This section provides instructions on setting up the RAGFlow server on Linux. If
3. Use the pre-built Docker images and start up the server:
:::tip NOTE
The command below downloads the `v0.16.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download an RAGFlow edition different from `v0.15.1-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.15.1` for the full edition `v0.15.1`.
The command below downloads the `v0.16.0-slim` edition of the RAGFlow Docker image. Refer to the following table for descriptions of different RAGFlow editions. To download a RAGFlow edition different from `v0.16.0-slim`, update the `RAGFLOW_IMAGE` variable accordingly in **docker/.env** before using `docker compose` to start the server. For example: set `RAGFLOW_IMAGE=infiniflow/ragflow:v0.16.0` for the full edition `v0.16.0`.
:::
```bash
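# A typical startup sequence (file paths assumed; adjust to your checkout):
$ cd ragflow/docker
$ docker compose -f docker-compose.yml up -d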

View File

@@ -87,7 +87,7 @@ Yes, we support enhancing user queries based on existing context of an ongoing c
1. On the **Chat** page, hover over the desired assistant and select **Edit**.
2. In the **Chat Configuration** popup, click the **Prompt Engine** tab.
3. Toggle on **Multi-turn optimization** to enable this feature.
3. Switch on **Multi-turn optimization** to enable this feature.
---

View File

@@ -19,6 +19,7 @@ Released on February 6, 2025.
- New UI language: Portuguese.
- Allows setting metadata for a specific file in a knowledge base to support AI-powered chats.
- Upgrades RAGFlow's document engine [Infinity](https://github.com/infiniflow/infinity) to v0.6.0.dev3.
- Supports GPU acceleration for DeepDoc (see [docker-compose-gpu.yml](https://github.com/infiniflow/ragflow/blob/main/docker/docker-compose-gpu.yml)).
- Supports creating and referencing a **Tag** knowledge base as a key milestone towards bridging the semantic gap between query and response.
:::danger IMPORTANT
@@ -96,7 +97,7 @@ Released on December 18, 2024.
### Improvements
- Upgrades the Document Layout Analysis model in Deepdoc.
- Upgrades the Document Layout Analysis model in DeepDoc.
- Significantly enhances the retrieval performance when using [Infinity](https://github.com/infiniflow/infinity) as the document engine.
### Related APIs