Miscellaneous updates (#5670)

### What problem does this PR solve?

#5625 #5614

### Type of change

- [x] Documentation Update
2025-08-14 09:45:58 +08:00 · 2025-03-06 09:55:27 +08:00 · 2025-03-06 09:55:27 +08:00 · 5f62f0c9d7
commit 5f62f0c9d7
parent a54843cc65
8 changed files with 29 additions and 23 deletions
--- a/docs/develop/launch_ragflow_from_source.md
+++ b/docs/develop/launch_ragflow_from_source.md
@@ -3,7 +3,7 @@ sidebar_position: 2
 slug: /launch_ragflow_from_source
 ---

-# Launch RAGFlow service from source
+# Launch service from source

 A guide explaining how to set up a RAGFlow service from its source code. By following this guide, you'll be able to debug using the source code.

--- a/docs/guides/agent/embed_agent_into_webpage.md
+++ b/docs/guides/agent/embed_agent_into_webpage.md
@@ -7,6 +7,10 @@ slug: /embed_agent_into_webpage

 You can use iframe to embed an agent into a third-party webpage.

+:::caution WARNING
+If your agent's **Begin** component takes a key of **file** type (a **file** type variable), you *cannot* embed it into a webpage.
+:::
+
 1. Before proceeding, you must [acquire an API key](../models/llm_api_key_setup.md); otherwise, an error message will appear.
 2. On the **Agent** page, click an intended agent **>** **Edit** to access its editing page.
 3. Click **Embed into webpage** on the top right corner of the canvas to show the **iframe** window:
--- a/docs/guides/dataset/accelerate_doc_indexing.mdx
+++ b/docs/guides/dataset/accelerate_doc_indexing.mdx
@@ -15,4 +15,5 @@ Please note that some of your settings may consume a significant amount of time.
 - Use GPU to reduce embedding time.
 - On the configuration page of your knowledge base, switch off **Use RAPTOR to enhance retrieval**.
 - Extracting knowledge graph (GraphRAG) is time-consuming.
-- Disable **Auto-keyword** and **Auto-question** on the configuration page of yor knowledge base, as both depend on the LLM.
+- Disable **Auto-keyword** and **Auto-question** on the configuration page of your knowledge base, as both depend on the LLM.
+- **v0.17.0:** If your document is a plain-text PDF and does not require GPU-intensive processes like OCR (Optical Character Recognition), TSR (Table Structure Recognition), or DLA (Document Layout Analysis), you can choose **Naive** over **DeepDoc** or other time-consuming large model options in the **Document parser** dropdown. This will substantially reduce document parsing time.
--- a/docs/guides/manage_team_members.md
+++ b/docs/guides/manage_team_members.md
@@ -11,8 +11,9 @@ Invite or remove team members, join or leave a team.

 By default, each RAGFlow user is assigned a single team named after their name. RAGFlow allows you to invite RAGFlow users to your team. Your team members can help you:

-- Upload documents to your datasets.
+- Upload documents to your datasets (knowledge bases).
 - Update document configurations in your datasets.
+- Update the default configurations for your datasets.
 - Parse documents in your datasets.

 :::tip NOTE
--- a/docs/release_notes.md
+++ b/docs/release_notes.md
@@ -13,15 +13,15 @@ Released on March 3, 2025.

 ### New features

-1. AI chat: Implements Deep Research for agentic reasoning. To activate this, enable the **Reasoning** toggle under the **Prompt Engine** tab of your chat assistant dialogue.
-2. AI chat: Leverages Tavily-based web search to enhance contexts in agentic reasoning. To activate this, enter the correct Tavily API key under the **Assistant Setting** tab of your chat assistant dialogue.
-3. AI chat: Supports initiating a chat without specifying knowledge bases.
-4. AI chat: HTML files can also be previewed and referenced, in addition to PDF files.
-5. Dataset: Adds a **Layout recognition & OCR** dropdown menu to dataset configurations. This includes a DeepDoc model option, which is time-consuming, a much faster **naive** option (plain text), which skips DLR (Document Layout Recognition), OCR (Optimal Character Recognition), and TSR (Table Structure Recognition) tasks, and several currently *experimental* large model options.
-6. Agent component: **(x)** or a forward slash `/` can be used to insert available keys (variables) in the system prompt field of the **Generate** or **Template** component.
-7. Object storage: Supports using Aliyun OSS (Object Storage Service) as a file storage option.
-8. Models: Updates the supported model list for Tongyi-Qianwen, adding DeepSeek-specific models; adds ModelScope as a model provider.
-9. APIs: Document metadata can be updated through an API.
+- AI chat: Implements Deep Research for agentic reasoning. To activate this, enable the **Reasoning** toggle under the **Prompt Engine** tab of your chat assistant dialogue.
+- AI chat: Leverages Tavily-based web search to enhance contexts in agentic reasoning. To activate this, enter the correct Tavily API key under the **Assistant Setting** tab of your chat assistant dialogue.
+- AI chat: Supports starting a chat without specifying knowledge bases.
+- AI chat: HTML files can also be previewed and referenced, in addition to PDF files.
+- Dataset: Adds a **Document parser** dropdown menu to dataset configurations. This includes a DeepDoc model option, which is time-consuming, a much faster **naive** option (plain text), which skips DLA (Document Layout Analysis), OCR (Optical Character Recognition), and TSR (Table Structure Recognition) tasks, and several currently *experimental* large model options.
+- Agent component: **(x)** or a forward slash `/` can be used to insert available keys (variables) in the system prompt field of the **Generate** or **Template** component.
+- Object storage: Supports using Aliyun OSS (Object Storage Service) as a file storage option.
+- Models: Updates the supported model list for Tongyi-Qianwen, adding DeepSeek-specific models; adds ModelScope as a model provider.
+- APIs: Document metadata can be updated through an API.

 The following diagram illustrates the workflow of RAGFlow's Deep Research:

--- a/web/src/locales/en.ts
+++ b/web/src/locales/en.ts
@@ -140,7 +140,7 @@ export default {
      toMessage: 'Missing end page number (excluded)',
      layoutRecognize: 'Document parser',
      layoutRecognizeTip:
-        'Use visual models for layout analysis to better understand the structure of the document and effectively locate document titles, text blocks, images, and tables. If disabled, only the plain text in the PDF will be retrieved.',
+        'Use a visual model for PDF layout analysis to effectively locate document titles, text blocks, images, and tables. If the naive option is chosen, only the plain text in the PDF will be retrieved. Please note that this option currently works ONLY for PDF documents.',
      taskPageSize: 'Task page size',
      taskPageSizeMessage: 'Please input your task page size!',
      taskPageSizeTip: `During layout recognition, a PDF file is split into chunks and processed in parallel to increase processing speed. This parameter sets the size of each chunk. A larger chunk size reduces the likelihood of splitting continuous text between pages.`,
--- a/web/src/locales/zh-traditional.ts
+++ b/web/src/locales/zh-traditional.ts
@@ -98,7 +98,7 @@ export default {
      webCrawl: '網頁抓取',
      chunkNumber: '分塊數',
      uploadDate: '上傳日期',
-      chunkMethod: '解析方法',
+      chunkMethod: '切片方法',
      enabled: '啟用',
      disabled: '禁用',
      action: '動作',
@@ -138,7 +138,7 @@ export default {
      toMessage: '缺少結束頁碼（不包含）',
      layoutRecognize: '文件解析器',
      layoutRecognizeTip:
-        '使用視覺模型進行佈局分析，以更好地識別文檔結構，找到標題、文本塊、圖像和表格的位置。如果沒有此功能，則只能獲取 PDF 的純文本。',
+        '使用視覺模型進行 PDF 布局分析，以更好地識別文檔結構，找到標題、文字塊、圖像和表格的位置。若選擇 Naive 選項，則只能取得 PDF 的純文字。請注意此功能僅適用於 PDF 文檔，對其他文檔不生效。',
      taskPageSize: '任務頁面大小',
      taskPageSizeMessage: '請輸入您的任務頁面大小！',
      taskPageSizeTip: `如果使用佈局識別，PDF 文件將被分成連續的組。佈局分析將在組之間並行執行，以提高處理速度。“任務頁面大小”決定組的大小。頁面大小越大，將頁面之間的連續文本分割成不同塊的機會就越低。`,
@@ -192,10 +192,10 @@ export default {
      metaData: '元資料',
      deleteDocumentConfirmContent:
        '該文件與知識圖譜相關聯。刪除後，相關節點和關係資訊將被刪除，但圖不會立即更新。更新圖動作是在解析承載知識圖譜提取任務的新文件的過程中執行的。 ',
-      plainText: '簡易',
+      plainText: 'Naive',
    },
    knowledgeConfiguration: {
-      titleDescription: '在這裡更新您的知識庫詳細信息，尤其是解析方法。',
+      titleDescription: '在這裡更新您的知識庫詳細信息，尤其是切片方法。',
      name: '知識庫名稱',
      photo: '知識庫圖片',
      description: '描述',
@@ -210,7 +210,7 @@ export default {
        '用於嵌入塊的嵌入模型。一旦知識庫有了塊，它就無法更改。如果你想改變它，你需要刪除所有的塊。',
      permissionsTip: '如果權限是“團隊”，則所有團隊成員都可以操作知識庫。',
      chunkTokenNumberTip: '它大致確定了一個塊的Token數量。',
-      chunkMethod: '解析方法',
+      chunkMethod: '切片方法',
      chunkMethodTip: '說明位於右側。',
      upload: '上傳',
      english: '英語',
--- a/web/src/locales/zh.ts
+++ b/web/src/locales/zh.ts
@@ -98,7 +98,7 @@ export default {
      webCrawl: '网页抓取',
      chunkNumber: '分块数',
      uploadDate: '上传日期',
-      chunkMethod: '解析方法',
+      chunkMethod: '切片方法',
      enabled: '启用',
      disabled: '禁用',
      action: '动作',
@@ -138,7 +138,7 @@ export default {
      toMessage: '缺少结束页码（不包含）',
      layoutRecognize: '文档解析器',
      layoutRecognizeTip:
-        '使用视觉模型进行布局分析，以更好地识别文档结构，找到标题、文本块、图像和表格的位置。 如果没有此功能，则只能获取 PDF 的纯文本。',
+        '使用视觉模型进行 PDF 布局分析，以更好地识别文档结构，找到标题、文本块、图像和表格的位置。 如果选择 Naive 选项，则只能获取 PDF 的纯文本。请注意该功能只适用于 PDF 文档，对其他文档不生效。',
      taskPageSize: '任务页面大小',
      taskPageSizeMessage: '请输入您的任务页面大小！',
      taskPageSizeTip: `如果使用布局识别，PDF 文件将被分成连续的组。 布局分析将在组之间并行执行，以提高处理速度。 “任务页面大小”决定组的大小。 页面大小越大，将页面之间的连续文本分割成不同块的机会就越低。`,
@@ -192,10 +192,10 @@ export default {
      metaData: '元数据',
      deleteDocumentConfirmContent:
        '该文档与知识图谱相关联。删除后，相关节点和关系信息将被删除，但图不会立即更新。更新图动作是在解析承载知识图谱提取任务的新文档的过程中执行的。',
-      plainText: '简易',
+      plainText: 'Naive',
    },
    knowledgeConfiguration: {
-      titleDescription: '在这里更新您的知识库详细信息，尤其是解析方法。',
+      titleDescription: '在这里更新您的知识库详细信息，尤其是切片方法。',
      name: '知识库名称',
      photo: '知识库图片',
      description: '描述',
@@ -210,7 +210,7 @@ export default {
        '用于嵌入块的嵌入模型。 一旦知识库有了块，它就无法更改。 如果你想改变它，你需要删除所有的块。',
      permissionsTip: '如果权限是“团队”，则所有团队成员都可以操作知识库。',
      chunkTokenNumberTip: '它大致确定了一个块的Token数量。',
-      chunkMethod: '解析方法',
+      chunkMethod: '切片方法',
      chunkMethodTip: '说明位于右侧。',
      upload: '上传',
      english: '英文',