Merge branch 'main' into feat/llm-struct-output

Joel 2025-03-31 14:43:07 +08:00
commit 1193ab12fc
251 changed files with 6179 additions and 2319 deletions

View File

@@ -6,8 +6,8 @@ on:
       - "main"
       - "deploy/dev"
       - "deploy/enterprise"
-  release:
-    types: [published]
+    tags:
+      - "*"

 concurrency:
   group: build-push-${{ github.head_ref || github.run_id }}

View File

@@ -76,7 +76,6 @@ jobs:
           milvus-standalone
           pgvecto-rs
           pgvector
-          opengauss
           chroma
           elasticsearch

View File

@@ -31,25 +31,24 @@ jobs:
         uses: tj-actions/changed-files@v45
         with:
           files: web/**

-      # to run pnpm, should install package canvas, but it always install failed on amd64 under ubuntu-latest
-      # - name: Install pnpm
-      #   uses: pnpm/action-setup@v4
-      #   with:
-      #     version: 10
-      #     run_install: false
+      - name: Install pnpm
+        uses: pnpm/action-setup@v4
+        with:
+          version: 10
+          run_install: false

-      # - name: Setup Node.js
-      #   uses: actions/setup-node@v4
-      #   if: steps.changed-files.outputs.any_changed == 'true'
-      #   with:
-      #     node-version: 20
-      #     cache: pnpm
-      #     cache-dependency-path: ./web/package.json
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        if: steps.changed-files.outputs.any_changed == 'true'
+        with:
+          node-version: 20
+          cache: pnpm
+          cache-dependency-path: ./web/package.json

-      # - name: Install dependencies
-      #   if: steps.changed-files.outputs.any_changed == 'true'
-      #   run: pnpm install --frozen-lockfile
+      - name: Install dependencies
+        if: steps.changed-files.outputs.any_changed == 'true'
+        run: pnpm install --frozen-lockfile

-      # - name: Run tests
-      #   if: steps.changed-files.outputs.any_changed == 'true'
-      #   run: pnpm test
+      - name: Run tests
+        if: steps.changed-files.outputs.any_changed == 'true'
+        run: pnpm test

.gitignore vendored
View File

@@ -103,6 +103,7 @@ celerybeat.pid
 # Environments
 .env
+.env-local
 .venv
 env/
 venv/

View File

@@ -18,7 +18,7 @@ Need to update an existing model runtime, tool, or squash some bugs? Head over t
 Join the fun, contribute, and let's build something awesome together! 💡✨

-Don't forget to link an existing issue or open an new issue in the PR's description.
+Don't forget to link an existing issue or open a new issue in the PR's description.

 ### Bug reports
@@ -68,7 +68,7 @@ How we prioritize:
 4. Please add tests for your changes accordingly
 5. Ensure your code passes the existing tests
 6. Please link the issue in the PR description, `fixes #<issue_number>`
-7. Get merrged!
+7. Get merged!

 ### Setup the project

 #### Frontend
@@ -90,4 +90,4 @@ We recommend reviewing this document carefully before proceeding with the setup,
 Feel free to reach out if you encounter any issues during the setup process.

 ## Getting Help
-If you ever get stuck or got a burning question while contributing, simply shoot your queries our way via the related GitHub issue, or hop onto our [Discord](https://discord.gg/8Tpq4AcN9c) for a quick chat.
+If you ever get stuck or get a burning question while contributing, simply shoot your queries our way via the related GitHub issue, or hop onto our [Discord](https://discord.gg/8Tpq4AcN9c) for a quick chat.

CONTRIBUTING_ES.md Normal file
View File

@@ -0,0 +1,93 @@
# CONTRIBUIR
Así que estás buscando contribuir a Dify - eso es fantástico, estamos ansiosos por ver lo que haces. Como una startup con personal y financiación limitados, tenemos grandes ambiciones de diseñar el flujo de trabajo más intuitivo para construir y gestionar aplicaciones LLM. Cualquier ayuda de la comunidad cuenta, realmente.
Necesitamos ser ágiles y enviar rápidamente dado donde estamos, pero también queremos asegurarnos de que colaboradores como tú obtengan una experiencia lo más fluida posible al contribuir. Hemos elaborado esta guía de contribución con ese propósito, con el objetivo de familiarizarte con la base de código y cómo trabajamos con los colaboradores, para que puedas pasar rápidamente a la parte divertida.
Esta guía, como Dify mismo, es un trabajo en constante progreso. Agradecemos mucho tu comprensión si a veces se queda atrás del proyecto real, y damos la bienvenida a cualquier comentario para que podamos mejorar.
En términos de licencia, por favor tómate un minuto para leer nuestro breve [Acuerdo de Licencia y Colaborador](./LICENSE). La comunidad también se adhiere al [código de conducta](https://github.com/langgenius/.github/blob/main/CODE_OF_CONDUCT.md).
## Antes de empezar
¿Buscas algo en lo que trabajar? Explora nuestros [buenos primeros issues](https://github.com/langgenius/dify/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22) y elige uno para comenzar.
¿Tienes un nuevo modelo o herramienta genial para añadir? Abre un PR en nuestro [repositorio de plugins](https://github.com/langgenius/dify-plugins) y muéstranos lo que has construido.
¿Necesitas actualizar un modelo existente, herramienta o corregir algunos errores? Dirígete a nuestro [repositorio oficial de plugins](https://github.com/langgenius/dify-official-plugins) y haz tu magia.
¡Únete a la diversión, contribuye y construyamos algo increíble juntos! 💡✨
No olvides vincular un issue existente o abrir uno nuevo en la descripción del PR.
### Informes de errores
> [!IMPORTANT]
> Por favor, asegúrate de incluir la siguiente información al enviar un informe de error:
- Un título claro y descriptivo
- Una descripción detallada del error, incluyendo cualquier mensaje de error
- Pasos para reproducir el error
- Comportamiento esperado
- **Logs**, si están disponibles, para problemas del backend, esto es realmente importante, puedes encontrarlos en los logs de docker-compose
- Capturas de pantalla o videos, si es aplicable
Cómo priorizamos:
| Tipo de Issue | Prioridad |
| ------------------------------------------------------------ | --------------- |
| Errores en funciones principales (servicio en la nube, no poder iniciar sesión, aplicaciones que no funcionan, fallos de seguridad) | Crítica |
| Errores no críticos, mejoras de rendimiento | Prioridad Media |
| Correcciones menores (errores tipográficos, UI confusa pero funcional) | Prioridad Baja |
### Solicitudes de funcionalidades
> [!NOTE]
> Por favor, asegúrate de incluir la siguiente información al enviar una solicitud de funcionalidad:
- Un título claro y descriptivo
- Una descripción detallada de la funcionalidad
- Un caso de uso para la funcionalidad
- Cualquier otro contexto o capturas de pantalla sobre la solicitud de funcionalidad
Cómo priorizamos:
| Tipo de Funcionalidad | Prioridad |
| ------------------------------------------------------------ | --------------- |
| Funcionalidades de alta prioridad etiquetadas por un miembro del equipo | Prioridad Alta |
| Solicitudes populares de funcionalidades de nuestro [tablero de comentarios de la comunidad](https://github.com/langgenius/dify/discussions/categories/feedbacks) | Prioridad Media |
| Funcionalidades no principales y mejoras menores | Prioridad Baja |
| Valiosas pero no inmediatas | Futura-Funcionalidad |
## Enviando tu PR
### Proceso de Pull Request
1. Haz un fork del repositorio
2. Antes de redactar un PR, por favor crea un issue para discutir los cambios que quieres hacer
3. Crea una nueva rama para tus cambios
4. Por favor añade pruebas para tus cambios en consecuencia
5. Asegúrate de que tu código pasa las pruebas existentes
6. Por favor vincula el issue en la descripción del PR, `fixes #<número_del_issue>`
7. ¡Fusiona tu código!
### Configuración del proyecto
#### Frontend
Para configurar el servicio frontend, por favor consulta nuestra [guía completa](https://github.com/langgenius/dify/blob/main/web/README.md) en el archivo `web/README.md`. Este documento proporciona instrucciones detalladas para ayudarte a configurar el entorno frontend correctamente.
#### Backend
Para configurar el servicio backend, por favor consulta nuestras [instrucciones detalladas](https://github.com/langgenius/dify/blob/main/api/README.md) en el archivo `api/README.md`. Este documento contiene una guía paso a paso para ayudarte a poner en marcha el backend sin problemas.
#### Otras cosas a tener en cuenta
Recomendamos revisar este documento cuidadosamente antes de proceder con la configuración, ya que contiene información esencial sobre:
- Requisitos previos y dependencias
- Pasos de instalación
- Detalles de configuración
- Consejos comunes de solución de problemas
No dudes en contactarnos si encuentras algún problema durante el proceso de configuración.
## Obteniendo Ayuda
Si alguna vez te quedas atascado o tienes una pregunta urgente mientras contribuyes, simplemente envíanos tus consultas a través del issue relacionado de GitHub, o únete a nuestro [Discord](https://discord.gg/8Tpq4AcN9c) para una charla rápida.

CONTRIBUTING_FR.md Normal file
View File

@@ -0,0 +1,93 @@
# CONTRIBUER
Vous cherchez donc à contribuer à Dify - c'est fantastique, nous avons hâte de voir ce que vous allez faire. En tant que startup avec un personnel et un financement limités, nous avons de grandes ambitions pour concevoir le flux de travail le plus intuitif pour construire et gérer des applications LLM. Toute aide de la communauté compte, vraiment.
Nous devons être agiles et livrer rapidement compte tenu de notre position, mais nous voulons aussi nous assurer que des contributeurs comme vous obtiennent une expérience aussi fluide que possible lors de leur contribution. Nous avons élaboré ce guide de contribution dans ce but, visant à vous familiariser avec la base de code et comment nous travaillons avec les contributeurs, afin que vous puissiez rapidement passer à la partie amusante.
Ce guide, comme Dify lui-même, est un travail en constante évolution. Nous apprécions grandement votre compréhension si parfois il est en retard par rapport au projet réel, et nous accueillons tout commentaire pour nous aider à nous améliorer.
En termes de licence, veuillez prendre une minute pour lire notre bref [Accord de Licence et de Contributeur](./LICENSE). La communauté adhère également au [code de conduite](https://github.com/langgenius/.github/blob/main/CODE_OF_CONDUCT.md).
## Avant de vous lancer
Vous cherchez quelque chose à réaliser ? Parcourez nos [problèmes pour débutants](https://github.com/langgenius/dify/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22) et choisissez-en un pour commencer !
Vous avez un nouveau modèle ou un nouvel outil à ajouter ? Ouvrez une PR dans notre [dépôt de plugins](https://github.com/langgenius/dify-plugins) et montrez-nous ce que vous avez créé.
Vous devez mettre à jour un modèle existant, un outil ou corriger des bugs ? Rendez-vous sur notre [dépôt officiel de plugins](https://github.com/langgenius/dify-official-plugins) et faites votre magie !
Rejoignez l'aventure, contribuez, et construisons ensemble quelque chose d'extraordinaire ! 💡✨
N'oubliez pas de lier un problème existant ou d'ouvrir un nouveau problème dans la description de votre PR.
### Rapports de bugs
> [!IMPORTANT]
> Veuillez vous assurer d'inclure les informations suivantes lors de la soumission d'un rapport de bug :
- Un titre clair et descriptif
- Une description détaillée du bug, y compris tous les messages d'erreur
- Les étapes pour reproduire le bug
- Comportement attendu
- **Logs**, si disponibles, pour les problèmes de backend, c'est vraiment important, vous pouvez les trouver dans les logs de docker-compose
- Captures d'écran ou vidéos, si applicable
Comment nous priorisons :
| Type de Problème | Priorité |
| ------------------------------------------------------------ | --------------- |
| Bugs dans les fonctions principales (service cloud, impossibilité de se connecter, applications qui ne fonctionnent pas, failles de sécurité) | Critique |
| Bugs non critiques, améliorations de performance | Priorité Moyenne |
| Corrections mineures (fautes de frappe, UI confuse mais fonctionnelle) | Priorité Basse |
### Demandes de fonctionnalités
> [!NOTE]
> Veuillez vous assurer d'inclure les informations suivantes lors de la soumission d'une demande de fonctionnalité :
- Un titre clair et descriptif
- Une description détaillée de la fonctionnalité
- Un cas d'utilisation pour la fonctionnalité
- Tout autre contexte ou captures d'écran concernant la demande de fonctionnalité
Comment nous priorisons :
| Type de Fonctionnalité | Priorité |
| ------------------------------------------------------------ | --------------- |
| Fonctionnalités hautement prioritaires étiquetées par un membre de l'équipe | Priorité Haute |
| Demandes populaires de fonctionnalités de notre [tableau de feedback communautaire](https://github.com/langgenius/dify/discussions/categories/feedbacks) | Priorité Moyenne |
| Fonctionnalités non essentielles et améliorations mineures | Priorité Basse |
| Précieuses mais non immédiates | Fonctionnalité Future |
## Soumettre votre PR
### Processus de Pull Request
1. Forkez le dépôt
2. Avant de rédiger une PR, veuillez créer un problème pour discuter des changements que vous souhaitez apporter
3. Créez une nouvelle branche pour vos changements
4. Veuillez ajouter des tests pour vos changements en conséquence
5. Assurez-vous que votre code passe les tests existants
6. Veuillez lier le problème dans la description de la PR, `fixes #<numéro_du_problème>`
7. Faites fusionner votre code !
### Configuration du projet
#### Frontend
Pour configurer le service frontend, veuillez consulter notre [guide complet](https://github.com/langgenius/dify/blob/main/web/README.md) dans le fichier `web/README.md`. Ce document fournit des instructions détaillées pour vous aider à configurer correctement l'environnement frontend.
#### Backend
Pour configurer le service backend, veuillez consulter nos [instructions détaillées](https://github.com/langgenius/dify/blob/main/api/README.md) dans le fichier `api/README.md`. Ce document contient un guide étape par étape pour vous aider à faire fonctionner le backend sans problème.
#### Autres choses à noter
Nous recommandons de revoir attentivement ce document avant de procéder à la configuration, car il contient des informations essentielles sur :
- Prérequis et dépendances
- Étapes d'installation
- Détails de configuration
- Conseils courants de dépannage
N'hésitez pas à nous contacter si vous rencontrez des problèmes pendant le processus de configuration.
## Obtenir de l'aide
Si jamais vous êtes bloqué ou avez une question urgente en contribuant, envoyez-nous simplement vos questions via le problème GitHub concerné, ou rejoignez notre [Discord](https://discord.gg/8Tpq4AcN9c) pour une discussion rapide.

CONTRIBUTING_KR.md Normal file
View File

@@ -0,0 +1,93 @@
# 기여하기
Dify에 기여하려고 하시는군요 - 정말 멋집니다, 당신이 무엇을 할지 기대가 됩니다. 인력과 자금이 제한된 스타트업으로서, 우리는 LLM 애플리케이션을 구축하고 관리하기 위한 가장 직관적인 워크플로우를 설계하고자 하는 큰 야망을 가지고 있습니다. 커뮤니티의 모든 도움은 정말 중요합니다.
우리는 현재 상황에서 민첩하게 빠르게 배포해야 하지만, 동시에 당신과 같은 기여자들이 기여하는 과정에서 최대한 원활한 경험을 얻을 수 있도록 하고 싶습니다. 우리는 이러한 목적으로 이 기여 가이드를 작성했으며, 여러분이 코드베이스와 우리가 기여자들과 어떻게 협업하는지에 대해 친숙해질 수 있도록 돕고, 빠르게 재미있는 부분으로 넘어갈 수 있도록 하고자 합니다.
이 가이드는 Dify 자체와 마찬가지로 끊임없이 진행 중인 작업입니다. 때로는 실제 프로젝트보다 뒤처질 수 있다는 점을 이해해 주시면 감사하겠으며, 개선을 위한 피드백은 언제든지 환영합니다.
라이센스 측면에서, 간략한 [라이센스 및 기여자 동의서](./LICENSE)를 읽어보는 시간을 가져주세요. 커뮤니티는 또한 [행동 강령](https://github.com/langgenius/.github/blob/main/CODE_OF_CONDUCT.md)을 준수합니다.
## 시작하기 전에
처리할 작업을 찾고 계신가요? [초보자를 위한 이슈](https://github.com/langgenius/dify/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22)를 살펴보고 시작할 것을 선택하세요!
추가할 새로운 모델 런타임이나 도구가 있나요? 우리의 [플러그인 저장소](https://github.com/langgenius/dify-plugins)에 PR을 열고 당신이 만든 것을 보여주세요.
기존 모델 런타임, 도구를 업데이트하거나 버그를 수정해야 하나요? 우리의 [공식 플러그인 저장소](https://github.com/langgenius/dify-official-plugins)로 가서 당신의 마법을 펼치세요!
함께 즐기고, 기여하고, 멋진 것을 함께 만들어 봅시다! 💡✨
PR 설명에 기존 이슈를 연결하거나 새 이슈를 여는 것을 잊지 마세요.
### 버그 보고
> [!IMPORTANT]
> 버그 보고서를 제출할 때 다음 정보를 포함해 주세요:
- 명확하고 설명적인 제목
- 오류 메시지를 포함한 버그에 대한 상세한 설명
- 버그를 재현하는 단계
- 예상되는 동작
- 가능한 경우 **로그**, 백엔드 이슈의 경우 매우 중요합니다. docker-compose 로그에서 찾을 수 있습니다
- 해당되는 경우 스크린샷 또는 비디오
우선순위 결정 방법:
| 이슈 유형 | 우선순위 |
| ------------------------------------------------------------ | --------------- |
| 핵심 기능의 버그(클라우드 서비스, 로그인 불가, 애플리케이션 작동 불능, 보안 취약점) | 중대 |
| 비중요 버그, 성능 향상 | 중간 우선순위 |
| 사소한 수정(오타, 혼란스럽지만 작동하는 UI) | 낮은 우선순위 |
### 기능 요청
> [!NOTE]
> 기능 요청을 제출할 때 다음 정보를 포함해 주세요:
- 명확하고 설명적인 제목
- 기능에 대한 상세한 설명
- 해당 기능의 사용 사례
- 기능 요청에 관한 기타 컨텍스트 또는 스크린샷
우선순위 결정 방법:
| 기능 유형 | 우선순위 |
| ------------------------------------------------------------ | --------------- |
| 팀 구성원에 의해 레이블이 지정된 고우선순위 기능 | 높은 우선순위 |
| 우리의 [커뮤니티 피드백 보드](https://github.com/langgenius/dify/discussions/categories/feedbacks)에서 인기 있는 기능 요청 | 중간 우선순위 |
| 비핵심 기능 및 사소한 개선 | 낮은 우선순위 |
| 가치 있지만 즉시 필요하지 않은 기능 | 미래 기능 |
## PR 제출하기
### Pull Request 프로세스
1. 저장소를 포크하세요
2. PR을 작성하기 전에, 변경하고자 하는 내용에 대해 논의하기 위한 이슈를 생성해 주세요
3. 변경 사항을 위한 새 브랜치를 만드세요
4. 변경 사항에 대한 테스트를 적절히 추가해 주세요
5. 코드가 기존 테스트를 통과하는지 확인하세요
6. PR 설명에 이슈를 연결해 주세요, `fixes #<이슈_번호>`
7. 병합 완료!
### 프로젝트 설정하기
#### 프론트엔드
프론트엔드 서비스를 설정하려면, `web/README.md` 파일에 있는 우리의 [종합 가이드](https://github.com/langgenius/dify/blob/main/web/README.md)를 참조하세요. 이 문서는 프론트엔드 환경을 적절히 설정하는 데 도움이 되는 자세한 지침을 제공합니다.
#### 백엔드
백엔드 서비스를 설정하려면, `api/README.md` 파일에 있는 우리의 [상세 지침](https://github.com/langgenius/dify/blob/main/api/README.md)을 참조하세요. 이 문서는 백엔드를 원활하게 실행하는 데 도움이 되는 단계별 가이드를 포함하고 있습니다.
#### 기타 참고 사항
설정을 진행하기 전에 이 문서를 주의 깊게 검토하는 것을 권장합니다. 다음과 같은 필수 정보가 포함되어 있습니다:
- 필수 조건 및 종속성
- 설치 단계
- 구성 세부 정보
- 일반적인 문제 해결 팁
설정 과정에서 문제가 발생하면 언제든지 연락해 주세요.
## 도움 받기
기여하는 동안 막히거나 긴급한 질문이 있으면, 관련 GitHub 이슈를 통해 질문을 보내거나, 빠른 대화를 위해 우리의 [Discord](https://discord.gg/8Tpq4AcN9c)에 참여하세요.

CONTRIBUTING_PT.md Normal file
View File

@@ -0,0 +1,93 @@
# CONTRIBUINDO
Então você está procurando contribuir para o Dify - isso é incrível, mal podemos esperar para ver o que você vai fazer. Como uma startup com equipe e financiamento limitados, temos grandes ambições de projetar o fluxo de trabalho mais intuitivo para construir e gerenciar aplicações LLM. Qualquer ajuda da comunidade conta, verdadeiramente.
Precisamos ser ágeis e entregar rapidamente considerando onde estamos, mas também queremos garantir que colaboradores como você tenham uma experiência o mais tranquila possível ao contribuir. Montamos este guia de contribuição com esse propósito, visando familiarizá-lo com a base de código e como trabalhamos com os colaboradores, para que você possa rapidamente passar para a parte divertida.
Este guia, como o próprio Dify, é um trabalho em constante evolução. Agradecemos muito a sua compreensão se às vezes ele ficar atrasado em relação ao projeto real, e damos as boas-vindas a qualquer feedback para que possamos melhorar.
Em termos de licenciamento, por favor, dedique um minuto para ler nosso breve [Acordo de Licença e Contribuidor](./LICENSE). A comunidade também adere ao [código de conduta](https://github.com/langgenius/.github/blob/main/CODE_OF_CONDUCT.md).
## Antes de começar
Procurando algo para resolver? Navegue por nossos [problemas para iniciantes](https://github.com/langgenius/dify/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22) e escolha um para começar!
Tem um novo modelo ou ferramenta para adicionar? Abra um PR em nosso [repositório de plugins](https://github.com/langgenius/dify-plugins) e mostre-nos o que você construiu.
Precisa atualizar um modelo existente, ferramenta ou corrigir alguns bugs? Vá para nosso [repositório oficial de plugins](https://github.com/langgenius/dify-official-plugins) e faça sua mágica!
Junte-se à diversão, contribua e vamos construir algo incrível juntos! 💡✨
Não se esqueça de vincular um problema existente ou abrir um novo problema na descrição do PR.
### Relatórios de bugs
> [!IMPORTANT]
> Por favor, certifique-se de incluir as seguintes informações ao enviar um relatório de bug:
- Um título claro e descritivo
- Uma descrição detalhada do bug, incluindo quaisquer mensagens de erro
- Passos para reproduzir o bug
- Comportamento esperado
- **Logs**, se disponíveis, para problemas de backend, isso é realmente importante, você pode encontrá-los nos logs do docker-compose
- Capturas de tela ou vídeos, se aplicável
Como priorizamos:
| Tipo de Problema | Prioridade |
| ------------------------------------------------------------ | --------------- |
| Bugs em funções centrais (serviço em nuvem, não conseguir fazer login, aplicações não funcionando, falhas de segurança) | Crítica |
| Bugs não críticos, melhorias de desempenho | Prioridade Média |
| Correções menores (erros de digitação, interface confusa mas funcional) | Prioridade Baixa |
### Solicitações de recursos
> [!NOTE]
> Por favor, certifique-se de incluir as seguintes informações ao enviar uma solicitação de recurso:
- Um título claro e descritivo
- Uma descrição detalhada do recurso
- Um caso de uso para o recurso
- Qualquer outro contexto ou capturas de tela sobre a solicitação de recurso
Como priorizamos:
| Tipo de Recurso | Prioridade |
| ------------------------------------------------------------ | --------------- |
| Recursos de alta prioridade conforme rotulado por um membro da equipe | Prioridade Alta |
| Solicitações populares de recursos do nosso [quadro de feedback da comunidade](https://github.com/langgenius/dify/discussions/categories/feedbacks) | Prioridade Média |
| Recursos não essenciais e melhorias menores | Prioridade Baixa |
| Valiosos mas não imediatos | Recurso Futuro |
## Enviando seu PR
### Processo de Pull Request
1. Faça um fork do repositório
2. Antes de elaborar um PR, por favor crie um problema para discutir as mudanças que você quer fazer
3. Crie um novo branch para suas alterações
4. Por favor, adicione testes para suas alterações conforme apropriado
5. Certifique-se de que seu código passa nos testes existentes
6. Por favor, vincule o problema na descrição do PR, `fixes #<número_do_problema>`
7. Faça o merge do seu código!
### Configurando o projeto
#### Frontend
Para configurar o serviço frontend, por favor consulte nosso [guia abrangente](https://github.com/langgenius/dify/blob/main/web/README.md) no arquivo `web/README.md`. Este documento fornece instruções detalhadas para ajudá-lo a configurar o ambiente frontend adequadamente.
#### Backend
Para configurar o serviço backend, por favor consulte nossas [instruções detalhadas](https://github.com/langgenius/dify/blob/main/api/README.md) no arquivo `api/README.md`. Este documento contém um guia passo a passo para ajudá-lo a colocar o backend em funcionamento sem problemas.
#### Outras coisas a observar
Recomendamos revisar este documento cuidadosamente antes de prosseguir com a configuração, pois ele contém informações essenciais sobre:
- Pré-requisitos e dependências
- Etapas de instalação
- Detalhes de configuração
- Dicas comuns de solução de problemas
Sinta-se à vontade para entrar em contato se encontrar quaisquer problemas durante o processo de configuração.
## Obtendo Ajuda
Se você ficar preso ou tiver uma dúvida urgente enquanto contribui, simplesmente envie suas perguntas através do problema relacionado no GitHub, ou entre no nosso [Discord](https://discord.gg/8Tpq4AcN9c) para uma conversa rápida.

CONTRIBUTING_TR.md Normal file
View File

@@ -0,0 +1,93 @@
# KATKIDA BULUNMAK
Demek Dify'a katkıda bulunmak istiyorsunuz - bu harika, ne yapacağınızı görmek için sabırsızlanıyoruz. Sınırlı personel ve finansmana sahip bir startup olarak, LLM uygulamaları oluşturmak ve yönetmek için en sezgisel iş akışını tasarlama konusunda büyük hedeflerimiz var. Topluluktan gelen her türlü yardım gerçekten önemli.
Bulunduğumuz noktada çevik olmamız ve hızlı hareket etmemiz gerekiyor, ancak sizin gibi katkıda bulunanların mümkün olduğunca sorunsuz bir deneyim yaşamasını da sağlamak istiyoruz. Bu katkı rehberini bu amaçla hazırladık; sizi kod tabanıyla ve katkıda bulunanlarla nasıl çalıştığımızla tanıştırmayı, böylece hızlıca eğlenceli kısma geçebilmenizi hedefliyoruz.
Bu rehber, Dify'ın kendisi gibi, sürekli gelişen bir çalışmadır. Bazen gerçek projenin gerisinde kalırsa anlayışınız için çok minnettarız ve gelişmemize yardımcı olacak her türlü geri bildirimi memnuniyetle karşılıyoruz.
Lisanslama konusunda, lütfen kısa [Lisans ve Katkıda Bulunan Anlaşmamızı](./LICENSE) okumak için bir dakikanızı ayırın. Topluluk ayrıca [davranış kurallarına](https://github.com/langgenius/.github/blob/main/CODE_OF_CONDUCT.md) da uyar.
## Başlamadan Önce
Üzerinde çalışacak bir şey mi arıyorsunuz? [İlk katkıda bulunanlar için iyi sorunlarımıza](https://github.com/langgenius/dify/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22) göz atın ve başlamak için birini seçin!
Eklenecek harika bir yeni model runtime'ı veya aracınız mı var? [Eklenti depomuzda](https://github.com/langgenius/dify-plugins) bir PR açın ve ne yaptığınızı bize gösterin.
Mevcut bir model runtime'ını, aracı güncellemek veya bazı hataları düzeltmek mi istiyorsunuz? [Resmi eklenti depomuza](https://github.com/langgenius/dify-official-plugins) gidin ve sihrinizi gösterin!
Eğlenceye katılın, katkıda bulunun ve birlikte harika bir şeyler inşa edelim! 💡✨
PR açıklamasında mevcut bir sorunu bağlamayı veya yeni bir sorun açmayı unutmayın.
### Hata Raporları
> [!IMPORTANT]
> Lütfen bir hata raporu gönderirken aşağıdaki bilgileri dahil ettiğinizden emin olun:
- Net ve açıklayıcı bir başlık
- Hata mesajları dahil hatanın ayrıntılı bir açıklaması
- Hatayı tekrarlamak için adımlar
- Beklenen davranış
- Mümkünse **Loglar**, backend sorunları için, bu gerçekten önemlidir, bunları docker-compose loglarında bulabilirsiniz
- Uygunsa ekran görüntüleri veya videolar
Nasıl önceliklendiriyoruz:
| Sorun Türü | Öncelik |
| ------------------------------------------------------------ | --------------- |
| Temel işlevlerdeki hatalar (bulut hizmeti, giriş yapamama, çalışmayan uygulamalar, güvenlik açıkları) | Kritik |
| Kritik olmayan hatalar, performans artışları | Orta Öncelik |
| Küçük düzeltmeler (yazım hataları, kafa karıştırıcı ama çalışan UI) | Düşük Öncelik |
### Özellik İstekleri
> [!NOTE]
> Lütfen bir özellik isteği gönderirken aşağıdaki bilgileri dahil ettiğinizden emin olun:
- Net ve açıklayıcı bir başlık
- Özelliğin ayrıntılı bir açıklaması
- Özellik için bir kullanım durumu
- Özellik isteği hakkında diğer bağlamlar veya ekran görüntüleri
Nasıl önceliklendiriyoruz:
| Özellik Türü | Öncelik |
| ------------------------------------------------------------ | --------------- |
| Bir ekip üyesi tarafından etiketlenen Yüksek Öncelikli Özellikler | Yüksek Öncelik |
| [Topluluk geri bildirim panosundan](https://github.com/langgenius/dify/discussions/categories/feedbacks) popüler özellik istekleri | Orta Öncelik |
| Temel olmayan özellikler ve küçük geliştirmeler | Düşük Öncelik |
| Değerli ama acil olmayan | Gelecek-Özellik |
## PR'nizi Göndermek
### Pull Request Süreci
1. Depoyu fork edin
2. Bir PR taslağı oluşturmadan önce, yapmak istediğiniz değişiklikleri tartışmak için lütfen bir sorun oluşturun
3. Değişiklikleriniz için yeni bir dal oluşturun
4. Lütfen değişiklikleriniz için uygun testler ekleyin
5. Kodunuzun mevcut testleri geçtiğinden emin olun
6. Lütfen PR açıklamasında sorunu bağlayın, `fixes #<sorun_numarası>`
7. Kodunuzu birleştirin!
### Projeyi Kurma
#### Frontend
Frontend hizmetini kurmak için, lütfen `web/README.md` dosyasındaki kapsamlı [rehberimize](https://github.com/langgenius/dify/blob/main/web/README.md) bakın. Bu belge, frontend ortamını düzgün bir şekilde kurmanıza yardımcı olacak ayrıntılı talimatlar sağlar.
#### Backend
Backend hizmetini kurmak için, lütfen `api/README.md` dosyasındaki detaylı [talimatlarımıza](https://github.com/langgenius/dify/blob/main/api/README.md) bakın. Bu belge, backend'i sorunsuz bir şekilde çalıştırmanıza yardımcı olacak adım adım bir kılavuz içerir.
#### Dikkat Edilecek Diğer Şeyler
Kuruluma geçmeden önce bu belgeyi dikkatlice incelemenizi öneririz, çünkü şunlar hakkında temel bilgiler içerir:
- Ön koşullar ve bağımlılıklar
- Kurulum adımları
- Yapılandırma detayları
- Yaygın sorun giderme ipuçları
Kurulum süreci sırasında herhangi bir sorunla karşılaşırsanız bizimle iletişime geçmekten çekinmeyin.
## Yardım Almak
Katkıda bulunurken takılırsanız veya yanıcı bir sorunuz olursa, sorularınızı ilgili GitHub sorunu aracılığıyla bize gönderin veya hızlı bir sohbet için [Discord'umuza](https://discord.gg/8Tpq4AcN9c) katılın.

View File

@@ -10,8 +10,6 @@ a. Multi-tenant service: Unless explicitly authorized by Dify in writing, you ma
 b. LOGO and copyright information: In the process of using Dify's frontend, you may not remove or modify the LOGO or copyright information in the Dify console or applications. This restriction is inapplicable to uses of Dify that do not involve its frontend.
 - Frontend Definition: For the purposes of this license, the "frontend" of Dify includes all components located in the `web/` directory when running Dify from the raw source code, or the "web" image when running Dify with Docker.

-Please contact business@dify.ai by email to inquire about licensing matters.
-
 2. As a contributor, you should agree that:

 a. The producer can adjust the open-source agreement to be more strict or relaxed as deemed necessary.

View File

@@ -137,7 +137,7 @@ WEB_API_CORS_ALLOW_ORIGINS=http://127.0.0.1:3000,*
 CONSOLE_CORS_ALLOW_ORIGINS=http://127.0.0.1:3000,*

 # Vector database configuration
-# support: weaviate, qdrant, milvus, myscale, relyt, pgvecto_rs, pgvector, pgvector, chroma, opensearch, tidb_vector, couchbase, vikingdb, upstash, lindorm, oceanbase, opengauss
+# support: weaviate, qdrant, milvus, myscale, relyt, pgvecto_rs, pgvector, pgvector, chroma, opensearch, tidb_vector, couchbase, vikingdb, upstash, lindorm, oceanbase, opengauss, tablestore
 VECTOR_STORE=weaviate

 # Weaviate configuration
@@ -212,6 +212,12 @@ PGVECTOR_DATABASE=postgres
 PGVECTOR_MIN_CONNECTION=1
 PGVECTOR_MAX_CONNECTION=5

+# TableStore Vector configuration
+TABLESTORE_ENDPOINT=https://instance-name.cn-hangzhou.ots.aliyuncs.com
+TABLESTORE_INSTANCE_NAME=instance-name
+TABLESTORE_ACCESS_KEY_ID=xxx
+TABLESTORE_ACCESS_KEY_SECRET=xxx
+
 # Tidb Vector configuration
 TIDB_VECTOR_HOST=xxx.eu-central-1.xxx.aws.tidbcloud.com
 TIDB_VECTOR_PORT=4000
@@ -297,6 +303,7 @@ OCEANBASE_VECTOR_USER=root@test
 OCEANBASE_VECTOR_PASSWORD=difyai123456
 OCEANBASE_VECTOR_DATABASE=test
 OCEANBASE_MEMORY_LIMIT=6G
+OCEANBASE_ENABLE_HYBRID_SEARCH=false

 # openGauss configuration
 OPENGAUSS_HOST=127.0.0.1

View File

@@ -276,6 +276,7 @@ def migrate_knowledge_vector_database():
             VectorType.ORACLE,
             VectorType.ELASTICSEARCH,
             VectorType.OPENGAUSS,
+            VectorType.TABLESTORE,
         }
         lower_collection_vector_types = {
             VectorType.ANALYTICDB,

View File

@@ -1,6 +1,6 @@
 from typing import Optional

-from pydantic import Field, NonNegativeInt, computed_field
+from pydantic import Field, NonNegativeInt
 from pydantic_settings import BaseSettings

View File

@@ -33,6 +33,7 @@ from .vdb.pgvector_config import PGVectorConfig
 from .vdb.pgvectors_config import PGVectoRSConfig
 from .vdb.qdrant_config import QdrantConfig
 from .vdb.relyt_config import RelytConfig
+from .vdb.tablestore_config import TableStoreConfig
 from .vdb.tencent_vector_config import TencentVectorDBConfig
 from .vdb.tidb_on_qdrant_config import TidbOnQdrantConfig
 from .vdb.tidb_vector_config import TiDBVectorConfig
@@ -283,5 +284,6 @@ class MiddlewareConfig(
     OceanBaseVectorConfig,
     BaiduVectorDBConfig,
     OpenGaussConfig,
+    TableStoreConfig,
 ):
     pass

View File

@@ -33,3 +33,9 @@ class OceanBaseVectorConfig(BaseSettings):
         description="Name of the OceanBase Vector database to connect to",
         default=None,
     )
+
+    OCEANBASE_ENABLE_HYBRID_SEARCH: bool = Field(
+        description="Enable hybrid search features (requires OceanBase >= 4.3.5.1). Set to false for compatibility "
+        "with older versions",
+        default=False,
+    )

View File

@@ -43,3 +43,8 @@ class OpenGaussConfig(BaseSettings):
         description="Max connection of the OpenGauss database",
         default=5,
     )
+
+    OPENGAUSS_ENABLE_PQ: bool = Field(
+        description="Enable openGauss PQ acceleration feature",
+        default=False,
+    )

View File

@@ -0,0 +1,30 @@
from typing import Optional

from pydantic import Field
from pydantic_settings import BaseSettings


class TableStoreConfig(BaseSettings):
    """
    Configuration settings for TableStore.
    """

    TABLESTORE_ENDPOINT: Optional[str] = Field(
        description="Endpoint address of the TableStore server (e.g. 'https://instance-name.cn-hangzhou.ots.aliyuncs.com')",
        default=None,
    )

    TABLESTORE_INSTANCE_NAME: Optional[str] = Field(
        description="Instance name to access TableStore server (eg. 'instance-name')",
        default=None,
    )

    TABLESTORE_ACCESS_KEY_ID: Optional[str] = Field(
        description="AccessKey id for the instance name",
        default=None,
    )

    TABLESTORE_ACCESS_KEY_SECRET: Optional[str] = Field(
        description="AccessKey secret for the instance name",
        default=None,
    )
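For illustration, a minimal usage sketch (not part of this commit), assuming the standard pydantic-settings behavior of populating fields from identically named environment variables and falling back to the declared defaults:

import os

# Assuming TableStoreConfig (defined above) is importable in scope.
# Hypothetical values, for illustration only.
os.environ["TABLESTORE_ENDPOINT"] = "https://instance-name.cn-hangzhou.ots.aliyuncs.com"
os.environ["TABLESTORE_INSTANCE_NAME"] = "instance-name"

config = TableStoreConfig()
# Variables left unset keep their declared defaults (None here).
print(config.TABLESTORE_ENDPOINT, config.TABLESTORE_ACCESS_KEY_ID)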

View File

@@ -9,7 +9,7 @@ class PackagingInfo(BaseSettings):
     CURRENT_VERSION: str = Field(
         description="Dify version",
-        default="1.1.2",
+        default="1.1.3",
     )

     COMMIT_SHA: str = Field(

View File

@@ -1,5 +1,3 @@
-from typing import Optional
-
 from pydantic import Field

 from .apollo import ApolloSettingsSourceInfo

View File

@@ -646,7 +646,6 @@ class DatasetRetrievalSettingApi(Resource):
                 | VectorType.BAIDU
                 | VectorType.VIKINGDB
                 | VectorType.UPSTASH
-                | VectorType.OCEANBASE
             ):
                 return {"retrieval_method": [RetrievalMethod.SEMANTIC_SEARCH.value]}
             case (
@@ -664,6 +663,8 @@ class DatasetRetrievalSettingApi(Resource):
                 | VectorType.COUCHBASE
                 | VectorType.MILVUS
                 | VectorType.OPENGAUSS
+                | VectorType.OCEANBASE
+                | VectorType.TABLESTORE
             ):
                 return {
                     "retrieval_method": [
@@ -692,7 +693,6 @@ class DatasetRetrievalSettingMockApi(Resource):
                 | VectorType.BAIDU
                 | VectorType.VIKINGDB
                 | VectorType.UPSTASH
-                | VectorType.OCEANBASE
             ):
                 return {"retrieval_method": [RetrievalMethod.SEMANTIC_SEARCH.value]}
             case (
@@ -708,6 +708,8 @@ class DatasetRetrievalSettingMockApi(Resource):
                 | VectorType.PGVECTOR
                 | VectorType.LINDORM
                 | VectorType.OPENGAUSS
+                | VectorType.OCEANBASE
+                | VectorType.TABLESTORE
             ):
                 return {
                     "retrieval_method": [

View File

@@ -6,6 +6,7 @@ from controllers.console.wraps import setup_required
 from controllers.inner_api import api
 from controllers.inner_api.wraps import enterprise_inner_api_only
 from events.tenant_event import tenant_was_created
+from extensions.ext_database import db
 from models.account import Account
 from services.account_service import TenantService
@@ -19,7 +20,7 @@ class EnterpriseWorkspace(Resource):
         parser.add_argument("owner_email", type=str, required=True, location="json")
         args = parser.parse_args()

-        account = Account.query.filter_by(email=args["owner_email"]).first()
+        account = db.session.query(Account).filter_by(email=args["owner_email"]).first()

         if account is None:
             return {"message": "owner account not found."}, 404

View File

@@ -27,6 +27,7 @@ from core.model_runtime.errors.invoke import InvokeError
 from extensions.ext_database import db
 from fields.workflow_app_log_fields import workflow_app_log_pagination_fields
 from libs import helper
+from libs.helper import TimestampField
 from models.model import App, AppMode, EndUser
 from models.workflow import WorkflowRun, WorkflowRunStatus
 from services.app_generate_service import AppGenerateService
@@ -44,8 +45,8 @@ workflow_run_fields = {
     "error": fields.String,
     "total_steps": fields.Integer,
     "total_tokens": fields.Integer,
-    "created_at": fields.DateTime,
-    "finished_at": fields.DateTime,
+    "created_at": TimestampField,
+    "finished_at": TimestampField,
     "elapsed_time": fields.Float,
 }
@@ -53,7 +54,7 @@ workflow_run_fields = {
 class WorkflowRunDetailApi(Resource):
     @validate_app_token
     @marshal_with(workflow_run_fields)
-    def get(self, app_model: App, workflow_id: str):
+    def get(self, app_model: App, workflow_run_id: str):
         """
         Get a workflow task running detail
         """
@@ -61,7 +62,7 @@ class WorkflowRunDetailApi(Resource):
         if app_mode != AppMode.WORKFLOW:
             raise NotWorkflowAppError()

-        workflow_run = db.session.query(WorkflowRun).filter(WorkflowRun.id == workflow_id).first()
+        workflow_run = db.session.query(WorkflowRun).filter(WorkflowRun.id == workflow_run_id).first()

         return workflow_run
@@ -162,6 +163,6 @@ class WorkflowAppLogApi(Resource):

 api.add_resource(WorkflowRunApi, "/workflows/run")
-api.add_resource(WorkflowRunDetailApi, "/workflows/run/<string:workflow_id>")
+api.add_resource(WorkflowRunDetailApi, "/workflows/run/<string:workflow_run_id>")
 api.add_resource(WorkflowTaskStopApi, "/workflows/tasks/<string:task_id>/stop")
 api.add_resource(WorkflowAppLogApi, "/workflows/logs")
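The switch from fields.DateTime to TimestampField changes how created_at and finished_at are serialized: as Unix timestamps rather than datetime strings. As a hedged sketch only (the real helper lives in libs.helper and may differ), a flask-restful field with that behavior can be as small as:

from flask_restful import fields


class TimestampField(fields.Raw):
    # Marshal a datetime as an integer Unix timestamp (seconds) instead of an ISO string.
    def format(self, value):
        return int(value.timestamp())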

View File

@@ -142,6 +142,7 @@ class DatasetApi(DatasetApiResource):
         Deletes a dataset given its ID.

         Args:
+            _: ignore
             dataset_id (UUID): The ID of the dataset to be deleted.

         Returns:

View File

@@ -1,3 +1,4 @@
+from flask import request
 from flask_login import current_user  # type: ignore
 from flask_restful import marshal, reqparse  # type: ignore
 from werkzeug.exceptions import NotFound
@@ -13,10 +14,20 @@ from core.errors.error import LLMBadRequestError, ProviderTokenNotInitError
 from core.model_manager import ModelManager
 from core.model_runtime.entities.model_entities import ModelType
 from extensions.ext_database import db
-from fields.segment_fields import segment_fields
-from models.dataset import Dataset, DocumentSegment
+from fields.segment_fields import child_chunk_fields, segment_fields
+from models.dataset import Dataset
 from services.dataset_service import DatasetService, DocumentService, SegmentService
 from services.entities.knowledge_entities.knowledge_entities import SegmentUpdateArgs
+from services.errors.chunk import (
+    ChildChunkDeleteIndexError,
+    ChildChunkIndexingError,
+)
+from services.errors.chunk import (
+    ChildChunkDeleteIndexError as ChildChunkDeleteIndexServiceError,
+)
+from services.errors.chunk import (
+    ChildChunkIndexingError as ChildChunkIndexingServiceError,
+)


 class SegmentApi(DatasetApiResource):
@@ -70,10 +81,12 @@ class SegmentApi(DatasetApiResource):
             return {"error": "Segments is required"}, 400

     def get(self, tenant_id, dataset_id, document_id):
-        """Create single segment."""
+        """Get segments."""
         # check dataset
         dataset_id = str(dataset_id)
         tenant_id = str(tenant_id)
+        page = request.args.get("page", default=1, type=int)
+        limit = request.args.get("limit", default=20, type=int)
         dataset = db.session.query(Dataset).filter(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
         if not dataset:
             raise NotFound("Dataset not found.")
@@ -107,19 +120,23 @@ class SegmentApi(DatasetApiResource):
         status_list = args["status"]
         keyword = args["keyword"]

-        query = DocumentSegment.query.filter(
-            DocumentSegment.document_id == str(document_id), DocumentSegment.tenant_id == current_user.current_tenant_id
+        segments, total = SegmentService.get_segments(
+            document_id=document_id,
+            tenant_id=current_user.current_tenant_id,
+            status_list=args["status"],
+            keyword=args["keyword"],
         )

-        if status_list:
-            query = query.filter(DocumentSegment.status.in_(status_list))
+        response = {
+            "data": marshal(segments, segment_fields),
+            "doc_form": document.doc_form,
+            "total": total,
+            "has_more": len(segments) == limit,
+            "limit": limit,
+            "page": page,
+        }

-        if keyword:
-            query = query.where(DocumentSegment.content.ilike(f"%{keyword}%"))
-
-        total = query.count()
-        segments = query.order_by(DocumentSegment.position).all()
-
-        return {"data": marshal(segments, segment_fields), "doc_form": document.doc_form, "total": total}, 200
+        return response, 200


 class DatasetSegmentApi(DatasetApiResource):
@@ -138,9 +155,8 @@ class DatasetSegmentApi(DatasetApiResource):
         if not document:
             raise NotFound("Document not found.")
         # check segment
-        segment = DocumentSegment.query.filter(
-            DocumentSegment.id == str(segment_id), DocumentSegment.tenant_id == current_user.current_tenant_id
-        ).first()
+        segment_id = str(segment_id)
+        segment = SegmentService.get_segment_by_id(segment_id=segment_id, tenant_id=current_user.current_tenant_id)
         if not segment:
             raise NotFound("Segment not found.")
         SegmentService.delete_segment(segment, document, dataset)
@@ -179,9 +195,7 @@ class DatasetSegmentApi(DatasetApiResource):
             raise ProviderNotInitializeError(ex.description)
         # check segment
         segment_id = str(segment_id)
-        segment = DocumentSegment.query.filter(
-            DocumentSegment.id == str(segment_id), DocumentSegment.tenant_id == current_user.current_tenant_id
-        ).first()
+        segment = SegmentService.get_segment_by_id(segment_id=segment_id, tenant_id=current_user.current_tenant_id)
         if not segment:
             raise NotFound("Segment not found.")

@@ -190,12 +204,200 @@ class DatasetSegmentApi(DatasetApiResource):
         parser.add_argument("segment", type=dict, required=False, nullable=True, location="json")
         args = parser.parse_args()

-        SegmentService.segment_create_args_validate(args["segment"], document)
-        segment = SegmentService.update_segment(SegmentUpdateArgs(**args["segment"]), segment, document, dataset)
-        return {"data": marshal(segment, segment_fields), "doc_form": document.doc_form}, 200
+        updated_segment = SegmentService.update_segment(
+            SegmentUpdateArgs(**args["segment"]), segment, document, dataset
+        )
+        return {"data": marshal(updated_segment, segment_fields), "doc_form": document.doc_form}, 200
+
+
+class ChildChunkApi(DatasetApiResource):
+    """Resource for child chunks."""
+
+    @cloud_edition_billing_resource_check("vector_space", "dataset")
+    @cloud_edition_billing_knowledge_limit_check("add_segment", "dataset")
+    def post(self, tenant_id, dataset_id, document_id, segment_id):
+        """Create child chunk."""
+        # check dataset
+        dataset_id = str(dataset_id)
+        tenant_id = str(tenant_id)
+        dataset = db.session.query(Dataset).filter(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
+        if not dataset:
+            raise NotFound("Dataset not found.")
+        # check document
+        document_id = str(document_id)
+        document = DocumentService.get_document(dataset.id, document_id)
+        if not document:
+            raise NotFound("Document not found.")
+        # check segment
+        segment_id = str(segment_id)
+        segment = SegmentService.get_segment_by_id(segment_id=segment_id, tenant_id=current_user.current_tenant_id)
+        if not segment:
+            raise NotFound("Segment not found.")
+        # check embedding model setting
+        if dataset.indexing_technique == "high_quality":
+            try:
+                model_manager = ModelManager()
+                model_manager.get_model_instance(
+                    tenant_id=current_user.current_tenant_id,
+                    provider=dataset.embedding_model_provider,
+                    model_type=ModelType.TEXT_EMBEDDING,
+                    model=dataset.embedding_model,
+                )
+            except LLMBadRequestError:
+                raise ProviderNotInitializeError(
+                    "No Embedding Model available. Please configure a valid provider in the Settings -> Model Provider."
+                )
+            except ProviderTokenNotInitError as ex:
+                raise ProviderNotInitializeError(ex.description)
+        # validate args
+        parser = reqparse.RequestParser()
+        parser.add_argument("content", type=str, required=True, nullable=False, location="json")
+        args = parser.parse_args()
+        try:
+            child_chunk = SegmentService.create_child_chunk(args.get("content"), segment, document, dataset)
+        except ChildChunkIndexingServiceError as e:
+            raise ChildChunkIndexingError(str(e))
+        return {"data": marshal(child_chunk, child_chunk_fields)}, 200
+
+    def get(self, tenant_id, dataset_id, document_id, segment_id):
+        """Get child chunks."""
+        # check dataset
+        dataset_id = str(dataset_id)
+        tenant_id = str(tenant_id)
+        dataset = db.session.query(Dataset).filter(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
+        if not dataset:
+            raise NotFound("Dataset not found.")
+        # check document
+        document_id = str(document_id)
+        document = DocumentService.get_document(dataset.id, document_id)
+        if not document:
+            raise NotFound("Document not found.")
+        # check segment
+        segment_id = str(segment_id)
+        segment = SegmentService.get_segment_by_id(segment_id=segment_id, tenant_id=current_user.current_tenant_id)
+        if not segment:
+            raise NotFound("Segment not found.")
+        parser = reqparse.RequestParser()
+        parser.add_argument("limit", type=int, default=20, location="args")
+        parser.add_argument("keyword", type=str, default=None, location="args")
+        parser.add_argument("page", type=int, default=1, location="args")
+        args = parser.parse_args()
+        page = args["page"]
+        limit = min(args["limit"], 100)
+        keyword = args["keyword"]
+        child_chunks = SegmentService.get_child_chunks(segment_id, document_id, dataset_id, page, limit, keyword)
+        return {
+            "data": marshal(child_chunks.items, child_chunk_fields),
+            "total": child_chunks.total,
+            "total_pages": child_chunks.pages,
+            "page": page,
+            "limit": limit,
+        }, 200
+
+
+class DatasetChildChunkApi(DatasetApiResource):
+    """Resource for updating child chunks."""
+
+    @cloud_edition_billing_knowledge_limit_check("add_segment", "dataset")
+    def delete(self, tenant_id, dataset_id, document_id, segment_id, child_chunk_id):
+        """Delete child chunk."""
+        # check dataset
+        dataset_id = str(dataset_id)
+        tenant_id = str(tenant_id)
+        dataset = db.session.query(Dataset).filter(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
+        if not dataset:
+            raise NotFound("Dataset not found.")
+        # check document
+        document_id = str(document_id)
+        document = DocumentService.get_document(dataset.id, document_id)
+        if not document:
+            raise NotFound("Document not found.")
+        # check segment
+        segment_id = str(segment_id)
+        segment = SegmentService.get_segment_by_id(segment_id=segment_id, tenant_id=current_user.current_tenant_id)
+        if not segment:
+            raise NotFound("Segment not found.")
+        # check child chunk
+        child_chunk_id = str(child_chunk_id)
+        child_chunk = SegmentService.get_child_chunk_by_id(
+            child_chunk_id=child_chunk_id, tenant_id=current_user.current_tenant_id
+        )
+        if not child_chunk:
+            raise NotFound("Child chunk not found.")
+        try:
+            SegmentService.delete_child_chunk(child_chunk, dataset)
+        except ChildChunkDeleteIndexServiceError as e:
+            raise ChildChunkDeleteIndexError(str(e))
+        return {"result": "success"}, 200
+
+    @cloud_edition_billing_resource_check("vector_space", "dataset")
+    @cloud_edition_billing_knowledge_limit_check("add_segment", "dataset")
+    def patch(self, tenant_id, dataset_id, document_id, segment_id, child_chunk_id):
+        """Update child chunk."""
+        # check dataset
+        dataset_id = str(dataset_id)
+        tenant_id = str(tenant_id)
+        dataset = db.session.query(Dataset).filter(Dataset.tenant_id == tenant_id, Dataset.id == dataset_id).first()
+        if not dataset:
+            raise NotFound("Dataset not found.")
+        # get document
+        document = DocumentService.get_document(dataset_id, document_id)
+        if not document:
+            raise NotFound("Document not found.")
+        # get segment
+        segment = SegmentService.get_segment_by_id(segment_id=segment_id, tenant_id=current_user.current_tenant_id)
+        if not segment:
+            raise NotFound("Segment not found.")
+        # get child chunk
+        child_chunk = SegmentService.get_child_chunk_by_id(
+            child_chunk_id=child_chunk_id, tenant_id=current_user.current_tenant_id
+        )
+        if not child_chunk:
+            raise NotFound("Child chunk not found.")
+        # validate args
+        parser = reqparse.RequestParser()
+        parser.add_argument("content", type=str, required=True, nullable=False, location="json")
+        args = parser.parse_args()
+        try:
+            child_chunk = SegmentService.update_child_chunk(
+                args.get("content"), child_chunk, segment, document, dataset
+            )
+        except ChildChunkIndexingServiceError as e:
+            raise ChildChunkIndexingError(str(e))
+        return {"data": marshal(child_chunk, child_chunk_fields)}, 200
 api.add_resource(SegmentApi, "/datasets/<uuid:dataset_id>/documents/<uuid:document_id>/segments")
 api.add_resource(
     DatasetSegmentApi, "/datasets/<uuid:dataset_id>/documents/<uuid:document_id>/segments/<uuid:segment_id>"
 )
+api.add_resource(
+    ChildChunkApi, "/datasets/<uuid:dataset_id>/documents/<uuid:document_id>/segments/<uuid:segment_id>/child_chunks"
+)
+api.add_resource(
+    DatasetChildChunkApi,
+    "/datasets/<uuid:dataset_id>/documents/<uuid:document_id>/segments/<uuid:segment_id>/child_chunks/<uuid:child_chunk_id>",
+)
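For illustration, a hedged client-side sketch of the child-chunk endpoints registered above; the base URL, the placeholder IDs, and the Bearer auth header are assumptions rather than values taken from this commit:

import requests

BASE = "https://api.example.com/v1"  # assumed service API base URL
HEADERS = {"Authorization": "Bearer <dataset-api-key>"}  # assumed auth scheme
SEGMENT = f"{BASE}/datasets/<dataset_id>/documents/<document_id>/segments/<segment_id>"

# Create a child chunk under a segment (ChildChunkApi.post expects a JSON "content" field).
created = requests.post(f"{SEGMENT}/child_chunks", headers=HEADERS, json={"content": "A new child chunk."})

# List child chunks with the paging arguments accepted by ChildChunkApi.get.
listed = requests.get(f"{SEGMENT}/child_chunks", headers=HEADERS, params={"page": 1, "limit": 20})

print(created.status_code, listed.json().get("total"))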

View File

@@ -332,7 +332,7 @@ class BaseAgentRunner(AppRunner):
             agent_thought = updated_agent_thought

         if thought:
-            agent_thought.thought = thought
+            agent_thought.thought += thought

         if tool_name:
             agent_thought.tool = tool_name

View File

@@ -16,7 +16,6 @@ class ModelConfigConverter:
         """
         Convert app model config dict to entity.
         :param app_config: app config
-        :param skip_check: skip check
         :raises ProviderTokenNotInitError: provider token not init error
         :return: app orchestration config entity
         """

View File

@@ -88,7 +88,7 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
         :param user: account or end user
         :param args: request args
         :param invoke_from: invoke from source
-        :param stream: is stream
+        :param streaming: is stream
         """
         if not args.get("query"):
             raise ValueError("query is required")
@@ -181,10 +181,10 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
         :param app_model: App
         :param workflow: Workflow
+        :param node_id: the node id
         :param user: account or end user
         :param args: request args
-        :param invoke_from: invoke from source
-        :param stream: is stream
+        :param streaming: is streamed
         """
         if not node_id:
             raise ValueError("node_id is required")
@@ -238,10 +238,10 @@ class AdvancedChatAppGenerator(MessageBasedAppGenerator):
         :param app_model: App
         :param workflow: Workflow
+        :param node_id: the node id
         :param user: account or end user
         :param args: request args
-        :param invoke_from: invoke from source
-        :param stream: is stream
+        :param streaming: is stream
         """
         if not node_id:
             raise ValueError("node_id is required")

View File

@@ -80,7 +80,7 @@ class AgentChatAppGenerator(MessageBasedAppGenerator):
         :param user: account or end user
         :param args: request args
         :param invoke_from: invoke from source
-        :param stream: is stream
+        :param streaming: is stream
         """
         if not streaming:
             raise ValueError("Agent Chat App does not support blocking mode")

View File

@@ -157,6 +157,7 @@ class AppRunner:
         :param files: files
         :param query: query
         :param memory: memory
+        :param image_detail_config: the image quality config
         :return:
         """
         # get prompt without memory and context

View File

@@ -76,7 +76,7 @@ class ChatAppGenerator(MessageBasedAppGenerator):
         :param user: account or end user
         :param args: request args
         :param invoke_from: invoke from source
-        :param stream: is stream
+        :param streaming: is stream
         """
         if not args.get("query"):
             raise ValueError("query is required")

View File

@@ -74,7 +74,7 @@ class CompletionAppGenerator(MessageBasedAppGenerator):
         :param user: account or end user
         :param args: request args
         :param invoke_from: invoke from source
-        :param stream: is stream
+        :param streaming: is stream
         """
         query = args["query"]
         if not isinstance(query, str):

View File

@ -148,6 +148,13 @@ class MessageBasedAppGenerator(BaseAppGenerator):
# get conversation introduction # get conversation introduction
introduction = self._get_conversation_introduction(application_generate_entity) introduction = self._get_conversation_introduction(application_generate_entity)
# get conversation name
if isinstance(application_generate_entity, AdvancedChatAppGenerateEntity):
query = application_generate_entity.query or "New conversation"
else:
query = next(iter(application_generate_entity.inputs.values()), "New conversation")
conversation_name = (query[:20] + "…") if len(query) > 20 else query
if not conversation: if not conversation:
conversation = Conversation( conversation = Conversation(
app_id=app_config.app_id, app_id=app_config.app_id,
@ -156,7 +163,7 @@ class MessageBasedAppGenerator(BaseAppGenerator):
model_id=model_id, model_id=model_id,
override_model_configs=json.dumps(override_model_configs) if override_model_configs else None, override_model_configs=json.dumps(override_model_configs) if override_model_configs else None,
mode=app_config.app_mode.value, mode=app_config.app_mode.value,
name="New conversation", name=conversation_name,
inputs=application_generate_entity.inputs, inputs=application_generate_entity.inputs,
introduction=introduction, introduction=introduction,
system_instruction="", system_instruction="",

View File

@ -158,7 +158,7 @@ class WorkflowAppGenerator(BaseAppGenerator):
:param user: account or end user :param user: account or end user
:param application_generate_entity: application generate entity :param application_generate_entity: application generate entity
:param invoke_from: invoke from source :param invoke_from: invoke from source
:param stream: is stream :param streaming: is stream
:param workflow_thread_pool_id: workflow thread pool id :param workflow_thread_pool_id: workflow thread pool id
""" """
# init queue manager # init queue manager
@ -208,10 +208,10 @@ class WorkflowAppGenerator(BaseAppGenerator):
:param app_model: App :param app_model: App
:param workflow: Workflow :param workflow: Workflow
:param node_id: the node id
:param user: account or end user :param user: account or end user
:param args: request args :param args: request args
:param invoke_from: invoke from source :param streaming: is streamed
:param stream: is stream
""" """
if not node_id: if not node_id:
raise ValueError("node_id is required") raise ValueError("node_id is required")
@ -264,10 +264,10 @@ class WorkflowAppGenerator(BaseAppGenerator):
:param app_model: App :param app_model: App
:param workflow: Workflow :param workflow: Workflow
:param node_id: the node id
:param user: account or end user :param user: account or end user
:param args: request args :param args: request args
:param invoke_from: invoke from source :param streaming: is streamed
:param stream: is stream
""" """
if not node_id: if not node_id:
raise ValueError("node_id is required") raise ValueError("node_id is required")

View File

@ -44,9 +44,6 @@ class WorkflowAppRunner(WorkflowBasedAppRunner):
def run(self) -> None: def run(self) -> None:
""" """
Run application Run application
:param application_generate_entity: application generate entity
:param queue_manager: application queue manager
:return:
""" """
app_config = self.application_generate_entity.app_config app_config = self.application_generate_entity.app_config
app_config = cast(WorkflowAppConfig, app_config) app_config = cast(WorkflowAppConfig, app_config)

View File

@ -48,7 +48,7 @@ class MessageCycleManage:
def _generate_conversation_name(self, *, conversation_id: str, query: str) -> Optional[Thread]: def _generate_conversation_name(self, *, conversation_id: str, query: str) -> Optional[Thread]:
""" """
Generate conversation name. Generate conversation name.
:param conversation: conversation :param conversation_id: conversation id
:param query: query :param query: query
:return: thread :return: thread
""" """

View File

@ -44,6 +44,7 @@ from core.app.entities.task_entities import (
WorkflowFinishStreamResponse, WorkflowFinishStreamResponse,
WorkflowStartStreamResponse, WorkflowStartStreamResponse,
) )
from core.app.task_pipeline.exc import WorkflowRunNotFoundError
from core.file import FILE_MODEL_IDENTITY, File from core.file import FILE_MODEL_IDENTITY, File
from core.model_runtime.utils.encoders import jsonable_encoder from core.model_runtime.utils.encoders import jsonable_encoder
from core.ops.entities.trace_entity import TraceTaskName from core.ops.entities.trace_entity import TraceTaskName
@ -66,8 +67,6 @@ from models.workflow import (
WorkflowRunStatus, WorkflowRunStatus,
) )
from .exc import WorkflowRunNotFoundError
class WorkflowCycleManage: class WorkflowCycleManage:
def __init__( def __init__(
@ -154,7 +153,7 @@ class WorkflowCycleManage:
) -> WorkflowRun: ) -> WorkflowRun:
""" """
Workflow run success Workflow run success
:param workflow_run: workflow run :param workflow_run_id: workflow run id
:param start_at: start time :param start_at: start time
:param total_tokens: total tokens :param total_tokens: total tokens
:param total_steps: total steps :param total_steps: total steps
@ -166,7 +165,7 @@ class WorkflowCycleManage:
outputs = WorkflowEntry.handle_special_values(outputs) outputs = WorkflowEntry.handle_special_values(outputs)
workflow_run.status = WorkflowRunStatus.SUCCEEDED.value workflow_run.status = WorkflowRunStatus.SUCCEEDED
workflow_run.outputs = json.dumps(outputs or {}) workflow_run.outputs = json.dumps(outputs or {})
workflow_run.elapsed_time = time.perf_counter() - start_at workflow_run.elapsed_time = time.perf_counter() - start_at
workflow_run.total_tokens = total_tokens workflow_run.total_tokens = total_tokens
@ -201,7 +200,7 @@ class WorkflowCycleManage:
workflow_run = self._get_workflow_run(session=session, workflow_run_id=workflow_run_id) workflow_run = self._get_workflow_run(session=session, workflow_run_id=workflow_run_id)
outputs = WorkflowEntry.handle_special_values(dict(outputs) if outputs else None) outputs = WorkflowEntry.handle_special_values(dict(outputs) if outputs else None)
workflow_run.status = WorkflowRunStatus.PARTIAL_SUCCESSED.value workflow_run.status = WorkflowRunStatus.PARTIAL_SUCCEEDED.value
workflow_run.outputs = json.dumps(outputs or {}) workflow_run.outputs = json.dumps(outputs or {})
workflow_run.elapsed_time = time.perf_counter() - start_at workflow_run.elapsed_time = time.perf_counter() - start_at
workflow_run.total_tokens = total_tokens workflow_run.total_tokens = total_tokens
@ -237,7 +236,7 @@ class WorkflowCycleManage:
) -> WorkflowRun: ) -> WorkflowRun:
""" """
Workflow run failed Workflow run failed
:param workflow_run: workflow run :param workflow_run_id: workflow run id
:param start_at: start time :param start_at: start time
:param total_tokens: total tokens :param total_tokens: total tokens
:param total_steps: total steps :param total_steps: total steps

View File

@ -4,12 +4,10 @@ import time
from typing import Optional from typing import Optional
from configs import dify_config from configs import dify_config
from constants import IMAGE_EXTENSIONS
from core.helper.url_signer import UrlSigner from core.helper.url_signer import UrlSigner
from extensions.ext_storage import storage from extensions.ext_storage import storage
IMAGE_EXTENSIONS = ["jpg", "jpeg", "png", "webp", "gif", "svg"]
IMAGE_EXTENSIONS.extend([ext.upper() for ext in IMAGE_EXTENSIONS])
class UploadFileParser: class UploadFileParser:
@classmethod @classmethod
@ -38,7 +36,7 @@ class UploadFileParser:
""" """
get signed url from upload file get signed url from upload file
:param upload_file: UploadFile object :param upload_file_id: the id of UploadFile object
:return: :return:
""" """
base_url = dify_config.FILES_URL base_url = dify_config.FILES_URL

View File

@ -60,6 +60,7 @@ class CodeExecutor:
""" """
Execute code Execute code
:param language: code language :param language: code language
:param preload: the preload script
:param code: code :param code: code
:return: :return:
""" """

View File

@ -53,7 +53,7 @@ def pin_position_map(original_position_map: dict[str, int], pin_list: list[str])
""" """
Pin the items in the pin list to the beginning of the position map. Pin the items in the pin list to the beginning of the position map.
Overall logic: exclude > include > pin Overall logic: exclude > include > pin
:param position_map: the position map to be sorted and filtered :param original_position_map: the position map to be sorted and filtered
:param pin_list: the list of pins to be put at the beginning :param pin_list: the list of pins to be put at the beginning
:return: the sorted position map :return: the sorted position map
""" """

View File

@ -38,12 +38,7 @@ class ToolParameterCache:
return None return None
def set(self, parameters: dict) -> None: def set(self, parameters: dict) -> None:
""" """Cache model provider credentials."""
Cache model provider credentials.
:param credentials: provider credentials
:return:
"""
redis_client.setex(self.cache_key, 86400, json.dumps(parameters)) redis_client.setex(self.cache_key, 86400, json.dumps(parameters))
def delete(self) -> None: def delete(self) -> None:
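The trimmed docstring reflects what set() actually does: serialize the parameters and cache them for 24 hours. A minimal usage sketch (the cache-key string is illustrative, not the real key format; redis_client is Dify's shared Redis handle):

import json
from extensions.ext_redis import redis_client

cache_key = "tool_parameters:example-tenant:example-tool"  # illustrative key
redis_client.setex(cache_key, 86400, json.dumps({"api_key": "***"}))  # 86400 s = 24 h TTL
cached = redis_client.get(cache_key)
parameters = json.loads(cached) if cached else None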

View File

@ -187,7 +187,7 @@ class IndexingRunner:
}, },
) )
if dataset_document.doc_form == IndexType.PARENT_CHILD_INDEX: if dataset_document.doc_form == IndexType.PARENT_CHILD_INDEX:
child_chunks = document_segment.child_chunks child_chunks = document_segment.get_child_chunks()
if child_chunks: if child_chunks:
child_documents = [] child_documents = []
for child_chunk in child_chunks: for child_chunk in child_chunks:

View File

@ -1,6 +1,6 @@
# Written by YORKI MINAKO🤡, Edited by Xiaoyi # Written by YORKI MINAKO🤡, Edited by Xiaoyi
CONVERSATION_TITLE_PROMPT = """You need to decompose the user's input into "subject" and "intention" in order to accurately figure out what the user's input language actually is. CONVERSATION_TITLE_PROMPT = """You need to decompose the user's input into "subject" and "intention" in order to accurately figure out what the user's input language actually is.
Notice: the language type user use could be diverse, which can be English, Chinese, Español, Arabic, Japanese, French, and etc. Notice: the language type user use could be diverse, which can be English, Chinese, Italian, Español, Arabic, Japanese, French, and etc.
MAKE SURE your output is the SAME language as the user's input! MAKE SURE your output is the SAME language as the user's input!
Your output is restricted only to: (Input language) Intention + Subject(short as possible) Your output is restricted only to: (Input language) Intention + Subject(short as possible)
Your output MUST be a valid JSON. Your output MUST be a valid JSON.

View File

@ -38,7 +38,6 @@ class TTSModel(AIModel):
:param credentials: model credentials :param credentials: model credentials
:param voice: model timbre :param voice: model timbre
:param content_text: text content to be translated :param content_text: text content to be translated
:param streaming: output is streaming
:param user: unique user id :param user: unique user id
:return: translated audio file :return: translated audio file
""" """

View File

@ -1,170 +0,0 @@
from collections.abc import Mapping
from typing import Optional
import openai
from httpx import Timeout
from openai import OpenAI
from openai.types import ModerationCreateResponse
from core.model_runtime.entities.model_entities import ModelPropertyKey
from core.model_runtime.errors.invoke import (
InvokeAuthorizationError,
InvokeBadRequestError,
InvokeConnectionError,
InvokeError,
InvokeRateLimitError,
InvokeServerUnavailableError,
)
from core.model_runtime.errors.validate import CredentialsValidateFailedError
from core.model_runtime.model_providers.__base.moderation_model import ModerationModel
class OpenAIModerationModel(ModerationModel):
"""
Model class for OpenAI text moderation model.
"""
def _invoke(self, model: str, credentials: dict, text: str, user: Optional[str] = None) -> bool:
"""
Invoke moderation model
:param model: model name
:param credentials: model credentials
:param text: text to moderate
:param user: unique user id
:return: false if text is safe, true otherwise
"""
# transform credentials to kwargs for model instance
credentials_kwargs = self._to_credential_kwargs(credentials)
# init model client
client = OpenAI(**credentials_kwargs)
# chars per chunk
length = self._get_max_characters_per_chunk(model, credentials)
text_chunks = [text[i : i + length] for i in range(0, len(text), length)]
max_text_chunks = self._get_max_chunks(model, credentials)
chunks = [text_chunks[i : i + max_text_chunks] for i in range(0, len(text_chunks), max_text_chunks)]
for text_chunk in chunks:
moderation_result = self._moderation_invoke(model=model, client=client, texts=text_chunk)
for result in moderation_result.results:
if result.flagged is True:
return True
return False
def validate_credentials(self, model: str, credentials: dict) -> None:
"""
Validate model credentials
:param model: model name
:param credentials: model credentials
:return:
"""
try:
# transform credentials to kwargs for model instance
credentials_kwargs = self._to_credential_kwargs(credentials)
client = OpenAI(**credentials_kwargs)
# call moderation model
self._moderation_invoke(
model=model,
client=client,
texts=["ping"],
)
except Exception as ex:
raise CredentialsValidateFailedError(str(ex))
def _moderation_invoke(self, model: str, client: OpenAI, texts: list[str]) -> ModerationCreateResponse:
"""
Invoke moderation model
:param model: model name
:param client: model client
:param texts: texts to moderate
:return: false if text is safe, true otherwise
"""
# call moderation model
moderation_result = client.moderations.create(model=model, input=texts)
return moderation_result
def _get_max_characters_per_chunk(self, model: str, credentials: dict) -> int:
"""
Get max characters per chunk
:param model: model name
:param credentials: model credentials
:return: max characters per chunk
"""
model_schema = self.get_model_schema(model, credentials)
if model_schema and ModelPropertyKey.MAX_CHARACTERS_PER_CHUNK in model_schema.model_properties:
max_characters_per_chunk: int = model_schema.model_properties[ModelPropertyKey.MAX_CHARACTERS_PER_CHUNK]
return max_characters_per_chunk
return 2000
def _get_max_chunks(self, model: str, credentials: dict) -> int:
"""
Get max chunks for given embedding model
:param model: model name
:param credentials: model credentials
:return: max chunks
"""
model_schema = self.get_model_schema(model, credentials)
if model_schema and ModelPropertyKey.MAX_CHUNKS in model_schema.model_properties:
max_chunks: int = model_schema.model_properties[ModelPropertyKey.MAX_CHUNKS]
return max_chunks
return 1
def _to_credential_kwargs(self, credentials: Mapping) -> dict:
"""
Transform credentials to kwargs for model instance
:param credentials:
:return:
"""
credentials_kwargs = {
"api_key": credentials["openai_api_key"],
"timeout": Timeout(315.0, read=300.0, write=10.0, connect=5.0),
"max_retries": 1,
}
if credentials.get("openai_api_base"):
openai_api_base = credentials["openai_api_base"].rstrip("/")
credentials_kwargs["base_url"] = openai_api_base + "/v1"
if "openai_organization" in credentials:
credentials_kwargs["organization"] = credentials["openai_organization"]
return credentials_kwargs
@property
def _invoke_error_mapping(self) -> dict[type[InvokeError], list[type[Exception]]]:
"""
Map model invoke error to unified error
The key is the error type thrown to the caller
The value is the error type thrown by the model,
which needs to be converted into a unified error type for the caller.
:return: Invoke error mapping
"""
return {
InvokeConnectionError: [openai.APIConnectionError, openai.APITimeoutError],
InvokeServerUnavailableError: [openai.InternalServerError],
InvokeRateLimitError: [openai.RateLimitError],
InvokeAuthorizationError: [openai.AuthenticationError, openai.PermissionDeniedError],
InvokeBadRequestError: [
openai.BadRequestError,
openai.NotFoundError,
openai.UnprocessableEntityError,
openai.APIError,
],
}

View File

@ -1,22 +0,0 @@
- claude-3-haiku@20240307
- claude-3-opus@20240229
- claude-3-sonnet@20240229
- claude-3-5-sonnet-v2@20241022
- claude-3-5-sonnet@20240620
- gemini-1.0-pro-vision-001
- gemini-1.0-pro-002
- gemini-1.5-flash-001
- gemini-1.5-flash-002
- gemini-1.5-pro-001
- gemini-1.5-pro-002
- gemini-2.0-flash-001
- gemini-2.0-flash-exp
- gemini-2.0-flash-lite-preview-02-05
- gemini-2.0-flash-thinking-exp-01-21
- gemini-2.0-flash-thinking-exp-1219
- gemini-2.0-pro-exp-02-05
- gemini-exp-1114
- gemini-exp-1121
- gemini-exp-1206
- gemini-flash-experimental
- gemini-pro-experimental

View File

@ -1,41 +0,0 @@
model: gemini-2.0-flash-001
label:
en_US: Gemini 2.0 Flash 001
model_type: llm
features:
- agent-thought
- vision
- tool-call
- stream-tool-call
- document
- video
- audio
model_properties:
mode: chat
context_size: 1048576
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: max_output_tokens
use_template: max_tokens
default: 8192
min: 1
max: 8192
- name: json_schema
use_template: json_schema
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -1,41 +0,0 @@
model: gemini-2.0-flash-lite-preview-02-05
label:
en_US: Gemini 2.0 Flash Lite Preview 0205
model_type: llm
features:
- agent-thought
- vision
- tool-call
- stream-tool-call
- document
- video
- audio
model_properties:
mode: chat
context_size: 1048576
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: max_output_tokens
use_template: max_tokens
default: 8192
min: 1
max: 8192
- name: json_schema
use_template: json_schema
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -1,39 +0,0 @@
model: gemini-2.0-flash-thinking-exp-01-21
label:
en_US: Gemini 2.0 Flash Thinking Exp 0121
model_type: llm
features:
- agent-thought
- vision
- document
- video
- audio
model_properties:
mode: chat
context_size: 32767
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: max_output_tokens
use_template: max_tokens
default: 8192
min: 1
max: 8192
- name: json_schema
use_template: json_schema
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -1,39 +0,0 @@
model: gemini-2.0-flash-thinking-exp-1219
label:
en_US: Gemini 2.0 Flash Thinking Exp 1219
model_type: llm
features:
- agent-thought
- vision
- document
- video
- audio
model_properties:
mode: chat
context_size: 32767
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: max_output_tokens
use_template: max_tokens
default: 8192
min: 1
max: 8192
- name: json_schema
use_template: json_schema
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -1,37 +0,0 @@
model: gemini-2.0-pro-exp-02-05
label:
en_US: Gemini 2.0 Pro Exp 0205
model_type: llm
features:
- agent-thought
- document
model_properties:
mode: chat
context_size: 2000000
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
en_US: Top k
type: int
help:
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: presence_penalty
use_template: presence_penalty
- name: frequency_penalty
use_template: frequency_penalty
- name: max_output_tokens
use_template: max_tokens
required: true
default: 8192
min: 1
max: 8192
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -1,41 +0,0 @@
model: gemini-exp-1114
label:
en_US: Gemini exp 1114
model_type: llm
features:
- agent-thought
- vision
- tool-call
- stream-tool-call
- document
- video
- audio
model_properties:
mode: chat
context_size: 32767
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: max_output_tokens
use_template: max_tokens
default: 8192
min: 1
max: 8192
- name: json_schema
use_template: json_schema
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -1,41 +0,0 @@
model: gemini-exp-1121
label:
en_US: Gemini exp 1121
model_type: llm
features:
- agent-thought
- vision
- tool-call
- stream-tool-call
- document
- video
- audio
model_properties:
mode: chat
context_size: 32767
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: max_output_tokens
use_template: max_tokens
default: 8192
min: 1
max: 8192
- name: json_schema
use_template: json_schema
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -1,41 +0,0 @@
model: gemini-exp-1206
label:
en_US: Gemini exp 1206
model_type: llm
features:
- agent-thought
- vision
- tool-call
- stream-tool-call
- document
- video
- audio
model_properties:
mode: chat
context_size: 2097152
parameter_rules:
- name: temperature
use_template: temperature
- name: top_p
use_template: top_p
- name: top_k
label:
zh_Hans: 取样数量
en_US: Top k
type: int
help:
zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
en_US: Only sample from the top K options for each subsequent token.
required: false
- name: max_output_tokens
use_template: max_tokens
default: 8192
min: 1
max: 8192
- name: json_schema
use_template: json_schema
pricing:
input: '0.00'
output: '0.00'
unit: '0.000001'
currency: USD

View File

@ -1,66 +0,0 @@
model: glm-4-air-0111
label:
en_US: glm-4-air-0111
model_type: llm
features:
- multi-tool-call
- agent-thought
- stream-tool-call
model_properties:
mode: chat
context_size: 131072
parameter_rules:
- name: temperature
use_template: temperature
default: 0.95
min: 0.0
max: 1.0
help:
zh_Hans: 采样温度,控制输出的随机性,必须为正数取值范围是:(0.0,1.0],不能等于 0,默认值为 0.95 值越大,会使输出更随机,更具创造性;值越小,输出会更加稳定或确定建议您根据应用场景调整 top_p 或 temperature 参数,但不要同时调整两个参数。
en_US: Sampling temperature, controls the randomness of the output, must be a positive number. The value range is (0.0,1.0], which cannot be equal to 0. The default value is 0.95. The larger the value, the more random and creative the output will be; the smaller the value, The output will be more stable or certain. It is recommended that you adjust the top_p or temperature parameters according to the application scenario, but do not adjust both parameters at the same time.
- name: top_p
use_template: top_p
default: 0.7
help:
zh_Hans: 用温度取样的另一种方法,称为核取样取值范围是:(0.0, 1.0) 开区间,不能等于 0 或 1默认值为 0.7 模型考虑具有 top_p 概率质量tokens的结果例如0.1 意味着模型解码器只考虑从前 10% 的概率的候选集中取 tokens 建议您根据应用场景调整 top_p 或 temperature 参数,但不要同时调整两个参数。
en_US: Another method of temperature sampling is called kernel sampling. The value range is (0.0, 1.0) open interval, which cannot be equal to 0 or 1. The default value is 0.7. The model considers the results with top_p probability mass tokens. For example 0.1 means The model decoder only considers tokens from the candidate set with the top 10% probability. It is recommended that you adjust the top_p or temperature parameters according to the application scenario, but do not adjust both parameters at the same time.
- name: do_sample
label:
zh_Hans: 采样策略
en_US: Sampling strategy
type: boolean
help:
zh_Hans: do_sample 为 true 时启用采样策略do_sample 为 false 时采样策略 temperature、top_p 将不生效。默认值为 true。
en_US: When `do_sample` is set to true, the sampling strategy is enabled. When `do_sample` is set to false, the sampling strategies such as `temperature` and `top_p` will not take effect. The default value is true.
default: true
- name: max_tokens
use_template: max_tokens
default: 1024
min: 1
max: 4095
- name: web_search
type: boolean
label:
zh_Hans: 联网搜索
en_US: Web Search
default: false
help:
zh_Hans: 模型内置了互联网搜索服务,该参数控制模型在生成文本时是否参考使用互联网搜索结果。启用互联网搜索,模型会将搜索结果作为文本生成过程中的参考信息,但模型会基于其内部逻辑“自行判断”是否使用互联网搜索结果。
en_US: The model has a built-in Internet search service. This parameter controls whether the model refers to Internet search results when generating text. When Internet search is enabled, the model will use the search results as reference information in the text generation process, but the model will "judge" whether to use Internet search results based on its internal logic.
- name: response_format
label:
zh_Hans: 回复格式
en_US: Response Format
type: string
help:
zh_Hans: 指定模型必须输出的格式
en_US: specifying the format that the model must output
required: false
options:
- text
- json_object
pricing:
input: '0.0005'
output: '0.0005'
unit: '0.001'
currency: RMB

View File

@ -8,6 +8,7 @@ from datetime import timedelta
from typing import Any, Optional, Union from typing import Any, Optional, Union
from uuid import UUID, uuid4 from uuid import UUID, uuid4
from cachetools import LRUCache
from flask import current_app from flask import current_app
from sqlalchemy import select from sqlalchemy import select
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
@ -70,6 +71,8 @@ provider_config_map: dict[str, dict[str, Any]] = {
class OpsTraceManager: class OpsTraceManager:
ops_trace_instances_cache: LRUCache = LRUCache(maxsize=128)
@classmethod @classmethod
def encrypt_tracing_config( def encrypt_tracing_config(
cls, tenant_id: str, tracing_provider: str, tracing_config: dict, current_trace_config=None cls, tenant_id: str, tracing_provider: str, tracing_config: dict, current_trace_config=None
@ -204,29 +207,33 @@ class OpsTraceManager:
return None return None
app_ops_trace_config = json.loads(app.tracing) if app.tracing else None app_ops_trace_config = json.loads(app.tracing) if app.tracing else None
if app_ops_trace_config is None: if app_ops_trace_config is None:
return None return None
if not app_ops_trace_config.get("enabled"):
return None
tracing_provider = app_ops_trace_config.get("tracing_provider") tracing_provider = app_ops_trace_config.get("tracing_provider")
if tracing_provider is None or tracing_provider not in provider_config_map: if tracing_provider is None or tracing_provider not in provider_config_map:
return None return None
# decrypt_token # decrypt_token
decrypt_trace_config = cls.get_decrypted_tracing_config(app_id, tracing_provider) decrypt_trace_config = cls.get_decrypted_tracing_config(app_id, tracing_provider)
if app_ops_trace_config.get("enabled"): if not decrypt_trace_config:
return None
trace_instance, config_class = ( trace_instance, config_class = (
provider_config_map[tracing_provider]["trace_instance"], provider_config_map[tracing_provider]["trace_instance"],
provider_config_map[tracing_provider]["config_class"], provider_config_map[tracing_provider]["config_class"],
) )
if not decrypt_trace_config: decrypt_trace_config_key = str(decrypt_trace_config)
return None tracing_instance = cls.ops_trace_instances_cache.get(decrypt_trace_config_key)
if tracing_instance is None:
# create new tracing_instance and update the cache if it absent
tracing_instance = trace_instance(config_class(**decrypt_trace_config)) tracing_instance = trace_instance(config_class(**decrypt_trace_config))
cls.ops_trace_instances_cache[decrypt_trace_config_key] = tracing_instance
logging.info(f"new tracing_instance for app_id: {app_id}")
return tracing_instance return tracing_instance
return None
@classmethod @classmethod
def get_app_config_through_message_id(cls, message_id: str): def get_app_config_through_message_id(cls, message_id: str):
app_model_config = None app_model_config = None
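The cache keyed by the decrypted config avoids rebuilding a trace client on every request; a condensed sketch of the lookup-or-create pattern used above (names follow the diff, the free function itself is illustrative):

from cachetools import LRUCache

ops_trace_instances_cache: LRUCache = LRUCache(maxsize=128)

def get_or_create_tracing_instance(decrypt_trace_config: dict, trace_instance, config_class):
    # Apps that share the same decrypted credentials reuse a single client instance.
    key = str(decrypt_trace_config)
    instance = ops_trace_instances_cache.get(key)
    if instance is None:
        instance = trace_instance(config_class(**decrypt_trace_config))
        ops_trace_instances_cache[key] = instance
    return instance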

View File

@ -97,6 +97,7 @@ class RetrievalService:
all_documents=all_documents, all_documents=all_documents,
retrieval_method=retrieval_method, retrieval_method=retrieval_method,
exceptions=exceptions, exceptions=exceptions,
document_ids_filter=document_ids_filter,
) )
) )
concurrent.futures.wait(futures, timeout=30, return_when=concurrent.futures.ALL_COMPLETED) concurrent.futures.wait(futures, timeout=30, return_when=concurrent.futures.ALL_COMPLETED)
@ -222,6 +223,7 @@ class RetrievalService:
all_documents: list, all_documents: list,
retrieval_method: str, retrieval_method: str,
exceptions: list, exceptions: list,
document_ids_filter: Optional[list[str]] = None,
): ):
with flask_app.app_context(): with flask_app.app_context():
try: try:
@ -231,7 +233,9 @@ class RetrievalService:
vector_processor = Vector(dataset=dataset) vector_processor = Vector(dataset=dataset)
documents = vector_processor.search_by_full_text(cls.escape_query_for_search(query), top_k=top_k) documents = vector_processor.search_by_full_text(
cls.escape_query_for_search(query), top_k=top_k, document_ids_filter=document_ids_filter
)
if documents: if documents:
if ( if (
reranking_model reranking_model

View File

@ -102,8 +102,6 @@ class LindormVectorStore(BaseVector):
if response["errors"]: if response["errors"]:
for item in response["items"]: for item in response["items"]:
print(f"{item['index']['status']}: {item['index']['error']['type']}") print(f"{item['index']['status']}: {item['index']['error']['type']}")
else:
self.refresh()
def get_ids_by_metadata_field(self, key: str, value: str): def get_ids_by_metadata_field(self, key: str, value: str):
query: dict[str, Any] = { query: dict[str, Any] = {
@ -167,7 +165,7 @@ class LindormVectorStore(BaseVector):
if not all(isinstance(x, float) for x in query_vector): if not all(isinstance(x, float) for x in query_vector):
raise ValueError("All elements in query_vector should be floats") raise ValueError("All elements in query_vector should be floats")
top_k = kwargs.get("top_k", 10) top_k = kwargs.get("top_k", 3)
document_ids_filter = kwargs.get("document_ids_filter") document_ids_filter = kwargs.get("document_ids_filter")
filters = [] filters = []
if document_ids_filter: if document_ids_filter:
@ -210,7 +208,7 @@ class LindormVectorStore(BaseVector):
must_not = kwargs.get("must_not") must_not = kwargs.get("must_not")
should = kwargs.get("should") should = kwargs.get("should")
minimum_should_match = kwargs.get("minimum_should_match", 0) minimum_should_match = kwargs.get("minimum_should_match", 0)
top_k = kwargs.get("top_k", 10) top_k = kwargs.get("top_k", 3)
filters = kwargs.get("filter", []) filters = kwargs.get("filter", [])
document_ids_filter = kwargs.get("document_ids_filter") document_ids_filter = kwargs.get("document_ids_filter")
if document_ids_filter: if document_ids_filter:
@ -295,7 +293,7 @@ class LindormVectorStore(BaseVector):
def default_text_mapping(dimension: int, method_name: str, **kwargs: Any) -> dict: def default_text_mapping(dimension: int, method_name: str, **kwargs: Any) -> dict:
excludes_from_source = kwargs.get("excludes_from_source") excludes_from_source = kwargs.get("excludes_from_source", False)
analyzer = kwargs.get("analyzer", "ik_max_word") analyzer = kwargs.get("analyzer", "ik_max_word")
text_field = kwargs.get("text_field", Field.CONTENT_KEY.value) text_field = kwargs.get("text_field", Field.CONTENT_KEY.value)
engine = kwargs["engine"] engine = kwargs["engine"]
@ -356,12 +354,12 @@ def default_text_mapping(dimension: int, method_name: str, **kwargs: Any) -> dic
if excludes_from_source: if excludes_from_source:
# e.g. {"excludes": ["vector_field"]} # e.g. {"excludes": ["vector_field"]}
mapping["mappings"]["_source"] = {"excludes": excludes_from_source} mapping["mappings"]["_source"] = {"excludes": [vector_field]}
if using_ugc and method_name == "ivfpq": if using_ugc and method_name == "ivfpq":
mapping["settings"]["index"]["knn_routing"] = True mapping["settings"]["index"]["knn_routing"] = True
mapping["settings"]["index"]["knn.offline.construction"] = True mapping["settings"]["index"]["knn.offline.construction"] = True
elif using_ugc and method_name == "hnsw" or using_ugc and method_name == "flat": elif (using_ugc and method_name == "hnsw") or (using_ugc and method_name == "flat"):
mapping["settings"]["index"]["knn_routing"] = True mapping["settings"]["index"]["knn_routing"] = True
return mapping return mapping
@ -458,7 +456,7 @@ def default_vector_search_query(
"query": {"knn": {vector_field: {"vector": query_vector, "k": k}}}, "query": {"knn": {vector_field: {"vector": query_vector, "k": k}}},
} }
if filters is not None: if filters is not None and len(filters) > 0:
# when using filter, transform filter from List[Dict] to Dict as valid format # when using filter, transform filter from List[Dict] to Dict as valid format
filter_dict = {"bool": {"must": filters}} if len(filters) > 1 else filters[0] filter_dict = {"bool": {"must": filters}} if len(filters) > 1 else filters[0]
search_query["query"]["knn"][vector_field]["filter"] = filter_dict # filter should be Dict search_query["query"]["knn"][vector_field]["filter"] = filter_dict # filter should be Dict

View File

@ -231,8 +231,8 @@ class MilvusVector(BaseVector):
document_ids_filter = kwargs.get("document_ids_filter") document_ids_filter = kwargs.get("document_ids_filter")
filter = "" filter = ""
if document_ids_filter: if document_ids_filter:
document_ids = ", ".join(f"'{id}'" for id in document_ids_filter) document_ids = ", ".join(f'"{id}"' for id in document_ids_filter)
filter = f'metadata["document_id"] in ({document_ids})' filter = f'metadata["document_id"] in [{document_ids}]'
results = self._client.search( results = self._client.search(
collection_name=self._collection_name, collection_name=self._collection_name,
data=[query_vector], data=[query_vector],
@ -259,7 +259,7 @@ class MilvusVector(BaseVector):
filter = "" filter = ""
if document_ids_filter: if document_ids_filter:
document_ids = ", ".join(f"'{id}'" for id in document_ids_filter) document_ids = ", ".join(f"'{id}'" for id in document_ids_filter)
filter = f'metadata["document_id"] in ({document_ids})' filter = f'metadata["document_id"] in [{document_ids}]'
results = self._client.search( results = self._client.search(
collection_name=self._collection_name, collection_name=self._collection_name,
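The quoting change switches the Milvus expression to `in [...]` with double-quoted ids; a minimal sketch of the filter builder (the helper name is illustrative):

def build_document_id_expr(document_ids_filter: list[str] | None) -> str:
    # Returns an empty string when no filter is requested.
    if not document_ids_filter:
        return ""
    # Double quotes around each id avoid clashing with the single quotes
    # used by the surrounding expression string.
    document_ids = ", ".join(f'"{doc_id}"' for doc_id in document_ids_filter)
    return f'metadata["document_id"] in [{document_ids}]'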

View File

@ -31,6 +31,7 @@ class OceanBaseVectorConfig(BaseModel):
user: str user: str
password: str password: str
database: str database: str
enable_hybrid_search: bool = False
@model_validator(mode="before") @model_validator(mode="before")
@classmethod @classmethod
@ -57,6 +58,7 @@ class OceanBaseVector(BaseVector):
password=self._config.password, password=self._config.password,
db_name=self._config.database, db_name=self._config.database,
) )
self._hybrid_search_enabled = self._check_hybrid_search_support() # Check if hybrid search is supported
def get_type(self) -> str: def get_type(self) -> str:
return VectorType.OCEANBASE return VectorType.OCEANBASE
@ -98,6 +100,16 @@ class OceanBaseVector(BaseVector):
columns=cols, columns=cols,
vidxs=vidx_params, vidxs=vidx_params,
) )
try:
if self._hybrid_search_enabled:
self._client.perform_raw_text_sql(f"""ALTER TABLE {self._collection_name}
ADD FULLTEXT INDEX fulltext_index_for_col_text (text) WITH PARSER ik""")
except Exception as e:
raise Exception(
"Failed to add fulltext index to the target table, your OceanBase version must be 4.3.5.1 or above "
+ "to support fulltext index and vector index in the same table",
e,
)
vals = [] vals = []
params = self._client.perform_raw_text_sql("SHOW PARAMETERS LIKE '%ob_vector_memory_limit_percentage%'") params = self._client.perform_raw_text_sql("SHOW PARAMETERS LIKE '%ob_vector_memory_limit_percentage%'")
for row in params: for row in params:
@ -116,6 +128,27 @@ class OceanBaseVector(BaseVector):
) )
redis_client.set(collection_exist_cache_key, 1, ex=3600) redis_client.set(collection_exist_cache_key, 1, ex=3600)
def _check_hybrid_search_support(self) -> bool:
"""
Check if the current OceanBase version supports hybrid search.
Returns True if the version is >= 4.3.5.1, otherwise False.
"""
if not self._config.enable_hybrid_search:
return False
try:
from packaging import version
# return OceanBase_CE 4.3.5.1 (r101000042025031818-bxxxx) (Built Mar 18 2025 18:13:36)
result = self._client.perform_raw_text_sql("SELECT @@version_comment AS version")
ob_full_version = result.fetchone()[0]
ob_version = ob_full_version.split()[1]
logger.debug("Current OceanBase version is %s", ob_version)
return version.parse(ob_version).base_version >= version.parse("4.3.5.1").base_version
except Exception as e:
logger.warning(f"Failed to check OceanBase version: {str(e)}. Disabling hybrid search.")
return False
def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs): def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs):
ids = self._get_uuids(documents) ids = self._get_uuids(documents)
for id, doc, emb in zip(ids, documents, embeddings): for id, doc, emb in zip(ids, documents, embeddings):
@ -130,7 +163,7 @@ class OceanBaseVector(BaseVector):
) )
def text_exists(self, id: str) -> bool: def text_exists(self, id: str) -> bool:
cur = self._client.get(table_name=self._collection_name, id=id) cur = self._client.get(table_name=self._collection_name, ids=id)
return bool(cur.rowcount != 0) return bool(cur.rowcount != 0)
def delete_by_ids(self, ids: list[str]) -> None: def delete_by_ids(self, ids: list[str]) -> None:
@ -139,9 +172,12 @@ class OceanBaseVector(BaseVector):
self._client.delete(table_name=self._collection_name, ids=ids) self._client.delete(table_name=self._collection_name, ids=ids)
def get_ids_by_metadata_field(self, key: str, value: str) -> list[str]: def get_ids_by_metadata_field(self, key: str, value: str) -> list[str]:
from sqlalchemy import text
cur = self._client.get( cur = self._client.get(
table_name=self._collection_name, table_name=self._collection_name,
where_clause=f"metadata->>'$.{key}' = '{value}'", ids=None,
where_clause=[text(f"metadata->>'$.{key}' = '{value}'")],
output_column_name=["id"], output_column_name=["id"],
) )
return [row[0] for row in cur] return [row[0] for row in cur]
@ -151,19 +187,65 @@ class OceanBaseVector(BaseVector):
self.delete_by_ids(ids) self.delete_by_ids(ids)
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]: def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
if not self._hybrid_search_enabled:
return []
try:
top_k = kwargs.get("top_k", 5)
if not isinstance(top_k, int) or top_k <= 0:
raise ValueError("top_k must be a positive integer")
document_ids_filter = kwargs.get("document_ids_filter")
where_clause = ""
if document_ids_filter:
document_ids = ", ".join(f"'{id}'" for id in document_ids_filter)
where_clause = f" AND metadata->>'$.document_id' IN ({document_ids})"
full_sql = f"""SELECT metadata, text, MATCH (text) AGAINST (:query) AS score
FROM {self._collection_name}
WHERE MATCH (text) AGAINST (:query) > 0
{where_clause}
ORDER BY score DESC
LIMIT {top_k}"""
with self._client.engine.connect() as conn:
with conn.begin():
from sqlalchemy import text
result = conn.execute(text(full_sql), {"query": query})
rows = result.fetchall()
docs = []
for row in rows:
metadata_str, _text, score = row
try:
metadata = json.loads(metadata_str)
except json.JSONDecodeError:
print(f"Invalid JSON metadata: {metadata_str}")
metadata = {}
metadata["score"] = score
docs.append(Document(page_content=_text, metadata=metadata))
return docs
except Exception as e:
logger.warning(f"Failed to fulltext search: {str(e)}.")
return [] return []
def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]: def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]:
document_ids_filter = kwargs.get("document_ids_filter") document_ids_filter = kwargs.get("document_ids_filter")
where_clause = None _where_clause = None
if document_ids_filter: if document_ids_filter:
document_ids = ", ".join(f"'{id}'" for id in document_ids_filter) document_ids = ", ".join(f"'{id}'" for id in document_ids_filter)
where_clause = f"metadata->>'$.document_id' in ({document_ids})" where_clause = f"metadata->>'$.document_id' in ({document_ids})"
from sqlalchemy import text
_where_clause = [text(where_clause)]
ef_search = kwargs.get("ef_search", self._hnsw_ef_search) ef_search = kwargs.get("ef_search", self._hnsw_ef_search)
if ef_search != self._hnsw_ef_search: if ef_search != self._hnsw_ef_search:
self._client.set_ob_hnsw_ef_search(ef_search) self._client.set_ob_hnsw_ef_search(ef_search)
self._hnsw_ef_search = ef_search self._hnsw_ef_search = ef_search
topk = kwargs.get("top_k", 10) topk = kwargs.get("top_k", 10)
try:
cur = self._client.ann_search( cur = self._client.ann_search(
table_name=self._collection_name, table_name=self._collection_name,
vec_column_name="vector", vec_column_name="vector",
@ -172,15 +254,17 @@ class OceanBaseVector(BaseVector):
distance_func=func.l2_distance, distance_func=func.l2_distance,
output_column_names=["text", "metadata"], output_column_names=["text", "metadata"],
with_dist=True, with_dist=True,
where_clause=where_clause, where_clause=_where_clause,
) )
except Exception as e:
raise Exception("Failed to search by vector. ", e)
docs = [] docs = []
for text, metadata, distance in cur: for _text, metadata, distance in cur:
metadata = json.loads(metadata) metadata = json.loads(metadata)
metadata["score"] = 1 - distance / math.sqrt(2) metadata["score"] = 1 - distance / math.sqrt(2)
docs.append( docs.append(
Document( Document(
page_content=text, page_content=_text,
metadata=metadata, metadata=metadata,
) )
) )
@ -212,5 +296,6 @@ class OceanBaseVectorFactory(AbstractVectorFactory):
user=dify_config.OCEANBASE_VECTOR_USER or "", user=dify_config.OCEANBASE_VECTOR_USER or "",
password=(dify_config.OCEANBASE_VECTOR_PASSWORD or ""), password=(dify_config.OCEANBASE_VECTOR_PASSWORD or ""),
database=dify_config.OCEANBASE_VECTOR_DATABASE or "", database=dify_config.OCEANBASE_VECTOR_DATABASE or "",
enable_hybrid_search=dify_config.OCEANBASE_ENABLE_HYBRID_SEARCH or False,
), ),
) )
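The hybrid-search gate parses the server's version comment and requires at least 4.3.5.1; a standalone sketch of that check (comparing parsed versions directly, rather than their string forms, keeps the comparison numeric; the helper name is illustrative):

from packaging import version

def supports_hybrid_search(version_comment: str) -> bool:
    # version_comment looks like: "OceanBase_CE 4.3.5.1 (r10100...) (Built ...)"
    ob_version = version_comment.split()[1]
    return version.parse(ob_version) >= version.parse("4.3.5.1")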

View File

@ -25,6 +25,7 @@ class OpenGaussConfig(BaseModel):
database: str database: str
min_connection: int min_connection: int
max_connection: int max_connection: int
enable_pq: bool = False # Enable PQ acceleration
@model_validator(mode="before") @model_validator(mode="before")
@classmethod @classmethod
@ -57,6 +58,11 @@ CREATE TABLE IF NOT EXISTS {table_name} (
); );
""" """
SQL_CREATE_INDEX_PQ = """
CREATE INDEX IF NOT EXISTS embedding_{table_name}_pq_idx ON {table_name}
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64, enable_pq=on, pq_m={pq_m});
"""
SQL_CREATE_INDEX = """ SQL_CREATE_INDEX = """
CREATE INDEX IF NOT EXISTS embedding_cosine_{table_name}_idx ON {table_name} CREATE INDEX IF NOT EXISTS embedding_cosine_{table_name}_idx ON {table_name}
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64); USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
@ -68,6 +74,7 @@ class OpenGauss(BaseVector):
super().__init__(collection_name) super().__init__(collection_name)
self.pool = self._create_connection_pool(config) self.pool = self._create_connection_pool(config)
self.table_name = f"embedding_{collection_name}" self.table_name = f"embedding_{collection_name}"
self.pq_enabled = config.enable_pq
def get_type(self) -> str: def get_type(self) -> str:
return VectorType.OPENGAUSS return VectorType.OPENGAUSS
@ -97,7 +104,26 @@ class OpenGauss(BaseVector):
def create(self, texts: list[Document], embeddings: list[list[float]], **kwargs): def create(self, texts: list[Document], embeddings: list[list[float]], **kwargs):
dimension = len(embeddings[0]) dimension = len(embeddings[0])
self._create_collection(dimension) self._create_collection(dimension)
return self.add_texts(texts, embeddings) self.add_texts(texts, embeddings)
self._create_index(dimension)
def _create_index(self, dimension: int):
index_cache_key = f"vector_index_{self._collection_name}"
lock_name = f"{index_cache_key}_lock"
with redis_client.lock(lock_name, timeout=60):
index_exist_cache_key = f"vector_index_{self._collection_name}"
if redis_client.get(index_exist_cache_key):
return
with self._get_cursor() as cur:
if dimension <= 2000:
if self.pq_enabled:
cur.execute(SQL_CREATE_INDEX_PQ.format(table_name=self.table_name, pq_m=int(dimension / 4)))
cur.execute("SET hnsw_earlystop_threshold = 320")
if not self.pq_enabled:
cur.execute(SQL_CREATE_INDEX.format(table_name=self.table_name))
redis_client.set(index_exist_cache_key, 1, ex=3600)
def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs): def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs):
values = [] values = []
@ -151,7 +177,6 @@ class OpenGauss(BaseVector):
Search the nearest neighbors to a vector. Search the nearest neighbors to a vector.
:param query_vector: The input vector to search for similar items. :param query_vector: The input vector to search for similar items.
:param top_k: The number of nearest neighbors to return, default is 5.
:return: List of Documents that are nearest to the query vector. :return: List of Documents that are nearest to the query vector.
""" """
top_k = kwargs.get("top_k", 4) top_k = kwargs.get("top_k", 4)
@ -211,8 +236,6 @@ class OpenGauss(BaseVector):
with self._get_cursor() as cur: with self._get_cursor() as cur:
cur.execute(SQL_CREATE_TABLE.format(table_name=self.table_name, dimension=dimension)) cur.execute(SQL_CREATE_TABLE.format(table_name=self.table_name, dimension=dimension))
if dimension <= 2000:
cur.execute(SQL_CREATE_INDEX.format(table_name=self.table_name))
redis_client.set(collection_exist_cache_key, 1, ex=3600) redis_client.set(collection_exist_cache_key, 1, ex=3600)
@ -236,5 +259,6 @@ class OpenGaussFactory(AbstractVectorFactory):
database=dify_config.OPENGAUSS_DATABASE or "dify", database=dify_config.OPENGAUSS_DATABASE or "dify",
min_connection=dify_config.OPENGAUSS_MIN_CONNECTION, min_connection=dify_config.OPENGAUSS_MIN_CONNECTION,
max_connection=dify_config.OPENGAUSS_MAX_CONNECTION, max_connection=dify_config.OPENGAUSS_MAX_CONNECTION,
enable_pq=dify_config.OPENGAUSS_ENABLE_PQ or False,
), ),
) )
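Index creation now happens after the initial inserts, guarded by a Redis lock, and picks a PQ-enabled HNSW index when configured; a sketch of the SQL selection (the statements follow the diff, the helper is illustrative):

SQL_CREATE_INDEX_PQ = """
CREATE INDEX IF NOT EXISTS embedding_{table_name}_pq_idx ON {table_name}
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64, enable_pq=on, pq_m={pq_m});
"""

SQL_CREATE_INDEX = """
CREATE INDEX IF NOT EXISTS embedding_cosine_{table_name}_idx ON {table_name}
USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
"""

def opengauss_index_statement(table_name: str, dimension: int, pq_enabled: bool) -> str | None:
    # hnsw indexes are only built for dimensions up to 2000.
    if dimension > 2000:
        return None
    if pq_enabled:
        # The diff derives the pq_m subvector count from the embedding dimension.
        return SQL_CREATE_INDEX_PQ.format(table_name=table_name, pq_m=int(dimension / 4))
    return SQL_CREATE_INDEX.format(table_name=table_name)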

View File

@ -197,7 +197,6 @@ class OracleVector(BaseVector):
Search the nearest neighbors to a vector. Search the nearest neighbors to a vector.
:param query_vector: The input vector to search for similar items. :param query_vector: The input vector to search for similar items.
:param top_k: The number of nearest neighbors to return, default is 5.
:return: List of Documents that are nearest to the query vector. :return: List of Documents that are nearest to the query vector.
""" """
top_k = kwargs.get("top_k", 4) top_k = kwargs.get("top_k", 4)

View File

@ -167,7 +167,6 @@ class PGVector(BaseVector):
Search the nearest neighbors to a vector. Search the nearest neighbors to a vector.
:param query_vector: The input vector to search for similar items. :param query_vector: The input vector to search for similar items.
:param top_k: The number of nearest neighbors to return, default is 5.
:return: List of Documents that are nearest to the query vector. :return: List of Documents that are nearest to the query vector.
""" """
top_k = kwargs.get("top_k", 4) top_k = kwargs.get("top_k", 4)
@ -177,7 +176,7 @@ class PGVector(BaseVector):
where_clause = "" where_clause = ""
if document_ids_filter: if document_ids_filter:
document_ids = ", ".join(f"'{id}'" for id in document_ids_filter) document_ids = ", ".join(f"'{id}'" for id in document_ids_filter)
where_clause = f" WHERE metadata->>'document_id' in ({document_ids}) " where_clause = f" WHERE meta->>'document_id' in ({document_ids}) "
with self._get_cursor() as cur: with self._get_cursor() as cur:
cur.execute( cur.execute(
@ -205,7 +204,7 @@ class PGVector(BaseVector):
where_clause = "" where_clause = ""
if document_ids_filter: if document_ids_filter:
document_ids = ", ".join(f"'{id}'" for id in document_ids_filter) document_ids = ", ".join(f"'{id}'" for id in document_ids_filter)
where_clause = f" AND metadata->>'document_id' in ({document_ids}) " where_clause = f" AND meta->>'document_id' in ({document_ids}) "
if self.pg_bigm: if self.pg_bigm:
cur.execute("SET pg_bigm.similarity_limit TO 0.000001") cur.execute("SET pg_bigm.similarity_limit TO 0.000001")
cur.execute( cur.execute(
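Both fixes point the JSON operator at the table's `meta` column (the column the embeddings table actually uses); a sketch of the where-clause builder (the helper and its `prefix` parameter are illustrative):

def document_id_where_clause(document_ids_filter: list[str] | None, prefix: str = "WHERE") -> str:
    # Document metadata lives in a JSONB column named "meta", not "metadata".
    if not document_ids_filter:
        return ""
    document_ids = ", ".join(f"'{doc_id}'" for doc_id in document_ids_filter)
    return f" {prefix} meta->>'document_id' in ({document_ids}) "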

View File

@ -0,0 +1,295 @@
import json
import logging
from typing import Any, Optional
import tablestore # type: ignore
from pydantic import BaseModel, model_validator
from configs import dify_config
from core.rag.datasource.vdb.field import Field
from core.rag.datasource.vdb.vector_base import BaseVector
from core.rag.datasource.vdb.vector_factory import AbstractVectorFactory
from core.rag.datasource.vdb.vector_type import VectorType
from core.rag.embedding.embedding_base import Embeddings
from core.rag.models.document import Document
from extensions.ext_redis import redis_client
from models import Dataset
class TableStoreConfig(BaseModel):
access_key_id: Optional[str] = None
access_key_secret: Optional[str] = None
instance_name: Optional[str] = None
endpoint: Optional[str] = None
@model_validator(mode="before")
@classmethod
def validate_config(cls, values: dict) -> dict:
if not values["access_key_id"]:
raise ValueError("config ACCESS_KEY_ID is required")
if not values["access_key_secret"]:
raise ValueError("config ACCESS_KEY_SECRET is required")
if not values["instance_name"]:
raise ValueError("config INSTANCE_NAME is required")
if not values["endpoint"]:
raise ValueError("config ENDPOINT is required")
return values
class TableStoreVector(BaseVector):
def __init__(self, collection_name: str, config: TableStoreConfig):
super().__init__(collection_name)
self._config = config
self._tablestore_client = tablestore.OTSClient(
config.endpoint,
config.access_key_id,
config.access_key_secret,
config.instance_name,
)
self._table_name = f"{collection_name}"
self._index_name = f"{collection_name}_idx"
self._tags_field = f"{Field.METADATA_KEY.value}_tags"
def get_type(self) -> str:
return VectorType.TABLESTORE
def create(self, texts: list[Document], embeddings: list[list[float]], **kwargs):
dimension = len(embeddings[0])
self._create_collection(dimension)
self.add_texts(documents=texts, embeddings=embeddings, **kwargs)
def add_texts(self, documents: list[Document], embeddings: list[list[float]], **kwargs):
uuids = self._get_uuids(documents)
for i in range(len(documents)):
self._write_row(
primary_key=uuids[i],
attributes={
Field.CONTENT_KEY.value: documents[i].page_content,
Field.VECTOR.value: embeddings[i],
Field.METADATA_KEY.value: documents[i].metadata,
},
)
return uuids
def text_exists(self, id: str) -> bool:
_, return_row, _ = self._tablestore_client.get_row(
table_name=self._table_name, primary_key=[("id", id)], columns_to_get=["id"]
)
return return_row is not None
def delete_by_ids(self, ids: list[str]) -> None:
if not ids:
return
for id in ids:
self._delete_row(id=id)
def get_ids_by_metadata_field(self, key: str, value: str):
return self._search_by_metadata(key, value)
def delete_by_metadata_field(self, key: str, value: str) -> None:
ids = self.get_ids_by_metadata_field(key, value)
self.delete_by_ids(ids)
def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]:
top_k = kwargs.get("top_k", 4)
return self._search_by_vector(query_vector, top_k)
def search_by_full_text(self, query: str, **kwargs: Any) -> list[Document]:
return self._search_by_full_text(query)
def delete(self) -> None:
self._delete_table_if_exist()
def _create_collection(self, dimension: int):
lock_name = f"vector_indexing_lock_{self._collection_name}"
with redis_client.lock(lock_name, timeout=20):
collection_exist_cache_key = f"vector_indexing_{self._collection_name}"
if redis_client.get(collection_exist_cache_key):
logging.info(f"Collection {self._collection_name} already exists.")
return
self._create_table_if_not_exist()
self._create_search_index_if_not_exist(dimension)
redis_client.set(collection_exist_cache_key, 1, ex=3600)
def _create_table_if_not_exist(self) -> None:
table_list = self._tablestore_client.list_table()
if self._table_name in table_list:
logging.info("Tablestore system table[%s] already exists", self._table_name)
return None
schema_of_primary_key = [("id", "STRING")]
table_meta = tablestore.TableMeta(self._table_name, schema_of_primary_key)
table_options = tablestore.TableOptions()
reserved_throughput = tablestore.ReservedThroughput(tablestore.CapacityUnit(0, 0))
self._tablestore_client.create_table(table_meta, table_options, reserved_throughput)
logging.info("Tablestore create table[%s] successfully.", self._table_name)
def _create_search_index_if_not_exist(self, dimension: int) -> None:
search_index_list = self._tablestore_client.list_search_index(table_name=self._table_name)
if self._index_name in [t[1] for t in search_index_list]:
logging.info("Tablestore system index[%s] already exists", self._index_name)
return None
field_schemas = [
tablestore.FieldSchema(
Field.CONTENT_KEY.value,
tablestore.FieldType.TEXT,
analyzer=tablestore.AnalyzerType.MAXWORD,
index=True,
enable_sort_and_agg=False,
store=False,
),
tablestore.FieldSchema(
Field.VECTOR.value,
tablestore.FieldType.VECTOR,
vector_options=tablestore.VectorOptions(
data_type=tablestore.VectorDataType.VD_FLOAT_32,
dimension=dimension,
metric_type=tablestore.VectorMetricType.VM_COSINE,
),
),
tablestore.FieldSchema(
Field.METADATA_KEY.value,
tablestore.FieldType.KEYWORD,
index=True,
store=False,
),
tablestore.FieldSchema(
self._tags_field,
tablestore.FieldType.KEYWORD,
index=True,
store=False,
is_array=True,
),
]
index_meta = tablestore.SearchIndexMeta(field_schemas)
self._tablestore_client.create_search_index(self._table_name, self._index_name, index_meta)
logging.info("Tablestore create system index[%s] successfully.", self._index_name)
def _delete_table_if_exist(self):
search_index_list = self._tablestore_client.list_search_index(table_name=self._table_name)
for resp_tuple in search_index_list:
self._tablestore_client.delete_search_index(resp_tuple[0], resp_tuple[1])
logging.info("Tablestore delete index[%s] successfully.", self._index_name)
self._tablestore_client.delete_table(self._table_name)
logging.info("Tablestore delete system table[%s] successfully.", self._index_name)
def _delete_search_index(self) -> None:
self._tablestore_client.delete_search_index(self._table_name, self._index_name)
logging.info("Tablestore delete index[%s] successfully.", self._index_name)
def _write_row(self, primary_key: str, attributes: dict[str, Any]) -> None:
pk = [("id", primary_key)]
tags = []
for key, value in attributes[Field.METADATA_KEY.value].items():
tags.append(str(key) + "=" + str(value))
attribute_columns = [
(Field.CONTENT_KEY.value, attributes[Field.CONTENT_KEY.value]),
(Field.VECTOR.value, json.dumps(attributes[Field.VECTOR.value])),
(
Field.METADATA_KEY.value,
json.dumps(attributes[Field.METADATA_KEY.value]),
),
(self._tags_field, json.dumps(tags)),
]
row = tablestore.Row(pk, attribute_columns)
self._tablestore_client.put_row(self._table_name, row)
def _delete_row(self, id: str) -> None:
primary_key = [("id", id)]
row = tablestore.Row(primary_key)
self._tablestore_client.delete_row(self._table_name, row, None)
logging.info("Tablestore delete row successfully. id:%s", id)
def _search_by_metadata(self, key: str, value: str) -> list[str]:
query = tablestore.SearchQuery(
tablestore.TermQuery(self._tags_field, str(key) + "=" + str(value)),
limit=100,
get_total_count=False,
)
search_response = self._tablestore_client.search(
table_name=self._table_name,
index_name=self._index_name,
search_query=query,
columns_to_get=tablestore.ColumnsToGet(return_type=tablestore.ColumnReturnType.ALL_FROM_INDEX),
)
return [row[0][0][1] for row in search_response.rows]
def _search_by_vector(self, query_vector: list[float], top_k: int) -> list[Document]:
ots_query = tablestore.KnnVectorQuery(
field_name=Field.VECTOR.value,
top_k=top_k,
float32_query_vector=query_vector,
)
sort = tablestore.Sort(sorters=[tablestore.ScoreSort(sort_order=tablestore.SortOrder.DESC)])
search_query = tablestore.SearchQuery(ots_query, limit=top_k, get_total_count=False, sort=sort)
search_response = self._tablestore_client.search(
table_name=self._table_name,
index_name=self._index_name,
search_query=search_query,
columns_to_get=tablestore.ColumnsToGet(return_type=tablestore.ColumnReturnType.ALL_FROM_INDEX),
)
logging.info(
"Tablestore search successfully. request_id:%s",
search_response.request_id,
)
return self._to_query_result(search_response)
def _to_query_result(self, search_response: tablestore.SearchResponse) -> list[Document]:
documents = []
for row in search_response.rows:
documents.append(
Document(
page_content=row[1][2][1],
vector=json.loads(row[1][3][1]),
metadata=json.loads(row[1][0][1]),
)
)
return documents
def _search_by_full_text(self, query: str) -> list[Document]:
search_query = tablestore.SearchQuery(
query=tablestore.MatchQuery(text=query, field_name=Field.CONTENT_KEY.value),
sort=tablestore.Sort(sorters=[tablestore.ScoreSort(sort_order=tablestore.SortOrder.DESC)]),
limit=100,
)
search_response = self._tablestore_client.search(
table_name=self._table_name,
index_name=self._index_name,
search_query=search_query,
columns_to_get=tablestore.ColumnsToGet(return_type=tablestore.ColumnReturnType.ALL_FROM_INDEX),
)
return self._to_query_result(search_response)
class TableStoreVectorFactory(AbstractVectorFactory):
def init_vector(self, dataset: Dataset, attributes: list, embeddings: Embeddings) -> TableStoreVector:
if dataset.index_struct_dict:
class_prefix: str = dataset.index_struct_dict["vector_store"]["class_prefix"]
collection_name = class_prefix
else:
dataset_id = dataset.id
collection_name = Dataset.gen_collection_name_by_id(dataset_id)
dataset.index_struct = json.dumps(self.gen_index_struct_dict(VectorType.TABLESTORE, collection_name))
return TableStoreVector(
collection_name=collection_name,
config=TableStoreConfig(
endpoint=dify_config.TABLESTORE_ENDPOINT,
instance_name=dify_config.TABLESTORE_INSTANCE_NAME,
access_key_id=dify_config.TABLESTORE_ACCESS_KEY_ID,
access_key_secret=dify_config.TABLESTORE_ACCESS_KEY_SECRET,
),
)
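
The metadata filter above relies on flattening each metadata key/value pair into a "key=value" keyword tag, so an exact TermQuery on the tags field can stand in for structured filtering. A minimal sketch of that encoding and the matching query, reusing the tablestore SDK objects that appear in this file (the tags_field name is just a placeholder):

import tablestore


def encode_metadata_tags(metadata: dict) -> list[str]:
    # {"document_id": "42"} -> ["document_id=42"], matching what _write_row stores
    return [f"{key}={value}" for key, value in metadata.items()]


def build_metadata_term_query(tags_field: str, key: str, value: str) -> tablestore.SearchQuery:
    # exact match on one encoded tag, as in _search_by_metadata
    return tablestore.SearchQuery(
        tablestore.TermQuery(tags_field, f"{key}={value}"),
        limit=100,
        get_total_count=False,
    )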

View File

@ -1,8 +1,9 @@
import json import json
import math
from typing import Any, Optional from typing import Any, Optional
from pydantic import BaseModel from pydantic import BaseModel
from tcvectordb import VectorDBClient # type: ignore from tcvectordb import RPCVectorDBClient, VectorDBException # type: ignore
from tcvectordb.model import document, enum # type: ignore from tcvectordb.model import document, enum # type: ignore
from tcvectordb.model import index as vdb_index # type: ignore from tcvectordb.model import index as vdb_index # type: ignore
from tcvectordb.model.document import Filter # type: ignore from tcvectordb.model.document import Filter # type: ignore
@ -27,6 +28,7 @@ class TencentConfig(BaseModel):
metric_type: str = "L2" metric_type: str = "L2"
shard: int = 1 shard: int = 1
replicas: int = 2 replicas: int = 2
max_upsert_batch_size: int = 128
def to_tencent_params(self): def to_tencent_params(self):
return {"url": self.url, "username": self.username, "key": self.api_key, "timeout": self.timeout} return {"url": self.url, "username": self.username, "key": self.api_key, "timeout": self.timeout}
@ -41,19 +43,10 @@ class TencentVector(BaseVector):
def __init__(self, collection_name: str, config: TencentConfig): def __init__(self, collection_name: str, config: TencentConfig):
super().__init__(collection_name) super().__init__(collection_name)
self._client_config = config self._client_config = config
self._client = VectorDBClient(**self._client_config.to_tencent_params()) self._client = RPCVectorDBClient(**self._client_config.to_tencent_params())
self._db = self._init_database()
def _init_database(self): def _init_database(self):
exists = False return self._client.create_database_if_not_exists(database_name=self._client_config.database)
for db in self._client.list_databases():
if db.database_name == self._client_config.database:
exists = True
break
if exists:
return self._client.database(self._client_config.database)
else:
return self._client.create_database(database_name=self._client_config.database)
def get_type(self) -> str: def get_type(self) -> str:
return VectorType.TENCENT return VectorType.TENCENT
@ -62,8 +55,11 @@ class TencentVector(BaseVector):
return {"type": self.get_type(), "vector_store": {"class_prefix": self._collection_name}} return {"type": self.get_type(), "vector_store": {"class_prefix": self._collection_name}}
def _has_collection(self) -> bool: def _has_collection(self) -> bool:
collections = self._db.list_collections() return bool(
return any(collection.collection_name == self._collection_name for collection in collections) self._client.exists_collection(
database_name=self._client_config.database, collection_name=self.collection_name
)
)
def _create_collection(self, dimension: int) -> None: def _create_collection(self, dimension: int) -> None:
lock_name = "vector_indexing_lock_{}".format(self._collection_name) lock_name = "vector_indexing_lock_{}".format(self._collection_name)
@ -75,7 +71,6 @@ class TencentVector(BaseVector):
if self._has_collection(): if self._has_collection():
return return
self.delete()
index_type = None index_type = None
for k, v in enum.IndexType.__members__.items(): for k, v in enum.IndexType.__members__.items():
if k == self._client_config.index_type: if k == self._client_config.index_type:
@ -89,6 +84,31 @@ class TencentVector(BaseVector):
if metric_type is None: if metric_type is None:
raise ValueError("unsupported metric_type") raise ValueError("unsupported metric_type")
params = vdb_index.HNSWParams(m=16, efconstruction=200) params = vdb_index.HNSWParams(m=16, efconstruction=200)
index = vdb_index.Index(
vdb_index.FilterIndex(self.field_id, enum.FieldType.String, enum.IndexType.PRIMARY_KEY),
vdb_index.VectorIndex(
self.field_vector,
dimension,
index_type,
metric_type,
params,
),
vdb_index.FilterIndex(self.field_text, enum.FieldType.String, enum.IndexType.FILTER),
vdb_index.FilterIndex(self.field_metadata, enum.FieldType.Json, enum.IndexType.FILTER),
)
try:
self._client.create_collection(
database_name=self._client_config.database,
collection_name=self._collection_name,
shard=self._client_config.shard,
replicas=self._client_config.replicas,
description="Collection for Dify",
index=index,
)
except VectorDBException as e:
if "fieldType:json" not in e.message:
raise e
# vdb version not support json, use string
index = vdb_index.Index( index = vdb_index.Index(
vdb_index.FilterIndex(self.field_id, enum.FieldType.String, enum.IndexType.PRIMARY_KEY), vdb_index.FilterIndex(self.field_id, enum.FieldType.String, enum.IndexType.PRIMARY_KEY),
vdb_index.VectorIndex( vdb_index.VectorIndex(
@ -101,9 +121,9 @@ class TencentVector(BaseVector):
vdb_index.FilterIndex(self.field_text, enum.FieldType.String, enum.IndexType.FILTER), vdb_index.FilterIndex(self.field_text, enum.FieldType.String, enum.IndexType.FILTER),
vdb_index.FilterIndex(self.field_metadata, enum.FieldType.String, enum.IndexType.FILTER), vdb_index.FilterIndex(self.field_metadata, enum.FieldType.String, enum.IndexType.FILTER),
) )
self._client.create_collection(
self._db.create_collection( database_name=self._client_config.database,
name=self._collection_name, collection_name=self._collection_name,
shard=self._client_config.shard, shard=self._client_config.shard,
replicas=self._client_config.replicas, replicas=self._client_config.replicas,
description="Collection for Dify", description="Collection for Dify",
@ -119,8 +139,13 @@ class TencentVector(BaseVector):
texts = [doc.page_content for doc in documents] texts = [doc.page_content for doc in documents]
metadatas = [doc.metadata for doc in documents] metadatas = [doc.metadata for doc in documents]
total_count = len(embeddings) total_count = len(embeddings)
batch_size = self._client_config.max_upsert_batch_size
batch = math.ceil(total_count / batch_size)
for j in range(batch):
docs = [] docs = []
for i in range(0, total_count): start_idx = j * batch_size
end_idx = min(total_count, (j + 1) * batch_size)
for i in range(start_idx, end_idx):
if metadatas is None: if metadatas is None:
continue continue
metadata = metadatas[i] or {} metadata = metadatas[i] or {}
@ -128,13 +153,20 @@ class TencentVector(BaseVector):
id=metadata.get("doc_id"), id=metadata.get("doc_id"),
vector=embeddings[i], vector=embeddings[i],
text=texts[i], text=texts[i],
metadata=json.dumps(metadata), metadata=metadata,
) )
docs.append(doc) docs.append(doc)
self._db.collection(self._collection_name).upsert(docs, self._client_config.timeout) self._client.upsert(
database_name=self._client_config.database,
collection_name=self.collection_name,
documents=docs,
timeout=self._client_config.timeout,
)
def text_exists(self, id: str) -> bool: def text_exists(self, id: str) -> bool:
docs = self._db.collection(self._collection_name).query(document_ids=[id]) docs = self._client.query(
database_name=self._client_config.database, collection_name=self.collection_name, document_ids=[id]
)
if docs and len(docs) > 0: if docs and len(docs) > 0:
return True return True
return False return False
@ -142,17 +174,25 @@ class TencentVector(BaseVector):
def delete_by_ids(self, ids: list[str]) -> None: def delete_by_ids(self, ids: list[str]) -> None:
if not ids: if not ids:
return return
self._db.collection(self._collection_name).delete(document_ids=ids) self._client.delete(
database_name=self._client_config.database, collection_name=self.collection_name, document_ids=ids
)
def delete_by_metadata_field(self, key: str, value: str) -> None: def delete_by_metadata_field(self, key: str, value: str) -> None:
self._db.collection(self._collection_name).delete(filter=Filter(Filter.In(f"metadata.{key}", [value]))) self._client.delete(
database_name=self._client_config.database,
collection_name=self.collection_name,
filter=Filter(Filter.In(f"metadata.{key}", [value])),
)
def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]: def search_by_vector(self, query_vector: list[float], **kwargs: Any) -> list[Document]:
document_ids_filter = kwargs.get("document_ids_filter") document_ids_filter = kwargs.get("document_ids_filter")
filter = None filter = None
if document_ids_filter: if document_ids_filter:
filter = Filter(Filter.In("metadata.document_id", document_ids_filter)) filter = Filter(Filter.In("metadata.document_id", document_ids_filter))
res = self._db.collection(self._collection_name).search( res = self._client.search(
database_name=self._client_config.database,
collection_name=self.collection_name,
vectors=[query_vector], vectors=[query_vector],
filter=filter, filter=filter,
params=document.HNSWSearchParams(ef=kwargs.get("ef", 10)), params=document.HNSWSearchParams(ef=kwargs.get("ef", 10)),
@ -173,8 +213,6 @@ class TencentVector(BaseVector):
for result in res[0]: for result in res[0]:
meta = result.get(self.field_metadata) meta = result.get(self.field_metadata)
if meta is not None:
meta = json.loads(meta)
score = 1 - result.get("score", 0.0) score = 1 - result.get("score", 0.0)
if score > score_threshold: if score > score_threshold:
meta["score"] = score meta["score"] = score
@ -184,7 +222,7 @@ class TencentVector(BaseVector):
return docs return docs
def delete(self) -> None: def delete(self) -> None:
self._db.drop_collection(name=self._collection_name) self._client.drop_collection(database_name=self._client_config.database, collection_name=self.collection_name)
class TencentVectorFactory(AbstractVectorFactory): class TencentVectorFactory(AbstractVectorFactory):
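
The upsert path above now writes documents in chunks of max_upsert_batch_size instead of one oversized request. A minimal sketch of that batching, with a generic upsert callable standing in for the RPC client call:

import math
from collections.abc import Callable, Sequence
from typing import Any


def upsert_in_batches(documents: Sequence[Any], upsert: Callable[[list[Any]], None], batch_size: int = 128) -> None:
    total_count = len(documents)
    for batch_index in range(math.ceil(total_count / batch_size)):
        start_idx = batch_index * batch_size
        end_idx = min(total_count, (batch_index + 1) * batch_size)
        upsert(list(documents[start_idx:end_idx]))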

View File

@ -22,7 +22,6 @@ class TidbService:
:param iam_url: The URL of the TiDB Cloud IAM API (required). :param iam_url: The URL of the TiDB Cloud IAM API (required).
:param public_key: The public key for the API (required). :param public_key: The public key for the API (required).
:param private_key: The private key for the API (required). :param private_key: The private key for the API (required).
:param display_name: The user-friendly display name of the cluster (required).
:param region: The region where the cluster will be created (required). :param region: The region where the cluster will be created (required).
:return: The response from the API. :return: The response from the API.
@ -149,13 +148,12 @@ class TidbService:
): ):
""" """
Update the status of a new TiDB Serverless cluster. Update the status of a new TiDB Serverless cluster.
:param tidb_serverless_list: The TiDB serverless list (required).
:param project_id: The project ID of the TiDB Cloud project (required). :param project_id: The project ID of the TiDB Cloud project (required).
:param api_url: The URL of the TiDB Cloud API (required). :param api_url: The URL of the TiDB Cloud API (required).
:param iam_url: The URL of the TiDB Cloud IAM API (required). :param iam_url: The URL of the TiDB Cloud IAM API (required).
:param public_key: The public key for the API (required). :param public_key: The public key for the API (required).
:param private_key: The private key for the API (required). :param private_key: The private key for the API (required).
:param display_name: The user-friendly display name of the cluster (required).
:param region: The region where the cluster will be created (required).
:return: The response from the API. :return: The response from the API.
""" """
@ -186,12 +184,12 @@ class TidbService:
) -> list[dict]: ) -> list[dict]:
""" """
Creates a new TiDB Serverless cluster. Creates a new TiDB Serverless cluster.
:param batch_size: The batch size (required).
:param project_id: The project ID of the TiDB Cloud project (required). :param project_id: The project ID of the TiDB Cloud project (required).
:param api_url: The URL of the TiDB Cloud API (required). :param api_url: The URL of the TiDB Cloud API (required).
:param iam_url: The URL of the TiDB Cloud IAM API (required). :param iam_url: The URL of the TiDB Cloud IAM API (required).
:param public_key: The public key for the API (required). :param public_key: The public key for the API (required).
:param private_key: The private key for the API (required). :param private_key: The private key for the API (required).
:param display_name: The user-friendly display name of the cluster (required).
:param region: The region where the cluster will be created (required). :param region: The region where the cluster will be created (required).
:return: The response from the API. :return: The response from the API.

View File

@ -152,6 +152,10 @@ class Vector:
from core.rag.datasource.vdb.opengauss.opengauss import OpenGaussFactory from core.rag.datasource.vdb.opengauss.opengauss import OpenGaussFactory
return OpenGaussFactory return OpenGaussFactory
case VectorType.TABLESTORE:
from core.rag.datasource.vdb.tablestore.tablestore_vector import TableStoreVectorFactory
return TableStoreVectorFactory
case _: case _:
raise ValueError(f"Vector store {vector_type} is not supported.") raise ValueError(f"Vector store {vector_type} is not supported.")

View File

@ -25,3 +25,4 @@ class VectorType(StrEnum):
TIDB_ON_QDRANT = "tidb_on_qdrant" TIDB_ON_QDRANT = "tidb_on_qdrant"
OCEANBASE = "oceanbase" OCEANBASE = "oceanbase"
OPENGAUSS = "opengauss" OPENGAUSS = "opengauss"
TABLESTORE = "tablestore"

View File

@ -226,7 +226,6 @@ class WeaviateVector(BaseVector):
Args: Args:
query: Text to look up documents similar to. query: Text to look up documents similar to.
k: Number of Documents to return. Defaults to 4.
Returns: Returns:
List of Documents most similar to the query. List of Documents most similar to the query.

View File

@ -7,11 +7,10 @@ class FirecrawlWebExtractor(BaseExtractor):
""" """
Crawl and scrape websites and return content in clean llm-ready markdown. Crawl and scrape websites and return content in clean llm-ready markdown.
Args: Args:
url: The URL to scrape. url: The URL to scrape.
api_key: The API key for Firecrawl. job_id: The crawl job id.
base_url: The base URL for the Firecrawl API. Defaults to 'https://api.firecrawl.dev'. tenant_id: The tenant id.
mode: The mode of operation. Defaults to 'scrape'. Options are 'crawl', 'scrape' and 'crawl_return_urls'. mode: The mode of operation. Defaults to 'scrape'. Options are 'crawl', 'scrape' and 'crawl_return_urls'.
only_main_content: Only return the main content of the page excluding headers, navs, footers, etc. only_main_content: Only return the main content of the page excluding headers, navs, footers, etc.
""" """

View File

@ -1,6 +1,8 @@
import logging import logging
from typing import Optional from typing import Optional
import pypandoc # type: ignore
from core.rag.extractor.extractor_base import BaseExtractor from core.rag.extractor.extractor_base import BaseExtractor
from core.rag.models.document import Document from core.rag.models.document import Document
@ -34,6 +36,7 @@ class UnstructuredEpubExtractor(BaseExtractor):
else: else:
from unstructured.partition.epub import partition_epub from unstructured.partition.epub import partition_epub
pypandoc.download_pandoc()
elements = partition_epub(filename=self._file_path, xml_keep_tags=True) elements = partition_epub(filename=self._file_path, xml_keep_tags=True)
from unstructured.chunking.title import chunk_by_title from unstructured.chunking.title import chunk_by_title

View File

@ -14,15 +14,6 @@ class UnstructuredMarkdownExtractor(BaseExtractor):
Args: Args:
file_path: Path to the file to load. file_path: Path to the file to load.
remove_hyperlinks: Whether to remove hyperlinks from the text.
remove_images: Whether to remove images from the text.
encoding: File encoding to use. If `None`, the file will be loaded
with the default system encoding.
autodetect_encoding: Whether to try to autodetect the file encoding
if the specified encoding fails.
""" """
def __init__(self, file_path: str, api_url: Optional[str] = None, api_key: str = ""): def __init__(self, file_path: str, api_url: Optional[str] = None, api_key: str = ""):

View File

@ -1,7 +1,7 @@
from enum import Enum from enum import Enum, StrEnum
class BuiltInField(str, Enum): class BuiltInField(StrEnum):
document_name = "document_name" document_name = "document_name"
uploader = "uploader" uploader = "uploader"
upload_date = "upload_date" upload_date = "upload_date"

View File

@ -1,7 +1,7 @@
from enum import Enum from enum import StrEnum
class IndexType(str, Enum): class IndexType(StrEnum):
PARAGRAPH_INDEX = "text_model" PARAGRAPH_INDEX = "text_model"
QA_INDEX = "qa_model" QA_INDEX = "qa_model"
PARENT_CHILD_INDEX = "hierarchical_model" PARENT_CHILD_INDEX = "hierarchical_model"
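
Both enums above move from the str/Enum mixin to StrEnum (Python 3.11+), whose members render as their plain values. A small illustration of the difference that motivates the switch; the legacy class here exists only for comparison:

from enum import Enum, StrEnum


class IndexType(StrEnum):
    PARAGRAPH_INDEX = "text_model"


class LegacyIndexType(str, Enum):
    PARAGRAPH_INDEX = "text_model"


assert str(IndexType.PARAGRAPH_INDEX) == "text_model"          # plain value
assert IndexType.PARAGRAPH_INDEX == "text_model"               # still comparable as a string
assert str(LegacyIndexType.PARAGRAPH_INDEX) == "LegacyIndexType.PARAGRAPH_INDEX"  # qualified name leaks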

View File

@ -39,6 +39,8 @@ class ParentChildIndexProcessor(BaseIndexProcessor):
all_documents = [] # type: ignore all_documents = [] # type: ignore
if rules.parent_mode == ParentMode.PARAGRAPH: if rules.parent_mode == ParentMode.PARAGRAPH:
# Split the text documents into nodes. # Split the text documents into nodes.
if not rules.segmentation:
raise ValueError("No segmentation found in rules.")
splitter = self._get_splitter( splitter = self._get_splitter(
processing_rule_mode=process_rule.get("mode"), processing_rule_mode=process_rule.get("mode"),
max_tokens=rules.segmentation.max_tokens, max_tokens=rules.segmentation.max_tokens,

View File

@ -100,6 +100,7 @@ class DatasetRetrieval:
:param hit_callback: hit callback :param hit_callback: hit callback
:param message_id: message id :param message_id: message id
:param memory: memory :param memory: memory
:param inputs: inputs
:return: :return:
""" """
dataset_ids = config.dataset_ids dataset_ids = config.dataset_ids
@ -610,7 +611,11 @@ class DatasetRetrieval:
if dataset.indexing_technique == "economy": if dataset.indexing_technique == "economy":
# use keyword table query # use keyword table query
documents = RetrievalService.retrieve( documents = RetrievalService.retrieve(
retrieval_method="keyword_search", dataset_id=dataset.id, query=query, top_k=top_k retrieval_method="keyword_search",
dataset_id=dataset.id,
query=query,
top_k=top_k,
document_ids_filter=document_ids_filter,
) )
if documents: if documents:
all_documents.extend(documents) all_documents.extend(documents)
@ -730,6 +735,7 @@ class DatasetRetrieval:
Calculate keywords scores Calculate keywords scores
:param query: search query :param query: search query
:param documents: documents for reranking :param documents: documents for reranking
:param top_k: top k
:return: :return:
""" """
@ -846,8 +852,9 @@ class DatasetRetrieval:
) )
if automatic_metadata_filters: if automatic_metadata_filters:
conditions = [] conditions = []
for filter in automatic_metadata_filters: for sequence, filter in enumerate(automatic_metadata_filters):
self._process_metadata_filter_func( self._process_metadata_filter_func(
sequence,
filter.get("condition"), # type: ignore filter.get("condition"), # type: ignore
filter.get("metadata_name"), # type: ignore filter.get("metadata_name"), # type: ignore
filter.get("value"), filter.get("value"),
@ -867,14 +874,18 @@ class DatasetRetrieval:
elif metadata_filtering_mode == "manual": elif metadata_filtering_mode == "manual":
if metadata_filtering_conditions: if metadata_filtering_conditions:
metadata_condition = MetadataCondition(**metadata_filtering_conditions.model_dump()) metadata_condition = MetadataCondition(**metadata_filtering_conditions.model_dump())
for condition in metadata_filtering_conditions.conditions: # type: ignore for sequence, condition in enumerate(metadata_filtering_conditions.conditions): # type: ignore
metadata_name = condition.name metadata_name = condition.name
expected_value = condition.value expected_value = condition.value
if expected_value is not None or condition.comparison_operator in ("empty", "not empty"): if expected_value is not None or condition.comparison_operator in ("empty", "not empty"):
if isinstance(expected_value, str): if isinstance(expected_value, str):
expected_value = self._replace_metadata_filter_value(expected_value, inputs) expected_value = self._replace_metadata_filter_value(expected_value, inputs)
filters = self._process_metadata_filter_func( filters = self._process_metadata_filter_func(
condition.comparison_operator, metadata_name, expected_value, filters sequence,
condition.comparison_operator,
metadata_name,
expected_value,
filters,
) )
else: else:
raise ValueError("Invalid metadata filtering mode") raise ValueError("Invalid metadata filtering mode")
@ -896,7 +907,10 @@ class DatasetRetrieval:
return str(inputs.get(key, f"{{{{{key}}}}}")) return str(inputs.get(key, f"{{{{{key}}}}}"))
pattern = re.compile(r"\{\{(\w+)\}\}") pattern = re.compile(r"\{\{(\w+)\}\}")
return pattern.sub(replacer, text) output = pattern.sub(replacer, text)
if isinstance(output, str):
output = re.sub(r"[\r\n\t]+", " ", output).strip()
return output
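
The template substitution above now also collapses stray newlines and tabs before the value reaches a metadata filter. A small, self-contained version of the same behaviour (unknown placeholders are deliberately left intact):

import re


def replace_metadata_filter_value(text_value: str, inputs: dict) -> str:
    def replacer(match: re.Match) -> str:
        key = match.group(1)
        return str(inputs.get(key, f"{{{{{key}}}}}"))

    output = re.compile(r"\{\{(\w+)\}\}").sub(replacer, text_value)
    return re.sub(r"[\r\n\t]+", " ", output).strip()


assert replace_metadata_filter_value("{{author}}\n", {"author": "alice"}) == "alice"
assert replace_metadata_filter_value("{{missing}}", {}) == "{{missing}}"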
def _automatic_metadata_filter_func( def _automatic_metadata_filter_func(
self, dataset_ids: list, query: str, tenant_id: str, user_id: str, metadata_model_config: ModelConfig self, dataset_ids: list, query: str, tenant_id: str, user_id: str, metadata_model_config: ModelConfig
@ -953,26 +967,36 @@ class DatasetRetrieval:
return None return None
return automatic_metadata_filters return automatic_metadata_filters
def _process_metadata_filter_func(self, condition: str, metadata_name: str, value: Optional[Any], filters: list): def _process_metadata_filter_func(
self, sequence: int, condition: str, metadata_name: str, value: Optional[Any], filters: list
):
key = f"{metadata_name}_{sequence}"
key_value = f"{metadata_name}_{sequence}_value"
match condition: match condition:
case "contains": case "contains":
filters.append( filters.append(
(text("documents.doc_metadata ->> :key LIKE :value")).params(key=metadata_name, value=f"%{value}%") (text(f"documents.doc_metadata ->> :{key} LIKE :{key_value}")).params(
**{key: metadata_name, key_value: f"%{value}%"}
)
) )
case "not contains": case "not contains":
filters.append( filters.append(
(text("documents.doc_metadata ->> :key NOT LIKE :value")).params( (text(f"documents.doc_metadata ->> :{key} NOT LIKE :{key_value}")).params(
key=metadata_name, value=f"%{value}%" **{key: metadata_name, key_value: f"%{value}%"}
) )
) )
case "start with": case "start with":
filters.append( filters.append(
(text("documents.doc_metadata ->> :key LIKE :value")).params(key=metadata_name, value=f"{value}%") (text(f"documents.doc_metadata ->> :{key} LIKE :{key_value}")).params(
**{key: metadata_name, key_value: f"{value}%"}
)
) )
case "end with": case "end with":
filters.append( filters.append(
(text("documents.doc_metadata ->> :key LIKE :value")).params(key=metadata_name, value=f"%{value}") (text(f"documents.doc_metadata ->> :{key} LIKE :{key_value}")).params(
**{key: metadata_name, key_value: f"%{value}"}
)
) )
case "is" | "=": case "is" | "=":
if isinstance(value, str): if isinstance(value, str):
@ -996,7 +1020,7 @@ class DatasetRetrieval:
filters.append(sqlalchemy_cast(DatasetDocument.doc_metadata[metadata_name].astext, Integer) < value) filters.append(sqlalchemy_cast(DatasetDocument.doc_metadata[metadata_name].astext, Integer) < value)
case "after" | ">": case "after" | ">":
filters.append(sqlalchemy_cast(DatasetDocument.doc_metadata[metadata_name].astext, Integer) > value) filters.append(sqlalchemy_cast(DatasetDocument.doc_metadata[metadata_name].astext, Integer) > value)
case "" | ">=": case "" | "<=":
filters.append(sqlalchemy_cast(DatasetDocument.doc_metadata[metadata_name].astext, Integer) <= value) filters.append(sqlalchemy_cast(DatasetDocument.doc_metadata[metadata_name].astext, Integer) <= value)
case "" | ">=": case "" | ">=":
filters.append(sqlalchemy_cast(DatasetDocument.doc_metadata[metadata_name].astext, Integer) >= value) filters.append(sqlalchemy_cast(DatasetDocument.doc_metadata[metadata_name].astext, Integer) >= value)
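
The filter construction above now derives a unique bind-parameter name from the metadata field plus the condition's position, so two conditions on the same field no longer overwrite each other's bound values. A minimal sketch that mirrors the text().params() usage from this file; combining the resulting clauses with and_()/or_() is left to the caller, as in the source:

from sqlalchemy import text


def add_contains_filter(filters: list, sequence: int, metadata_name: str, value: str) -> None:
    key = f"{metadata_name}_{sequence}"
    key_value = f"{metadata_name}_{sequence}_value"
    filters.append(
        text(f"documents.doc_metadata ->> :{key} LIKE :{key_value}").params(
            **{key: metadata_name, key_value: f"%{value}%"}
        )
    )


filters: list = []
add_contains_filter(filters, 0, "author", "alice")
add_contains_filter(filters, 1, "author", "bob")  # binds author_1 / author_1_value, distinct from author_0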
@ -1009,8 +1033,6 @@ class DatasetRetrieval:
) -> tuple[ModelInstance, ModelConfigWithCredentialsEntity]: ) -> tuple[ModelInstance, ModelConfigWithCredentialsEntity]:
""" """
Fetch model config Fetch model config
:param node_data: node data
:return:
""" """
if model is None: if model is None:
raise ValueError("single_retrieval_config is required") raise ValueError("single_retrieval_config is required")

View File

@ -235,6 +235,7 @@ class ReactMultiDatasetRouter:
tools: List of tools the agent will have access to, used to format the tools: List of tools the agent will have access to, used to format the
prompt. prompt.
prefix: String to put before the list of tools. prefix: String to put before the list of tools.
format_instructions: The format instruction prompt.
Returns: Returns:
A PromptTemplate with the template assembled from the pieces here. A PromptTemplate with the template assembled from the pieces here.
""" """

View File

@ -29,9 +29,7 @@ class Tool(ABC):
def fork_tool_runtime(self, runtime: ToolRuntime) -> "Tool": def fork_tool_runtime(self, runtime: ToolRuntime) -> "Tool":
""" """
fork a new tool with meta data fork a new tool with metadata
:param meta: the meta data of a tool call processing, tenant_id is required
:return: the new tool :return: the new tool
""" """
return self.__class__( return self.__class__(
@ -206,6 +204,7 @@ class Tool(ABC):
create a blob message create a blob message
:param blob: the blob :param blob: the blob
:param meta: the meta info of blob object
:return: the blob message :return: the blob message
""" """
return ToolInvokeMessage( return ToolInvokeMessage(

View File

@ -35,7 +35,7 @@ class BuiltinToolProviderController(ToolProviderController):
provider_yaml["credentials_for_provider"][credential_name]["name"] = credential_name provider_yaml["credentials_for_provider"][credential_name]["name"] = credential_name
credentials_schema = [] credentials_schema = []
for credential in provider_yaml.get("credentials_for_provider", {}): for credential in provider_yaml.get("credentials_for_provider", {}).values():
credentials_schema.append(credential) credentials_schema.append(credential)
super().__init__( super().__init__(
@ -153,7 +153,7 @@ class BuiltinToolProviderController(ToolProviderController):
""" """
validate the credentials of the provider validate the credentials of the provider
:param tool_name: the name of the tool, defined in `get_tools` :param user_id: use id
:param credentials: the credentials of the tool :param credentials: the credentials of the tool
""" """
# validate credentials format # validate credentials format
@ -167,7 +167,7 @@ class BuiltinToolProviderController(ToolProviderController):
""" """
validate the credentials of the provider validate the credentials of the provider
:param tool_name: the name of the tool, defined in `get_tools` :param user_id: use id
:param credentials: the credentials of the tool :param credentials: the credentials of the tool
""" """
pass pass
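
The loop above previously iterated the credentials_for_provider dict directly, which yields only the keys; the fix iterates .values() so the schema objects themselves are collected. A tiny illustration:

credentials_for_provider = {"api_key": {"name": "api_key", "type": "secret-input"}}

assert [c for c in credentials_for_provider] == ["api_key"]                          # keys only
assert [c for c in credentials_for_provider.values()][0]["type"] == "secret-input"   # full schemas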

View File

@ -12,4 +12,4 @@ identity:
icon: icon.svg icon: icon.svg
tags: tags:
- productivity - productivity
credentials_for_provider: [] credentials_for_provider: {}

View File

@ -28,9 +28,7 @@ class BuiltinTool(Tool):
def fork_tool_runtime(self, runtime: ToolRuntime) -> "BuiltinTool": def fork_tool_runtime(self, runtime: ToolRuntime) -> "BuiltinTool":
""" """
fork a new tool with meta data fork a new tool with metadata
:param meta: the meta data of a tool call processing, tenant_id is required
:return: the new tool :return: the new tool
""" """
return self.__class__( return self.__class__(
@ -43,7 +41,7 @@ class BuiltinTool(Tool):
""" """
invoke model invoke model
:param model_config: the model config :param user_id: the user id
:param prompt_messages: the prompt messages :param prompt_messages: the prompt messages
:param stop: the stop words :param stop: the stop words
:return: the model result :return: the model result
@ -64,7 +62,6 @@ class BuiltinTool(Tool):
""" """
get max tokens get max tokens
:param model_config: the model config
:return: the max tokens :return: the max tokens
""" """
if self.runtime is None: if self.runtime is None:

View File

@ -145,7 +145,6 @@ class ApiToolProviderController(ToolProviderController):
""" """
fetch tools from database fetch tools from database
:param user_id: the user id
:param tenant_id: the tenant id :param tenant_id: the tenant id
:return: the tools :return: the tools
""" """

View File

@ -35,9 +35,7 @@ class ApiTool(Tool):
def fork_tool_runtime(self, runtime: ToolRuntime): def fork_tool_runtime(self, runtime: ToolRuntime):
""" """
fork a new tool with meta data fork a new tool with metadata
:param meta: the meta data of a tool call processing, tenant_id is required
:return: the new tool :return: the new tool
""" """
if self.api_bundle is None: if self.api_bundle is None:
@ -195,7 +193,12 @@ class ApiTool(Tool):
properties = body_schema.get("properties", {}) properties = body_schema.get("properties", {})
for name, property in properties.items(): for name, property in properties.items():
if name in parameters: if name in parameters:
if property.get("format") == "binary": # multiple file upload: if the type is array and the items have format as binary
if property.get("type") == "array" and property.get("items", {}).get("format") == "binary":
# parameters[name] should be a list of file objects.
for f in parameters[name]:
files.append((name, (f.filename, download(f), f.mime_type)))
elif property.get("format") == "binary":
f = parameters[name] f = parameters[name]
files.append((name, (f.filename, download(f), f.mime_type))) files.append((name, (f.filename, download(f), f.mime_type)))
elif "$ref" in property: elif "$ref" in property:
@ -226,6 +229,13 @@ class ApiTool(Tool):
else: else:
body = body body = body
# if there is a file upload, remove the Content-Type header
# so that httpx can automatically generate the boundary header required for multipart/form-data.
# issue: https://github.com/langgenius/dify/issues/13684
# reference: https://stackoverflow.com/questions/39280438/fetch-missing-boundary-in-multipart-form-data-post
if files:
headers.pop("Content-Type", None)
if method in { if method in {
"get", "get",
"head", "head",

View File

@ -264,7 +264,7 @@ class ToolParameter(PluginParameter):
:param name: the name of the parameter :param name: the name of the parameter
:param llm_description: the description presented to the LLM :param llm_description: the description presented to the LLM
:param type: the type of the parameter :param typ: the type of the parameter
:param required: if the parameter is required :param required: if the parameter is required
:param options: the options of the parameter :param options: the options of the parameter
""" """

View File

@ -313,7 +313,6 @@ class ToolEngine:
""" """
Create message file Create message file
:param messages: messages
:return: message file ids :return: message file ids
""" """
result = [] result = []

View File

@ -161,8 +161,11 @@ class ToolManager:
get the tool runtime get the tool runtime
:param provider_type: the type of the provider :param provider_type: the type of the provider
:param provider_name: the name of the provider :param provider_id: the id of the provider
:param tool_name: the name of the tool :param tool_name: the name of the tool
:param tenant_id: the tenant id
:param invoke_from: invoke from
:param tool_invoke_from: the tool invoke from
:return: the tool :return: the tool
""" """
@ -427,8 +430,6 @@ class ToolManager:
get the absolute path of the icon of the hardcoded provider get the absolute path of the icon of the hardcoded provider
:param provider: the name of the provider :param provider: the name of the provider
:param tenant_id: the id of the tenant
:return: the absolute path of the icon, the mime type of the icon :return: the absolute path of the icon, the mime type of the icon
""" """
# get provider # get provider
@ -672,7 +673,8 @@ class ToolManager:
""" """
get the api provider get the api provider
:param provider_name: the name of the provider :param tenant_id: the id of the tenant
:param provider_id: the id of the provider
:return: the provider controller, the credentials :return: the provider controller, the credentials
""" """

View File

@ -84,12 +84,8 @@ class ModelInvocationUtils:
:param user_id: user id :param user_id: user id
:param tenant_id: tenant id, the tenant id of the creator of the tool :param tenant_id: tenant id, the tenant id of the creator of the tool
:param tool_provider: tool provider :param tool_type: tool type
:param tool_id: tool id
:param tool_name: tool name :param tool_name: tool name
:param provider: model provider
:param model: model name
:param model_parameters: model parameters
:param prompt_messages: prompt messages :param prompt_messages: prompt messages
:return: AssistantPromptMessage :return: AssistantPromptMessage
""" """

View File

@ -186,6 +186,9 @@ class ApiBasedToolSchemaParser:
return ToolParameter.ToolParameterType.BOOLEAN return ToolParameter.ToolParameterType.BOOLEAN
elif typ == "string": elif typ == "string":
return ToolParameter.ToolParameterType.STRING return ToolParameter.ToolParameterType.STRING
elif typ == "array":
items = parameter.get("items") or parameter.get("schema", {}).get("items")
return ToolParameter.ToolParameterType.FILES if items and items.get("format") == "binary" else None
else: else:
return None return None
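
The new branch above maps an OpenAPI array whose items use format: binary to the multi-file parameter type. A standalone sketch of the same mapping; plain strings stand in for ToolParameter.ToolParameterType members so the example runs on its own:

from typing import Optional


def map_openapi_type(parameter: dict) -> Optional[str]:
    typ = parameter.get("type") or parameter.get("schema", {}).get("type")
    if typ == "array":
        items = parameter.get("items") or parameter.get("schema", {}).get("items")
        return "files" if items and items.get("format") == "binary" else None
    if typ == "string":
        return "string"
    return None


assert map_openapi_type({"type": "array", "items": {"format": "binary"}}) == "files"
assert map_openapi_type({"schema": {"type": "string"}}) == "string"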
@ -197,6 +200,8 @@ class ApiBasedToolSchemaParser:
parse openapi yaml to tool bundle parse openapi yaml to tool bundle
:param yaml: the yaml string :param yaml: the yaml string
:param extra_info: the extra info
:param warning: the warning message
:return: the tool bundle :return: the tool bundle
""" """
warning = warning if warning is not None else {} warning = warning if warning is not None else {}
@ -278,6 +283,8 @@ class ApiBasedToolSchemaParser:
parse openapi plugin yaml to tool bundle parse openapi plugin yaml to tool bundle
:param json: the json string :param json: the json string
:param extra_info: the extra info
:param warning: the warning message
:return: the tool bundle :return: the tool bundle
""" """
warning = warning if warning is not None else {} warning = warning if warning is not None else {}
@ -312,6 +319,8 @@ class ApiBasedToolSchemaParser:
auto parse to tool bundle auto parse to tool bundle
:param content: the content :param content: the content
:param extra_info: the extra info
:param warning: the warning message
:return: tools bundle, schema_type :return: tools bundle, schema_type
""" """
warning = warning if warning is not None else {} warning = warning if warning is not None else {}

View File

@ -182,7 +182,6 @@ class WorkflowToolProviderController(ToolProviderController):
""" """
fetch tools from database fetch tools from database
:param user_id: the user id
:param tenant_id: the tenant id :param tenant_id: the tenant id
:return: the tools :return: the tools
""" """

View File

@ -127,9 +127,8 @@ class WorkflowTool(Tool):
def fork_tool_runtime(self, runtime: ToolRuntime) -> "WorkflowTool": def fork_tool_runtime(self, runtime: ToolRuntime) -> "WorkflowTool":
""" """
fork a new tool with meta data fork a new tool with metadata
:param meta: the meta data of a tool call processing, tenant_id is required
:return: the new tool :return: the new tool
""" """
return self.__class__( return self.__class__(
@ -212,7 +211,6 @@ class WorkflowTool(Tool):
""" """
extract files from the result extract files from the result
:param result: the result
:return: the result, files :return: the result, files
""" """
files: list[File] = [] files: list[File] = []

View File

@ -9,6 +9,7 @@ from typing import Any, cast
import docx import docx
import pandas as pd import pandas as pd
import pypandoc # type: ignore
import pypdfium2 # type: ignore import pypdfium2 # type: ignore
import yaml # type: ignore import yaml # type: ignore
from docx.document import Document from docx.document import Document
@ -369,7 +370,7 @@ def _extract_text_from_ppt(file_content: bytes) -> str:
from unstructured.partition.ppt import partition_ppt from unstructured.partition.ppt import partition_ppt
try: try:
if dify_config.UNSTRUCTURED_API_URL and dify_config.UNSTRUCTURED_API_KEY: if dify_config.UNSTRUCTURED_API_URL:
with tempfile.NamedTemporaryFile(suffix=".ppt", delete=False) as temp_file: with tempfile.NamedTemporaryFile(suffix=".ppt", delete=False) as temp_file:
temp_file.write(file_content) temp_file.write(file_content)
temp_file.flush() temp_file.flush()
@ -378,7 +379,7 @@ def _extract_text_from_ppt(file_content: bytes) -> str:
file=file, file=file,
metadata_filename=temp_file.name, metadata_filename=temp_file.name,
api_url=dify_config.UNSTRUCTURED_API_URL, api_url=dify_config.UNSTRUCTURED_API_URL,
api_key=dify_config.UNSTRUCTURED_API_KEY, api_key=dify_config.UNSTRUCTURED_API_KEY, # type: ignore
) )
os.unlink(temp_file.name) os.unlink(temp_file.name)
else: else:
@ -395,7 +396,7 @@ def _extract_text_from_pptx(file_content: bytes) -> str:
from unstructured.partition.pptx import partition_pptx from unstructured.partition.pptx import partition_pptx
try: try:
if dify_config.UNSTRUCTURED_API_URL and dify_config.UNSTRUCTURED_API_KEY: if dify_config.UNSTRUCTURED_API_URL:
with tempfile.NamedTemporaryFile(suffix=".pptx", delete=False) as temp_file: with tempfile.NamedTemporaryFile(suffix=".pptx", delete=False) as temp_file:
temp_file.write(file_content) temp_file.write(file_content)
temp_file.flush() temp_file.flush()
@ -404,7 +405,7 @@ def _extract_text_from_pptx(file_content: bytes) -> str:
file=file, file=file,
metadata_filename=temp_file.name, metadata_filename=temp_file.name,
api_url=dify_config.UNSTRUCTURED_API_URL, api_url=dify_config.UNSTRUCTURED_API_URL,
api_key=dify_config.UNSTRUCTURED_API_KEY, api_key=dify_config.UNSTRUCTURED_API_KEY, # type: ignore
) )
os.unlink(temp_file.name) os.unlink(temp_file.name)
else: else:
@ -416,9 +417,24 @@ def _extract_text_from_pptx(file_content: bytes) -> str:
def _extract_text_from_epub(file_content: bytes) -> str: def _extract_text_from_epub(file_content: bytes) -> str:
from unstructured.partition.api import partition_via_api
from unstructured.partition.epub import partition_epub from unstructured.partition.epub import partition_epub
try: try:
if dify_config.UNSTRUCTURED_API_URL:
with tempfile.NamedTemporaryFile(suffix=".epub", delete=False) as temp_file:
temp_file.write(file_content)
temp_file.flush()
with open(temp_file.name, "rb") as file:
elements = partition_via_api(
file=file,
metadata_filename=temp_file.name,
api_url=dify_config.UNSTRUCTURED_API_URL,
api_key=dify_config.UNSTRUCTURED_API_KEY, # type: ignore
)
os.unlink(temp_file.name)
else:
pypandoc.download_pandoc()
with io.BytesIO(file_content) as file: with io.BytesIO(file_content) as file:
elements = partition_epub(file=file) elements = partition_epub(file=file)
return "\n".join([str(element) for element in elements]) return "\n".join([str(element) for element in elements])

View File

@ -1,5 +1,6 @@
import json import json
import logging import logging
import re
import time import time
from collections import defaultdict from collections import defaultdict
from collections.abc import Mapping, Sequence from collections.abc import Mapping, Sequence
@ -331,8 +332,9 @@ class KnowledgeRetrievalNode(LLMNode):
automatic_metadata_filters = self._automatic_metadata_filter_func(dataset_ids, query, node_data) automatic_metadata_filters = self._automatic_metadata_filter_func(dataset_ids, query, node_data)
if automatic_metadata_filters: if automatic_metadata_filters:
conditions = [] conditions = []
for filter in automatic_metadata_filters: for sequence, filter in enumerate(automatic_metadata_filters):
self._process_metadata_filter_func( self._process_metadata_filter_func(
sequence,
filter.get("condition", ""), filter.get("condition", ""),
filter.get("metadata_name", ""), filter.get("metadata_name", ""),
filter.get("value"), filter.get("value"),
@ -353,17 +355,26 @@ class KnowledgeRetrievalNode(LLMNode):
if node_data.metadata_filtering_conditions: if node_data.metadata_filtering_conditions:
metadata_condition = MetadataCondition(**node_data.metadata_filtering_conditions.model_dump()) metadata_condition = MetadataCondition(**node_data.metadata_filtering_conditions.model_dump())
if node_data.metadata_filtering_conditions: if node_data.metadata_filtering_conditions:
for condition in node_data.metadata_filtering_conditions.conditions: # type: ignore for sequence, condition in enumerate(node_data.metadata_filtering_conditions.conditions): # type: ignore
metadata_name = condition.name metadata_name = condition.name
expected_value = condition.value expected_value = condition.value
if expected_value is not None or condition.comparison_operator in ("empty", "not empty"): if expected_value is not None or condition.comparison_operator in ("empty", "not empty"):
if isinstance(expected_value, str): if isinstance(expected_value, str):
expected_value = self.graph_runtime_state.variable_pool.convert_template( expected_value = self.graph_runtime_state.variable_pool.convert_template(
expected_value expected_value
).text ).value[0]
if expected_value.value_type == "number": # type: ignore
expected_value = expected_value.value # type: ignore
elif expected_value.value_type == "string": # type: ignore
expected_value = re.sub(r"[\r\n\t]+", " ", expected_value.text).strip() # type: ignore
else:
raise ValueError("Invalid expected metadata value type")
filters = self._process_metadata_filter_func( filters = self._process_metadata_filter_func(
condition.comparison_operator, metadata_name, expected_value, filters sequence,
condition.comparison_operator,
metadata_name,
expected_value,
filters,
) )
else: else:
raise ValueError("Invalid metadata filtering mode") raise ValueError("Invalid metadata filtering mode")
@ -442,25 +453,35 @@ class KnowledgeRetrievalNode(LLMNode):
return [] return []
return automatic_metadata_filters return automatic_metadata_filters
def _process_metadata_filter_func(self, condition: str, metadata_name: str, value: Optional[str], filters: list): def _process_metadata_filter_func(
self, sequence: int, condition: str, metadata_name: str, value: Optional[Any], filters: list
):
key = f"{metadata_name}_{sequence}"
key_value = f"{metadata_name}_{sequence}_value"
match condition: match condition:
case "contains": case "contains":
filters.append( filters.append(
(text("documents.doc_metadata ->> :key LIKE :value")).params(key=metadata_name, value=f"%{value}%") (text(f"documents.doc_metadata ->> :{key} LIKE :{key_value}")).params(
**{key: metadata_name, key_value: f"%{value}%"}
)
) )
case "not contains": case "not contains":
filters.append( filters.append(
(text("documents.doc_metadata ->> :key NOT LIKE :value")).params( (text(f"documents.doc_metadata ->> :{key} NOT LIKE :{key_value}")).params(
key=metadata_name, value=f"%{value}%" **{key: metadata_name, key_value: f"%{value}%"}
) )
) )
case "start with": case "start with":
filters.append( filters.append(
(text("documents.doc_metadata ->> :key LIKE :value")).params(key=metadata_name, value=f"{value}%") (text(f"documents.doc_metadata ->> :{key} LIKE :{key_value}")).params(
**{key: metadata_name, key_value: f"{value}%"}
)
) )
case "end with": case "end with":
filters.append( filters.append(
(text("documents.doc_metadata ->> :key LIKE :value")).params(key=metadata_name, value=f"%{value}") (text(f"documents.doc_metadata ->> :{key} LIKE :{key_value}")).params(
**{key: metadata_name, key_value: f"%{value}"}
)
) )
case "=" | "is": case "=" | "is":
if isinstance(value, str): if isinstance(value, str):

View File

@ -375,11 +375,25 @@ def _process_sub_conditions(
for condition in sub_conditions: for condition in sub_conditions:
key = FileAttribute(condition.key) key = FileAttribute(condition.key)
values = [file_manager.get_attr(file=file, attr=key) for file in files] values = [file_manager.get_attr(file=file, attr=key) for file in files]
expected_value = condition.value
if key == FileAttribute.EXTENSION:
if not isinstance(expected_value, str):
raise TypeError("Expected value must be a string when key is FileAttribute.EXTENSION")
if expected_value and not expected_value.startswith("."):
expected_value = "." + expected_value
normalized_values = []
for value in values:
if value and isinstance(value, str):
if not value.startswith("."):
value = "." + value
normalized_values.append(value)
values = normalized_values
sub_group_results = [ sub_group_results = [
_evaluate_condition( _evaluate_condition(
value=value, value=value,
operator=condition.comparison_operator, operator=condition.comparison_operator,
expected=condition.value, expected=expected_value,
) )
for value in values for value in values
] ]
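
The normalization above coerces both the expected extension and each file's extension to a leading-dot form so ".pdf" and "pdf" compare equal. The core of it as a standalone helper:

def normalize_extension(ext: str) -> str:
    if ext and not ext.startswith("."):
        return "." + ext
    return ext


assert normalize_extension("pdf") == ".pdf"
assert normalize_extension(".pdf") == ".pdf"
assert normalize_extension("") == ""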

View File

@ -95,7 +95,6 @@ class VariableTemplateParser:
Args: Args:
inputs: A dictionary containing the values for the template variables. inputs: A dictionary containing the values for the template variables.
remove_template_variables: A boolean indicating whether to remove the template variables from the output.
Returns: Returns:
The formatted string with template variables replaced by their values. The formatted string with template variables replaced by their values.

View File

@ -204,6 +204,8 @@ class WorkflowEntry:
NOTE: only parameter_extractor/question_classifier are supported NOTE: only parameter_extractor/question_classifier are supported
:param node_data: node data :param node_data: node data
:param node_id: node id
:param tenant_id: tenant id
:param user_id: user id :param user_id: user id
:param user_inputs: user inputs :param user_inputs: user inputs
:return: :return:

View File

@ -196,7 +196,7 @@ def _build_from_remote_url(
raise ValueError("Invalid file url") raise ValueError("Invalid file url")
mime_type, filename, file_size = _get_remote_file_info(url) mime_type, filename, file_size = _get_remote_file_info(url)
extension = mimetypes.guess_extension(mime_type) or "." + filename.split(".")[-1] if "." in filename else ".bin" extension = mimetypes.guess_extension(mime_type) or ("." + filename.split(".")[-1] if "." in filename else ".bin")
file_type = FileType(mapping.get("type", "custom")) file_type = FileType(mapping.get("type", "custom"))
file_type = _standardize_file_type(file_type, extension=extension, mime_type=mime_type) file_type = _standardize_file_type(file_type, extension=extension, mime_type=mime_type)
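
The added parentheses matter because a conditional expression binds more loosely than or: without them, a filename with no dot forced ".bin" even when the MIME type resolved cleanly. A small demonstration:

import mimetypes

mime_type, filename = "image/png", "avatar"  # dot-less filename

buggy = mimetypes.guess_extension(mime_type) or "." + filename.split(".")[-1] if "." in filename else ".bin"
fixed = mimetypes.guess_extension(mime_type) or ("." + filename.split(".")[-1] if "." in filename else ".bin")

assert buggy == ".bin"   # the whole `or` fell inside the conditional
assert fixed == ".png"   # fallback now applies only when guess_extension() fails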

View File

@ -720,6 +720,23 @@ class DocumentSegment(db.Model): # type: ignore[name-defined]
else: else:
return [] return []
def get_child_chunks(self):
process_rule = self.document.dataset_process_rule
if process_rule.mode == "hierarchical":
rules = Rule(**process_rule.rules_dict)
if rules.parent_mode:
child_chunks = (
db.session.query(ChildChunk)
.filter(ChildChunk.segment_id == self.id)
.order_by(ChildChunk.position.asc())
.all()
)
return child_chunks or []
else:
return []
else:
return []
@property @property
def sign_content(self): def sign_content(self):
return self.get_sign_content() return self.get_sign_content()

View File

@ -791,7 +791,7 @@ class Conversation(db.Model): # type: ignore[name-defined]
WorkflowRunStatus.SUCCEEDED: 0, WorkflowRunStatus.SUCCEEDED: 0,
WorkflowRunStatus.FAILED: 0, WorkflowRunStatus.FAILED: 0,
WorkflowRunStatus.STOPPED: 0, WorkflowRunStatus.STOPPED: 0,
WorkflowRunStatus.PARTIAL_SUCCESSED: 0, WorkflowRunStatus.PARTIAL_SUCCEEDED: 0,
} }
for message in messages: for message in messages:
@ -802,7 +802,7 @@ class Conversation(db.Model): # type: ignore[name-defined]
{ {
"success": status_counts[WorkflowRunStatus.SUCCEEDED], "success": status_counts[WorkflowRunStatus.SUCCEEDED],
"failed": status_counts[WorkflowRunStatus.FAILED], "failed": status_counts[WorkflowRunStatus.FAILED],
"partial_success": status_counts[WorkflowRunStatus.PARTIAL_SUCCESSED], "partial_success": status_counts[WorkflowRunStatus.PARTIAL_SUCCEEDED],
} }
if messages if messages
else None else None

View File

@ -109,7 +109,7 @@ class Workflow(Base):
tenant_id: Mapped[str] = mapped_column(StringUUID, nullable=False) tenant_id: Mapped[str] = mapped_column(StringUUID, nullable=False)
app_id: Mapped[str] = mapped_column(StringUUID, nullable=False) app_id: Mapped[str] = mapped_column(StringUUID, nullable=False)
type: Mapped[str] = mapped_column(db.String(255), nullable=False) type: Mapped[str] = mapped_column(db.String(255), nullable=False)
version: Mapped[str] version: Mapped[str] = mapped_column(db.String(255), nullable=False)
marked_name: Mapped[str] = mapped_column(default="", server_default="") marked_name: Mapped[str] = mapped_column(default="", server_default="")
marked_comment: Mapped[str] = mapped_column(default="", server_default="") marked_comment: Mapped[str] = mapped_column(default="", server_default="")
graph: Mapped[str] = mapped_column(sa.Text) graph: Mapped[str] = mapped_column(sa.Text)
@ -352,7 +352,7 @@ class WorkflowRunStatus(StrEnum):
SUCCEEDED = "succeeded" SUCCEEDED = "succeeded"
FAILED = "failed" FAILED = "failed"
STOPPED = "stopped" STOPPED = "stopped"
PARTIAL_SUCCESSED = "partial-succeeded" PARTIAL_SUCCEEDED = "partial-succeeded"
class WorkflowRun(Base): class WorkflowRun(Base):
@ -755,7 +755,8 @@ class WorkflowAppLog(Base):
__tablename__ = "workflow_app_logs" __tablename__ = "workflow_app_logs"
__table_args__ = ( __table_args__ = (
db.PrimaryKeyConstraint("id", name="workflow_app_log_pkey"), db.PrimaryKeyConstraint("id", name="workflow_app_log_pkey"),
db.Index("workflow_app_log_app_idx", "tenant_id", "app_id"), db.Index("workflow_app_log_app_idx", "tenant_id", "app_id", "created_at"),
db.Index("workflow_app_log_workflow_run_idx", "workflow_run_id"),
) )
id: Mapped[str] = mapped_column(StringUUID, server_default=db.text("uuid_generate_v4()")) id: Mapped[str] = mapped_column(StringUUID, server_default=db.text("uuid_generate_v4()"))

Some files were not shown because too many files have changed in this diff.