226 Commits

Author SHA1 Message Date
Bowen Liang
924b4fe742
test: run vdb tests on TiDB Vector with docker in CI tests (#11645) 2024-12-15 17:16:40 +08:00
yihong
22258fb0bf
fix: filter bug for keyword cause code can not reach (#11666)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-15 17:12:06 +08:00
yihong
36cb25b341
fix: support mdx files close #11557 (#11565)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-12 13:37:56 +08:00
Jiang
0d04cdc323
Lindorm vdb (#11574)
Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>
2024-12-12 09:43:27 +08:00
Jyong
9b7adcd4d9
update tidb batch get endpoint to basic mode (#11426) 2024-12-06 17:06:46 +08:00
Jyong
d7c1f43b49
fix tidb full-text-search vector missed (#11337) 2024-12-04 16:13:23 +08:00
Jyong
c58d2fce89
roll back rerank topn setting (#11297) 2024-12-03 17:34:56 +08:00
yihong
e686f12317
fix: better handle error (#11265)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-12-03 09:15:38 +08:00
-LAN-
9601102885
fix(word_extractor): Fix type error and remove stream in ssrf_proxy (#11241)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-12-02 10:24:03 +08:00
Cling_o3
f9c2aa7689
feat: add retireval_top_n to config in env (#11132) 2024-11-30 11:14:45 +08:00
kazuya-awano
2d6865d421
Ensure consistent float type for cached embedding return values (#10185) 2024-11-29 09:18:41 +08:00
yihong
d7160ee563
fix: typo in upstashVector if id is always true, also fix some type hint (#11183)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-28 14:05:25 +08:00
-LAN-
9789905a1f
chore(*): Removes debugging print statements (#11145)
Signed-off-by: -LAN- <laipz8200@outlook.com>
2024-11-26 22:03:19 +08:00
Bowen Liang
6c8e208ef3
chore: bump minimum supported Python version to 3.11 (#10386) 2024-11-24 13:28:46 +08:00
yihong
ed55de888a
fix: rules should not be None for in (#10977)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-22 23:04:20 +08:00
AkisAya
cb0c55daa7
fix weight rerank of knowledge retrieval (#10931) 2024-11-21 17:53:20 +08:00
yihong
58a9d9eb9a
fix: better WeightRerankRunner run logic use O(1) and delete unused code (#10849)
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2024-11-19 20:12:13 +08:00
Zane
14f3d44c37
refactor: improve handling of leading punctuation removal (#10761) 2024-11-18 21:32:33 +08:00
8bitpd
873e9720e9
feat: AnalyticDB vector store supports invocation via SQL. (#10802)
Co-authored-by: 璟义 <yangshangpo.ysp@alibaba-inc.com>
2024-11-18 19:29:54 +08:00
Bowen Liang
51db59622c
chore(lint): cleanup repeated cause exception in logging.exception replaced by helpful message (#10425) 2024-11-15 15:41:40 +08:00
Jyong
0b2d51d859
add the index field for elasticsearch (#10592) 2024-11-12 21:43:16 +08:00
-LAN-
a1543b7da0
fix(extractor): temporary file (#10543) 2024-11-11 17:31:27 +08:00
Leo.Wang
c9f785e00f
Feat/tools/gitlab (#10407) 2024-11-08 09:53:03 +08:00
Bowen Liang
574c4a264f
chore(lint): Use logging.exception instead of logging.error (#10415) 2024-11-07 21:13:02 +08:00
Jyong
1024fc623e
fix the ssrf of docx file extractor external images (#10237) 2024-11-04 15:22:07 +08:00
Jiang
0c9e79cd67
Add Lindorm as a VDB choice (#10202)
Co-authored-by: jiangzhijie <jiangzhijie.jzj@alibaba-inc.com>
2024-11-04 09:10:26 +08:00
Shili Cao
b61baa87ec
fix: avoid unexpected error when create knowledge base with baidu vector database and wenxin embedding model (#10130) 2024-10-31 21:34:23 +08:00
Jyong
dad041c49f
fix issue: query is none when doing retrieval (#10129) 2024-10-31 21:25:00 +08:00
omr
11ca1bec0b
fix: optimize unique document filtering with set (#10082) 2024-10-31 16:32:58 +08:00
zhuhao
7433095240
chore: use dify_config.TIDB_SPEND_LIMIT instead of constant value (#10038) 2024-10-30 15:43:07 +08:00
Jyong
9ebd453b87
add rerank check when doing multi-retrieval (#9998) 2024-10-30 11:17:39 +08:00
powerfool
878d13ef42
Added OceanBase as an option for the vector store in Dify (#10010) 2024-10-29 21:10:18 +08:00
Jyong
5580bcf870
add tidb spend limit config (#9999) 2024-10-29 17:51:13 +08:00
roadgoat19
c8ef9223e5
feat: couchbase integration (#6165)
Co-authored-by: crazywoola <427733928@qq.com>
Co-authored-by: Elliot Scribner <elliot.scribner@couchbase.com>
Co-authored-by: crazywoola <100913391+crazywoola@users.noreply.github.com>
Co-authored-by: Bowen Liang <bowenliang@apache.org>
2024-10-29 15:00:23 +08:00
Jyong
f47177ecb4
add top_k for es full text search (#9963) 2024-10-28 23:04:54 +08:00
virgosoy
17cacf258e
fix: wrong element object (#9868) 2024-10-25 22:32:41 +08:00
Jyong
18106a4fc6
add tidb on qdrant type (#9831)
Co-authored-by: Zhaofeng Miao <522856232@qq.com>
2024-10-25 13:57:03 +08:00
Zixuan Cheng
88dec6ef2b
Added description for .ppt, specify the reason for unstructured.io (#9452)
Co-authored-by: crazywoola <427733928@qq.com>
2024-10-24 22:13:06 +08:00
Jyong
5f11fe521d
remove unstructured pdf extract (#9794) 2024-10-24 18:13:05 +08:00
Jyong
3e9d271b52
nltk security issue and upgrade unstructured (#9558) 2024-10-23 16:23:55 +08:00
ice yao
ceb2c4f3ef
chore: reuse existing test functions with upstash vdb (#9679) 2024-10-23 10:42:11 +08:00
Zven
8e7a752b2a
feat: add upstash as a new vector database provider (#9644) 2024-10-23 09:16:35 +08:00
-LAN-
5f12c17355
fix(core): use CreatedByRole enum for role consistency (#9607) 2024-10-22 13:03:50 +08:00
Bowen Liang
4d9160ca9f
refactor: use dify_config to replace legacy usage of flask app's config (#9089) 2024-10-22 11:01:32 +08:00
-LAN-
e61752bd3a
feat/enhance the multi-modal support (#8818) 2024-10-21 10:43:49 +08:00
ice yao
2155bba5b0
fix: update mismatch vector type (#9462) 2024-10-18 08:21:41 +08:00
zhuhao
b90ad587c2
refactor: move the embedding to the rag module and abstract the rerank runner for extension (#9423) 2024-10-17 19:12:42 +08:00
zhuhao
86594851cb
refactor: update the default values of top-k parameter in vdb to be consistent (#9367) 2024-10-16 16:00:21 +08:00
Jyong
50635e9c15
Fix/economical knowledge retrieval (#9396) 2024-10-16 15:13:45 +08:00
zhuhao
cd7ab6231f
refactor: Add an enumeration type and use the factory pattern to obtain the corresponding class (#9356) 2024-10-15 12:51:13 +08:00