3460 Commits

Author SHA1 Message Date
Nicolas
547c09c54c
Merge pull request #1087 from mendableai/docs/update-cancel-crawl-response
docs: update cancel crawl response
2025-01-24 13:34:13 -03:00
Ademílson Tonato
34e3911a97
docs: update cancel crawl response
- add cancel crawl event to requests.http
2025-01-24 16:16:17 +00:00
rafaelmmiller
3184e91f66 layers 2025-01-24 10:25:45 -03:00
rafaelmmiller
64d116540f rerank with lower threshold + back to map if lenght = 0 2025-01-24 09:08:16 -03:00
Móricz Gergő
05d79a875a fix(extract): oops 2025-01-24 11:55:41 +01:00
Móricz Gergő
4db9a4a675 fix(extraction-service): allow no multiEntityKeys if isMultiEntity is false 2025-01-24 11:33:49 +01:00
Móricz Gergő
0dddf4c055 fix(v1/extract): add job with explicit id 2025-01-24 11:03:04 +01:00
Rafael Miller
3f9b8a0bf5
Merge pull request #1084 from mendableai/added-today-to-extract-prompts
Added "today" to extract prompts
2025-01-23 17:16:15 -03:00
rafaelmmiller
f1cd891a70 added today to extract prompts 2025-01-23 17:14:45 -03:00
Gergő Móricz
a1efe33c8a fix(scrapeQueue): change expiry to 1 hour 2025-01-23 20:30:20 +01:00
Gergő Móricz
a7b56ab87c feat(crawl-status): same for v0 2025-01-23 19:39:33 +01:00
Gergő Móricz
95ce3c3b71 feat(crawl-status): allow for jobs to expire out of the redis 2025-01-23 19:33:43 +01:00
Gergő Móricz
6f696d32ae feat(extract): add log on 0 links 2025-01-23 19:25:12 +01:00
Gergő Móricz
5d56627bfa feat(extraction-service): highlight req schema generation 2025-01-23 19:24:24 +01:00
Móricz Gergő
9da51a7514 feat(extract): add original schema to logs 2025-01-23 14:59:54 +01:00
Móricz Gergő
561f0186ef fix build error 2025-01-23 12:07:37 +01:00
Móricz Gergő
6557365149 feat(sitemap): change sitemap logging 2025-01-23 12:06:50 +01:00
Móricz Gergő
d3518e85a8 feat(extract): add logging 2025-01-23 12:05:15 +01:00
Móricz Gergő
434a435a4b fix(sitemap): increase limit to 20 2025-01-23 11:29:49 +01:00
Móricz Gergő
1e28ba291e fix(sitemap): increase limit 2025-01-23 09:21:38 +01:00
Móricz Gergő
bee2b2873e fix(sitemap): better ordering 2025-01-23 08:58:18 +01:00
Móricz Gergő
3761eb17a7 feat(sitemap): reenable fallback to tlsclient 2025-01-23 08:43:13 +01:00
Móricz Gergő
72198123cb fix(crawler): move sitemap deduplication to deeper in the process 2025-01-23 08:10:46 +01:00
Móricz Gergő
aa2c369060 feat(sitemap): propagate crawlid 2025-01-23 07:19:00 +01:00
Móricz Gergő
a922aac805 fix(crawler): dumb sitemap limit 2025-01-23 07:10:07 +01:00
Móricz Gergő
51a0e233e3 fix(sitemap): temporarily disable tlsclient 2025-01-23 06:56:15 +01:00
Nicolas
d162247703 Update cache.ts 2025-01-23 02:37:04 -03:00
Nicolas
ccb74a2b43 Nick: increased timeouts on extract + reduced extract redis usage 2025-01-23 01:28:26 -03:00
Nicolas
498558d358 Nick: formatting done 2025-01-22 18:47:44 -03:00
Nicolas
994e1eb502 Nick: rm logs 2025-01-22 17:27:48 -03:00
Nicolas
56f048aeff Reapply "Nick:"
This reverts commit 4b4385c520c7223cf79ebba981dded8ffaefde11.
2025-01-22 17:26:32 -03:00
Nicolas
4b4385c520 Revert "Nick:"
This reverts commit 6718ce89085339eaaceb1e88a0aa45ecff3216ac.
2025-01-22 17:26:09 -03:00
Nicolas
e1ef826ac6 Merge branch 'main' of https://github.com/mendableai/firecrawl 2025-01-22 17:25:49 -03:00
Nicolas
6718ce8908 Nick: 2025-01-22 17:25:48 -03:00
Gergő Móricz
208bd4ca0c fix(extraction-service): marginally improve logging 2025-01-22 19:38:09 +01:00
Gergő Móricz
ed929221ab feat(sitemap): switch around engine order 2025-01-22 19:10:27 +01:00
Gergő Móricz
5a039e7b64 fix(v1/map): add wrapper around tryGetSitemap 2025-01-22 19:00:46 +01:00
Nicolas
5aad21b35a Update extract.ts 2025-01-22 11:01:10 -03:00
Eric Ciarla
669c694b32 R1 Crawler 2025-01-22 09:37:10 -03:00
Nicolas
04916f17e2 Nick: bug fixes + acuc fixes + cache fixes 2025-01-21 19:17:06 -03:00
Nicolas
3604f2a3ae Nick: misc improvements 2025-01-21 16:57:45 -03:00
Nicolas
ac0d10c451 Nick: sitemap fetch only below threshold for /map 2025-01-21 16:28:57 -03:00
Nicolas
c7b219169b Nick: fixed crawl maps index dedup 2025-01-21 16:22:27 -03:00
Nicolas
720a429115 Nick: temp fix 2025-01-21 13:23:34 -03:00
Nicolas
2b9f63cf10 Nick: more permissive re-ranker 2025-01-21 11:30:54 -03:00
Gergő Móricz
dcbe0b319c fix(v1/crawl-status-ws): wait to send catchup before closing 2025-01-20 20:01:27 +01:00
Nicolas
16af54cf2b Nick: bump sdks 2025-01-20 13:43:31 -03:00
Nicolas
ef69b1ac88 Nick: allowExternalLinks is now enableWebSearch 2025-01-20 13:41:30 -03:00
Nicolas
a5d379c935 Update package.json 2025-01-20 13:31:08 -03:00
Nicolas
1aa3c0ab5c Merge branch 'main' of https://github.com/mendableai/firecrawl 2025-01-20 13:29:00 -03:00