2152 Commits

Author SHA1 Message Date
rafaelmmiller
9443a823b2 feat: script that generates all sdk examples for openapi 2025-05-30 17:09:03 -03:00
devin-ai-integration[bot]
ab30c8e4ac
Fix Supabase client configuration errors when USE_DB_AUTHENTICATION is false (#1534)
* Fix Supabase client configuration errors when USE_DB_AUTHENTICATION is false

Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>

* Add USE_DB_AUTHENTICATION checks to map and search controllers

Add test for USE_DB_AUTHENTICATION=false

Add USE_DB_AUTHENTICATION checks to billing services

Add USE_DB_AUTHENTICATION checks to batch_billing.ts

Add USE_DB_AUTHENTICATION checks to cached-docs.ts

Add USE_DB_AUTHENTICATION checks to supabase-jobs.ts

Add USE_DB_AUTHENTICATION checks to team-id-sync.ts

Add USE_DB_AUTHENTICATION checks to test-suite log.ts

Add USE_DB_AUTHENTICATION checks to idempotency services

Co-Authored-By: hello@sideguide.dev <hello+firecrawl@sideguide.dev>

* Revert "Add USE_DB_AUTHENTICATION checks to map and search controllers"

This reverts commit 834a5d51a68c74ada67800fa3a0aa45bde22d745.

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: hello@sideguide.dev <hello+firecrawl@sideguide.dev>
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
2025-05-16 12:56:33 -03:00
devin-ai-integration[bot]
526165e1b9
Add caching for RunPod PDF markdown results in GCS (#1561)
* Add caching for RunPod PDF markdown results in GCS

Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>

* Update PDF caching to hash base64 directly and add metadata

Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>

* Fix PDF caching to directly hash content and fix test expectations

Co-Authored-By: thomas@sideguide.dev <thomas@sideguide.dev>

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: thomas@sideguide.dev <thomas@sideguide.dev>
2025-05-16 12:04:38 -03:00
Gergő Móricz
bd9673e104
Mog/cachable lookup (#1560)
* feat(scrapeURL): use cacheableLookup

* feat(queue-worker): add cacheablelookup

* fix(cacheable-lookup): make it work with tailscale on local

* add devenv

* try again

* allow querying all

* log

* fixes

* asd

* fix:

* fix(lookup):

* lookup
2025-05-16 15:44:52 +02:00
Gergő Móricz
d46ba95924 Revert "feat: use cacheable lookup everywhere (#1559)"
This reverts commit b8703b2a720765b92f5c4cab94cc90ea624198a8.
2025-05-16 15:31:06 +02:00
Gergő Móricz
b8703b2a72
feat: use cacheable lookup everywhere (#1559)
* feat(scrapeURL): use cacheableLookup

* feat(queue-worker): add cacheablelookup

* fix(cacheable-lookup): make it work with tailscale on local

* add devenv

* try again

* allow querying all

* log

* fixes

* asd

* fix:

* fix(lookup):
2025-05-16 15:27:24 +02:00
Gergő Móricz
f936befcdb feat(queue-worker): liveness check endpoint 2025-05-16 14:15:48 +02:00
Gergő Móricz
b5b612c35b
feat(api/extract/fire-0): error logging (#1556) 2025-05-15 11:32:59 -03:00
Will
b0c203e512
Fix/optional chaining operators missing (#1549)
* fix: missing optional chaining operator in req.acuc.flags

* fix: missing optional chaining operator in req.acuc.flags
2025-05-15 00:04:04 +02:00
Gergő Móricz
cee481a3a9 fix(fire-engine): sslerror passthrough 2025-05-14 23:50:57 +02:00
Gergő Móricz
3db2294b97
feat(scrapeURL): better error for SSL failures (#1552) 2025-05-14 23:34:59 +02:00
Ademílson Tonato
06189b9646
refactor: increase max limit for search request schema from 50 to 100 (#1545) 2025-05-13 17:40:32 -03:00
Yohann Prigent
505924875e
create openAI provider using base url parameter (#1480)
Co-authored-by: Yohann Prigent <yohann@pandascore.co>
2025-05-12 20:43:07 +02:00
Gergő Móricz
0fd05a67a0 Revert "Revert "fix(queue-worker, scrape): match billing logic and add billing for stealth proxies (#1521)""
This reverts commit 017a915ae8f550ceaa01ad607b4e6a684385eadf.
2025-05-12 17:46:09 +02:00
Gergő Móricz
fdeb01847d feat(queue-worker): add more logs around crawl finishing logic 2025-05-09 16:52:38 +02:00
Nicolas
7b03ab36a7 Update openapi.json 2025-05-08 20:15:49 -03:00
Gergő Móricz
fa581995e6
feat(acuc): propagate team flags (FIR-1879) (#1522)
* feat(acuc): propagate team flags

* feat(flags): further functionality
2025-05-08 20:23:35 +02:00
Gergő Móricz
017a915ae8 Revert "fix(queue-worker, scrape): match billing logic and add billing for stealth proxies (#1521)"
This reverts commit e06c7cc234b9ee0c4bf112cc338e515f28674e11.
2025-05-08 18:34:13 +02:00
Gergő Móricz
e06c7cc234
fix(queue-worker, scrape): match billing logic and add billing for stealth proxies (#1521) 2025-05-08 10:51:38 -03:00
Gergő Móricz
0f32500149 fix(queue-jobs): never cc timeout jobs that are crawl-associated (makes no sense) 2025-05-08 12:54:23 +02:00
Gergő Móricz
7ad9a00ea8
fix(concurrency-limit): rework cc queue to work by time not priority (#1526) 2025-05-08 12:40:13 +02:00
Ademílson Tonato
ae12c326f0
refactor: maximum links limit for map endpoint from 5000 to 30000 2025-05-06 16:00:15 -04:00
Nicolas
17728379df Revert "Nick: log su usage"
This reverts commit 6567ef81f6ca6dfd5a92eafb04f4eed8d37e0a6c.
2025-05-05 18:19:15 -03:00
Nicolas
6567ef81f6 Nick: log su usage 2025-05-05 18:00:34 -03:00
devin-ai-integration[bot]
411ecdf04b
Add crawl delay functionality with per-crawl concurrency limiting (FIR-249) (#1413)
* feat: Add crawl delay functionality with per-crawl concurrency limiting (FIR-249)

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* fix: Skip crawl delay in test environment to fix CI tests

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* refactor: Use crawlerOptions.delay instead of separate fields

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* refactor: Rename crawlDelay to delay in type definitions for uniformity

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* refactor: Fix crawl concurrency implementation based on PR feedback

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* refactor: Simplify if/else structure in queue-jobs.ts based on PR feedback

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* human fixes

* test: Add tests for crawl delay functionality

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* test: Move crawl delay tests to existing crawl.test.ts file

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* fix: Ensure sitemapped URLs are added to crawl concurrency queue and update crawl status endpoint

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* dbg

* fix: Ensure jobs with crawl delay are properly added to BullMQ

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* fix: Remove duplicate job addition to BullMQ for jobs with crawl delay

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* fixes

* warning for devin

* test: Simplify crawl delay test as requested in PR feedback

Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>

* bump delay test timeout

* fix operation order

* bump further???

* fix: broken on self-host

* Update apps/api/src/services/queue-jobs.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix: import

---------

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: mogery@sideguide.dev <mogery@sideguide.dev>
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-02 17:20:57 +02:00
Nicolas
ac02ad3257 Revert "Nick: past_due changes"
This reverts commit 9449d7b020798b9368ed5e1bae3f5b2a32cb24c7.
2025-05-01 16:48:38 -03:00
Nicolas
9449d7b020 Nick: past_due changes 2025-05-01 16:42:20 -03:00
Nicolas
8b88c26a47
Update v1.ts (#1509) 2025-04-30 13:02:07 -03:00
Rafael Miller
eee613d1bc
[feat] Implement GCS storage option for scrape results across controllers an… (#1500)
* Implement GCS storage option for scrape results across controllers and update GCS document retrieval functionality

* done!

* Update gcs-jobs.ts
2025-04-29 15:15:44 -03:00
Gergő Móricz
8b82e11625 Decrease diff warn threshold further 2025-04-24 14:12:52 +02:00
Gergő Móricz
e4f9a92e98 adjust diff warn threshold 2025-04-23 23:54:46 +02:00
Gergő Móricz
0e7f2c8599 feat(diff): log if it takes a long time with params 2025-04-23 20:46:51 +02:00
Rafael Miller
37dabce1ed
[feat] added second scrapeURLWithFireEngine (#1494) 2025-04-23 20:36:36 +02:00
Gergő Móricz
9435c800c0 fix(api/tests/scrape): don't test scrape status endpoint on self-host env 2025-04-23 20:36:22 +02:00
Nicolas
2f6520bc38 Merge branch 'main' of https://github.com/mendableai/firecrawl 2025-04-23 14:19:26 -04:00
Nicolas
22f7efed35 Nick: rm scrape events 2025-04-23 14:19:25 -04:00
Nicolas
1c421f2d74
Nick: (#1492) 2025-04-22 21:42:37 -04:00
Nicolas
e532a96b0c
(fix/search) Search logs fix (#1491)
* Update search.ts

* Update search.ts
2025-04-22 17:12:10 -04:00
Gergő Móricz
1a02ef56e6 fix(extract/fire-1): thinking tokens pricing 2025-04-22 01:00:00 +02:00
Gergő Móricz
f47d8779f5 fix(extract/oldExtract): do if for old/new extract 2025-04-22 00:53:17 +02:00
Gergő Móricz
df305c2a97 fix(): null-proof/nan-proof 2025-04-19 12:35:55 -07:00
Gergő Móricz
0e027fe430 fix(scrape): bill llm on fail 2025-04-19 02:56:18 -07:00
Gergő Móricz
b35462456e billing for smart scrape 2025-04-19 02:06:30 -07:00
Gergő Móricz
0a86d2738f reenable 2025-04-19 01:57:58 -07:00
Gergő Móricz
324b4e2e1e fix rounding 2025-04-19 01:45:36 -07:00
Gergő Móricz
6d2347b5f8 feat(llmExtract): add token tracking to all calls 2025-04-19 01:38:55 -07:00
Gergő Móricz
438ea19f16 feat(extract): add thinking tokens 2025-04-19 01:35:17 -07:00
Gergő Móricz
653a0207c3 fix(calculateCost): accuracy 2.5 2025-04-19 01:10:32 -07:00
Gergő Móricz
2b2e648d41 feat(smart-scrape): log failed costs 2025-04-19 01:07:00 -07:00
Gergő Móricz
3909b6641d temp: 2025-04-19 00:19:25 -07:00