Gergő Móricz
fa581995e6
feat(acuc): propagate team flags (FIR-1879) (#1522)
* feat(acuc): propagate team flags
* feat(flags): further functionality
2025-05-08 20:23:35 +02:00
Gergő Móricz
017a915ae8
Revert "fix(queue-worker, scrape): match billing logic and add billing for stealth proxies (#1521)"
This reverts commit e06c7cc234b9ee0c4bf112cc338e515f28674e11.
2025-05-08 18:34:13 +02:00
Gergő Móricz
e06c7cc234
fix(queue-worker, scrape): match billing logic and add billing for stealth proxies (#1521)
2025-05-08 10:51:38 -03:00
Gergő Móricz
0f32500149
fix(queue-jobs): never cc timeout jobs that are crawl-associated (makes no sense)
2025-05-08 12:54:23 +02:00
Gergő Móricz
7ad9a00ea8
fix(concurrency-limit): rework cc queue to work by time not priority (#1526)
2025-05-08 12:40:13 +02:00
Ademílson Tonato
5d07cccd65
Merge pull request #1523 from mendableai/refactor/map-limit
refactor: maximum links limit for map endpoint from 5000 to 30000
2025-05-06 21:06:19 +01:00
Ademílson Tonato
ae12c326f0
refactor: maximum links limit for map endpoint from 5000 to 30000
2025-05-06 16:00:15 -04:00
Nicolas
17728379df
Revert "Nick: log su usage"
This reverts commit 6567ef81f6ca6dfd5a92eafb04f4eed8d37e0a6c.
2025-05-05 18:19:15 -03:00
Nicolas
6567ef81f6
Nick: log su usage
2025-05-05 18:00:34 -03:00
devin-ai-integration[bot]
0512ad6bce
Add delay parameter to crawl options in all SDKs (#1514)
* Add delay parameter to crawl options in all SDKs
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* Update terminology from 'between crawl requests' to 'between scrapes'
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* Apply suggestions from code review
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: mogery@sideguide.dev <mogery@sideguide.dev>
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
2025-05-02 18:00:15 +02:00
devin-ai-integration[bot]
411ecdf04b
Add crawl delay functionality with per-crawl concurrency limiting (FIR-249) (#1413)
* feat: Add crawl delay functionality with per-crawl concurrency limiting (FIR-249)
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* fix: Skip crawl delay in test environment to fix CI tests
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* refactor: Use crawlerOptions.delay instead of separate fields
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* refactor: Rename crawlDelay to delay in type definitions for uniformity
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* refactor: Fix crawl concurrency implementation based on PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* refactor: Simplify if/else structure in queue-jobs.ts based on PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* human fixes
* test: Add tests for crawl delay functionality
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* test: Move crawl delay tests to existing crawl.test.ts file
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* fix: Ensure sitemapped URLs are added to crawl concurrency queue and update crawl status endpoint
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* dbg
* fix: Ensure jobs with crawl delay are properly added to BullMQ
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* fix: Remove duplicate job addition to BullMQ for jobs with crawl delay
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* fixes
* warning for devin
* test: Simplify crawl delay test as requested in PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* bump delay test timeout
* fix operation order
* bump further???
* fix: broken on self-host
* Update apps/api/src/services/queue-jobs.ts
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: import
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: mogery@sideguide.dev <mogery@sideguide.dev>
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-05-02 17:20:57 +02:00
Eric Ciarla
510171cabe
Delete qwen 3
2025-05-01 16:54:12 -04:00
Eric Ciarla
a0ed76d53d
Merge pull request #1510 from mendableai/devin/1746062769-qwen3-web-crawler-example
Add Qwen3 web crawler example using OpenRouter
2025-05-01 16:34:43 -04:00
Nicolas
ac02ad3257
Revert "Nick: past_due changes"
This reverts commit 9449d7b020798b9368ed5e1bae3f5b2a32cb24c7.
2025-05-01 16:48:38 -03:00
Nicolas
9449d7b020
Nick: past_due changes
2025-05-01 16:42:20 -03:00
Devin AI
b646c3e9d9
Remove OpenAI API key references, use only OpenRouter API key
Co-Authored-By: eric@sideguide.dev <eric@sideguide.dev>
2025-05-01 01:34:06 +00:00
Devin AI
018c6a616e
Add Qwen3 web crawler example using OpenRouter
Co-Authored-By: eric@sideguide.dev <eric@sideguide.dev>
2025-05-01 01:26:09 +00:00
Nicolas
8b88c26a47
Update v1.ts (#1509)
2025-04-30 13:02:07 -03:00
Rafael Miller
eee613d1bc
[feat] Implement GCS storage option for scrape results across controllers an… (#1500)
* Implement GCS storage option for scrape results across controllers and update GCS document retrieval functionality
* done!
* Update gcs-jobs.ts
2025-04-29 15:15:44 -03:00
devin-ai-integration[bot]
f0b1507290
Fix: Handle both dict and model instances in actions parameter (#1508)
* Fix: Handle both dict and model instances in actions parameter
Co-Authored-By: Nicolas Camara <nicolascamara29@gmail.com>
* Update __init__.py
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Nicolas Camara <nicolascamara29@gmail.com>
2025-04-29 13:06:12 -03:00
Nicolas
6dbfd54e2c
Update __init__.py
2025-04-29 12:22:30 -03:00
Rafael Miller
317fa43f9e
Fix sdk/schemas (#1507)
* sdk-fix/schema-check
* version bump
* schema validation for extract and jsonOptions parameters
* Update firecrawl.py
---------
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
2025-04-29 12:19:08 -03:00
Nicolas
a0a1675829
Nick: (#1506)
2025-04-29 11:06:35 -03:00
Nicolas
8053a7cedd
Nick: updates on pypi
v1.8.0
2025-04-28 15:01:12 -03:00
Arvid Andersson
c164370298
Webhook param for batch scrape (#1505)
The API endpoint supports the webhook param, align the client to support this.
2025-04-28 13:27:10 -03:00
Rafael Miller
e3e730f2c1
Update version to 2.4.0 and enhance ExtractResponse model with additional fields for id, status, and expiresAt. (#1501)
2025-04-25 23:19:08 -04:00
Eric Ciarla
f7a9a14410
Merge pull request #1489 from aparupganguly/feature/o4-mini-web-crawler
Add examples/o4-mini web crawler
2025-04-24 13:46:20 -04:00
Eric Ciarla
59e1343ed4
Merge pull request #1487 from aparupganguly/feature/o3-crawler
Add examples/o3 Web Crawler
2025-04-24 13:46:10 -04:00
Nicolas
5c3951b42e
Update __init__.py
2025-04-24 12:31:02 -04:00
John Bledsoe
ca82015bca
Use async job status monitor for AsyncFirecrawlApp (#1498)
2025-04-24 12:28:29 -04:00
Gergő Móricz
8b82e11625
Decrease diff warn threshold further
2025-04-24 14:12:52 +02:00
Gergő Móricz
e4f9a92e98
adjust diff warn threshold
2025-04-23 23:54:46 +02:00
Gergő Móricz
0e7f2c8599
feat(diff): log if it takes a long time with params
2025-04-23 20:46:51 +02:00
Rafael Miller
37dabce1ed
[feat] added second scrapeURLWithFireEngine (#1494)
2025-04-23 20:36:36 +02:00
Gergő Móricz
9435c800c0
fix(api/tests/scrape): don't test scrape status endpoint on self-host env
2025-04-23 20:36:22 +02:00
Nicolas
2f6520bc38
Merge branch 'main' of https://github.com/mendableai/firecrawl
2025-04-23 14:19:26 -04:00
Nicolas
22f7efed35
Nick: rm scrape events
2025-04-23 14:19:25 -04:00
Nicolas
1c421f2d74
Nick: (#1492)
2025-04-22 21:42:37 -04:00
Nicolas
feda4dede7
Update README.md
2025-04-22 17:34:13 -04:00
Nicolas
e532a96b0c
(fix/search) Search logs fix (#1491)
* Update search.ts
* Update search.ts
2025-04-22 17:12:10 -04:00
Rafael Miller
e10d4c7b0c
[fix/sdk] kwargs params (#1490)
* fix sdk kwargs params
* version
* Update __init__.py
---------
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
2025-04-22 15:15:32 -04:00
Aparup Ganguly
6920b85ee1
Add examples/o4-mini web crawler
2025-04-22 22:07:39 +05:30
Aparup Ganguly
d05274ef0b
Add examples/o3 Web Crawler
2025-04-22 21:48:38 +05:30
Gergő Móricz
1a02ef56e6
fix(extract/fire-1): thinking tokens pricing
2025-04-22 01:00:00 +02:00
Gergő Móricz
f47d8779f5
fix(extract/oldExtract): do if for old/new extract
2025-04-22 00:53:17 +02:00
rafaelmmiller
1afc6258bd
Merge branch 'main' of https://github.com/mendableai/firecrawl
2025-04-19 12:53:43 -07:00
rafaelmmiller
a4323d8f23
fix:python-sdk
2025-04-19 12:53:37 -07:00
Gergő Móricz
df305c2a97
fix(): null-proof/nan-proof
2025-04-19 12:35:55 -07:00
Gergő Móricz
0e027fe430
fix(scrape): bill llm on fail
2025-04-19 02:56:18 -07:00
Gergő Móricz
b35462456e
billing for smart scrape
2025-04-19 02:06:30 -07:00