Nicolas
|
7b03ab36a7
|
Update openapi.json
|
2025-05-08 20:15:49 -03:00 |
|
Gergő Móricz
|
fa581995e6
|
feat(acuc): propagate team flags (FIR-1879) (#1522)
* feat(acuc): propagate team flags
* feat(flags): further functionality
|
2025-05-08 20:23:35 +02:00 |
|
Gergő Móricz
|
017a915ae8
|
Revert "fix(queue-worker, scrape): match billing logic and add billing for stealth proxies (#1521)"
This reverts commit e06c7cc234b9ee0c4bf112cc338e515f28674e11.
|
2025-05-08 18:34:13 +02:00 |
|
Gergő Móricz
|
e06c7cc234
|
fix(queue-worker, scrape): match billing logic and add billing for stealth proxies (#1521)
|
2025-05-08 10:51:38 -03:00 |
|
Gergő Móricz
|
0f32500149
|
fix(queue-jobs): never cc timeout jobs that are crawl-associated (makes no sense)
|
2025-05-08 12:54:23 +02:00 |
|
Gergő Móricz
|
7ad9a00ea8
|
fix(concurrency-limit): rework cc queue to work by time not priority (#1526)
|
2025-05-08 12:40:13 +02:00 |
|
Ademílson Tonato
|
ae12c326f0
|
refactor: maximum links limit for map endpoint from 5000 to 30000
|
2025-05-06 16:00:15 -04:00 |
|
Nicolas
|
17728379df
|
Revert "Nick: log su usage"
This reverts commit 6567ef81f6ca6dfd5a92eafb04f4eed8d37e0a6c.
|
2025-05-05 18:19:15 -03:00 |
|
Nicolas
|
6567ef81f6
|
Nick: log su usage
|
2025-05-05 18:00:34 -03:00 |
|
devin-ai-integration[bot]
|
411ecdf04b
|
Add crawl delay functionality with per-crawl concurrency limiting (FIR-249) (#1413)
* feat: Add crawl delay functionality with per-crawl concurrency limiting (FIR-249)
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* fix: Skip crawl delay in test environment to fix CI tests
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* refactor: Use crawlerOptions.delay instead of separate fields
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* refactor: Rename crawlDelay to delay in type definitions for uniformity
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* refactor: Fix crawl concurrency implementation based on PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* refactor: Simplify if/else structure in queue-jobs.ts based on PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* human fixes
* test: Add tests for crawl delay functionality
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* test: Move crawl delay tests to existing crawl.test.ts file
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* fix: Ensure sitemapped URLs are added to crawl concurrency queue and update crawl status endpoint
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* dbg
* fix: Ensure jobs with crawl delay are properly added to BullMQ
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* fix: Remove duplicate job addition to BullMQ for jobs with crawl delay
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* fixes
* warning for devin
* test: Simplify crawl delay test as requested in PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
* bump delay test timeout
* fix operation order
* bump further???
* fix: broken on self-host
* Update apps/api/src/services/queue-jobs.ts
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* fix: import
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: mogery@sideguide.dev <mogery@sideguide.dev>
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-05-02 17:20:57 +02:00 |
|
Nicolas
|
ac02ad3257
|
Revert "Nick: past_due changes"
This reverts commit 9449d7b020798b9368ed5e1bae3f5b2a32cb24c7.
|
2025-05-01 16:48:38 -03:00 |
|
Nicolas
|
9449d7b020
|
Nick: past_due changes
|
2025-05-01 16:42:20 -03:00 |
|
Nicolas
|
8b88c26a47
|
Update v1.ts (#1509)
|
2025-04-30 13:02:07 -03:00 |
|
Rafael Miller
|
eee613d1bc
|
[feat] Implement GCS storage option for scrape results across controllers an… (#1500)
* Implement GCS storage option for scrape results across controllers and update GCS document retrieval functionality
* done!
* Update gcs-jobs.ts
|
2025-04-29 15:15:44 -03:00 |
|
Gergő Móricz
|
8b82e11625
|
Decrease diff warn threshold further
|
2025-04-24 14:12:52 +02:00 |
|
Gergő Móricz
|
e4f9a92e98
|
adjust diff warn threshold
|
2025-04-23 23:54:46 +02:00 |
|
Gergő Móricz
|
0e7f2c8599
|
feat(diff): log if it takes a long time with params
|
2025-04-23 20:46:51 +02:00 |
|
Rafael Miller
|
37dabce1ed
|
[feat] added second scrapeURLWithFireEngine (#1494)
|
2025-04-23 20:36:36 +02:00 |
|
Gergő Móricz
|
9435c800c0
|
fix(api/tests/scrape): don't test scrape status endpoint on self-host env
|
2025-04-23 20:36:22 +02:00 |
|
Nicolas
|
2f6520bc38
|
Merge branch 'main' of https://github.com/mendableai/firecrawl
|
2025-04-23 14:19:26 -04:00 |
|
Nicolas
|
22f7efed35
|
Nick: rm scrape events
|
2025-04-23 14:19:25 -04:00 |
|
Nicolas
|
1c421f2d74
|
Nick: (#1492)
|
2025-04-22 21:42:37 -04:00 |
|
Nicolas
|
e532a96b0c
|
(fix/search) Search logs fix (#1491)
* Update search.ts
* Update search.ts
|
2025-04-22 17:12:10 -04:00 |
|
Gergő Móricz
|
1a02ef56e6
|
fix(extract/fire-1): thinking tokens pricing
|
2025-04-22 01:00:00 +02:00 |
|
Gergő Móricz
|
f47d8779f5
|
fix(extract/oldExtract): do if for old/new extract
|
2025-04-22 00:53:17 +02:00 |
|
Gergő Móricz
|
df305c2a97
|
fix(): null-proof/nan-proof
|
2025-04-19 12:35:55 -07:00 |
|
Gergő Móricz
|
0e027fe430
|
fix(scrape): bill llm on fail
|
2025-04-19 02:56:18 -07:00 |
|
Gergő Móricz
|
b35462456e
|
billing for smart scrape
|
2025-04-19 02:06:30 -07:00 |
|
Gergő Móricz
|
0a86d2738f
|
reenable
|
2025-04-19 01:57:58 -07:00 |
|
Gergő Móricz
|
324b4e2e1e
|
fix rounding
|
2025-04-19 01:45:36 -07:00 |
|
Gergő Móricz
|
6d2347b5f8
|
feat(llmExtract): add token tracking to all calls
|
2025-04-19 01:38:55 -07:00 |
|
Gergő Móricz
|
438ea19f16
|
feat(extract): add thinking tokens
|
2025-04-19 01:35:17 -07:00 |
|
Gergő Móricz
|
653a0207c3
|
fix(calculateCost): accuracy 2.5
|
2025-04-19 01:10:32 -07:00 |
|
Gergő Móricz
|
2b2e648d41
|
feat(smart-scrape): log failed costs
|
2025-04-19 01:07:00 -07:00 |
|
Gergő Móricz
|
3909b6641d
|
temp:
|
2025-04-19 00:19:25 -07:00 |
|
Gergő Móricz
|
5453bed58e
|
temp: put everything back
|
2025-04-19 00:13:20 -07:00 |
|
Gergő Móricz
|
f451b71308
|
fix acuc extract preview
|
2025-04-18 12:37:41 -07:00 |
|
Gergő Móricz
|
3caeaae074
|
fix(scrapeURL/llmExtract): fix schema-less JSON mode
|
2025-04-18 10:25:32 -07:00 |
|
Nicolas
|
c69acdffd8
|
Update v1.ts
|
2025-04-18 01:57:03 -07:00 |
|
Rafael Miller
|
29b36c5f9a
|
[python-SDK] improvs/async (#1337)
* improv/types-and-comments-descs
* async
* removed v0 in example
* tomkosms review
* refactor: dry request and error handling
* fixed websocket params
* added origin to requests
* Update firecrawl.py
* Update firecrawl.py
* added agent options types
* Update firecrawl.py
* generic
* Update firecrawl.py
* scrape params commentary
* Update firecrawl.py
* Update firecrawl.py
* Update firecrawl.py
* Update firecrawl.py
* async scrape
* Update firecrawl.py
* Nick: new examples
* Nick: python sdk 2.0
* async functions
* Nick:
* Nick:
---------
Co-authored-by: Ademílson F. Tonato <ademilsonft@outlook.com>
Co-authored-by: Nicolas <nicolascamara29@gmail.com>
|
2025-04-18 01:32:55 -07:00 |
|
Gergő Móricz
|
33aece8e96
|
more cost calc
|
2025-04-17 14:00:48 -07:00 |
|
Gergő Móricz
|
9bea877eb1
|
feat(extract): cost limit (#1473)
|
2025-04-17 21:44:28 +02:00 |
|
Gergő Móricz
|
7df557e59c
|
feat(cost-tracking): add model tracking and more costs
|
2025-04-17 11:33:10 -07:00 |
|
Gergő Móricz
|
f844b329f1
|
remove agent preview rate limiter
|
2025-04-17 11:04:45 -07:00 |
|
Gergő Móricz
|
06770bc63f
|
fix(scrape/json): move back to 4o mini
|
2025-04-17 10:48:13 -07:00 |
|
Gergő Móricz
|
8546bcacc0
|
new cost tracking
|
2025-04-17 09:23:53 -07:00 |
|
Gergő Móricz
|
ba4df67de7
|
force 2.5
|
2025-04-16 16:53:04 -07:00 |
|
Gergő Móricz
|
6a93293fd0
|
feat(smart-scrape): use correct models for multi-entity assembly
|
2025-04-16 16:39:48 -07:00 |
|
Gergő Móricz
|
751c30f139
|
feat(extractSmartScrape): better pagination handling
|
2025-04-16 16:23:12 -07:00 |
|
Gergő Móricz
|
509e6e658c
|
feat(llmExtract): more logging
|
2025-04-16 15:40:07 -07:00 |
|