3153 Commits

Author SHA1 Message Date
Gergő Móricz
5457b71454 Update apps/api/src/services/queue-jobs.ts
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-04-04 22:09:16 +02:00
Gergő Móricz
6287db8492 fix: broken on self-host 2025-04-04 22:04:19 +02:00
Gergő Móricz
cb3008a8af bump further??? 2025-04-04 21:51:53 +02:00
Gergő Móricz
03f7fb19a1 fix operation order 2025-04-04 21:49:54 +02:00
Gergő Móricz
78e54a0457 bump delay test timeout 2025-04-04 21:40:35 +02:00
Gergő Móricz
65c85609d2 Merge branch 'main' into devin/1743787935-add-crawl-delay-FIR-249 2025-04-04 21:34:52 +02:00
Devin AI
debe3a733a test: Simplify crawl delay test as requested in PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 19:33:42 +00:00
Gergő Móricz
8fa202d146 warning for devin 2025-04-04 21:26:55 +02:00
Gergő Móricz
f808bcf97c fixes 2025-04-04 21:26:05 +02:00
Devin AI
9000ca550b fix: Remove duplicate job addition to BullMQ for jobs with crawl delay
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 19:13:07 +00:00
Devin AI
458e252c6a fix: Ensure jobs with crawl delay are properly added to BullMQ
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 19:03:46 +00:00
Gergő Móricz
de5fc547be dbg 2025-04-04 20:54:27 +02:00
Devin AI
c5f43bf8b3 fix: Ensure sitemapped URLs are added to crawl concurrency queue and update crawl status endpoint
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 18:52:29 +00:00
Nicolas
41e094032f Update email_notification.ts 2025-04-04 14:36:41 -04:00
Nicolas
e1e39f8836 Nick: send notifications for crawl+batch scrape 2025-04-04 14:34:48 -04:00
Devin AI
7f0e722163 test: Move crawl delay tests to existing crawl.test.ts file
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 18:11:40 +00:00
Devin AI
5bc822f66c test: Add tests for crawl delay functionality
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 18:10:05 +00:00
mogery@sideguide.dev
26a59d123e human fixes 2025-04-04 18:05:29 +00:00
Devin AI
d702ad10af refactor: Simplify if/else structure in queue-jobs.ts based on PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 18:01:50 +00:00
Devin AI
d23497b955 refactor: Fix crawl concurrency implementation based on PR feedback
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 17:55:33 +00:00
Devin AI
472e75b8dd refactor: Rename crawlDelay to delay in type definitions for uniformity
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 17:47:47 +00:00
Devin AI
216b1047e3 refactor: Use crawlerOptions.delay instead of separate fields
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 17:43:14 +00:00
Devin AI
19c427e840 fix: Skip crawl delay in test environment to fix CI tests
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 17:40:45 +00:00
Devin AI
6b8f75ae91 feat: Add crawl delay functionality with per-crawl concurrency limiting (FIR-249)
Co-Authored-By: mogery@sideguide.dev <mogery@sideguide.dev>
2025-04-04 17:35:49 +00:00
Gergő Móricz
7128f83a7a fix(js-sdk): isows import issues (FIR-1586) (FIR-1536) (#1411)
* attempt

* improvements

* kill isows -- there's been native websocket support in node since 21

* clean up the diff
2025-04-04 17:54:37 +02:00
Ademílson Tonato
b57d5f2c4d Merge pull request #1409 from mendableai/feat/crawl-scrape-limit-notification
feat(queue-jobs): add function to determine job type and update notification logic for concurrency limits
v1.7.0
2025-04-03 18:29:00 +01:00
Ademílson F. Tonato
426151c9c9 feat(queue-jobs): add function to determine job type and update notification logic for concurrency limits 2025-04-03 17:02:51 +01:00
Gergő Móricz
8c1579df51 bump cc 2025-04-03 11:56:24 +02:00
Gergő Móricz
2e2c3d52ce feat: add swoogo classes to force include main tags 2025-04-03 09:57:19 +02:00
Gergő Móricz
24f5199359 compare format (FIR-1560) (#1405) 2025-04-02 19:52:43 +02:00
Gergő Móricz
b3b63486f1 cc manual 2025-04-02 19:27:13 +02:00
Ademílson Tonato
3300c6c598 Merge pull request #1404 from mendableai/fix/add-notification-type
feat(notification): add notification message for concurrency limit reached
2025-04-02 17:39:59 +01:00
Ademílson F. Tonato
b900f34b5a feat(notification): add notification message for concurrency limit reached 2025-04-02 17:36:11 +01:00
rafaelmmiller
7216799ca0 revert mog changes 2025-04-02 10:45:11 -03:00
Ademílson Tonato
73a297d6c8 Merge pull request #1398 from mendableai/refactor/email-concurrency-limit-reached
feat(queue-jobs): update notification logic for concurrency limits and add parameter (jsdocs) to batchScrapeUrls
2025-04-02 11:18:18 +01:00
Ademílson F. Tonato
7468464552 feat(queue-jobs): implement conditional notification for concurrency limits based on team subscription status 2025-04-01 19:50:26 +01:00
Nicolas
ee211132c8 Nick: 2025-04-01 21:06:27 +04:00
Nicolas
c4255f4fdd Update auth.ts 2025-04-01 21:00:40 +04:00
Nicolas
b79b90fdd1 Update auth.ts 2025-04-01 20:53:43 +04:00
Ademílson F. Tonato
58e587d99e feat(queue-jobs): update notification logic for concurrency limits and add parameter (jsdocs) to batchScrapeUrls 2025-03-31 13:27:36 +01:00
Gergő Móricz
e0a3c54967 new acuc 2025-03-30 17:32:24 +02:00
Gergő Móricz
b9dde3fc3d temp: move more to main instance 2025-03-29 18:18:55 +01:00
Gergő Móricz
4f0510e71d temp: switch over crawl fetches to main instance 2025-03-29 18:05:50 +01:00
Eric Ciarla
830d15f2f6 Merge pull request #1384 from aparupganguly/feature/v3-extractor
Add examples/ Deepseek V3 Company Researcher
2025-03-28 08:55:29 -04:00
Eric Ciarla
10ce20e01a Merge pull request #1383 from aparupganguly/feature/v3-crawler
Add examples/deepseek-v3-crawler
2025-03-28 08:55:06 -04:00
Gergő Móricz
f0e0d3e2e3 fix(api): crawl origin tracking (FIR-1499) 2025-03-28 12:47:37 +01:00
Gergő Móricz
46048bc94d feat(scrapeURL): return js returns from f-e (FIR-1535) (#1385)
* feat(scrapeURL): return js returns from f-e

* feat(js-sdk): handle new results
2025-03-28 12:42:25 +01:00
Aparup Ganguly
28928f0006 Add examples/DeepSeekv3 company researcher 2025-03-28 16:10:22 +05:30
Aparup Ganguly
da76524771 Add examples/deepseek-v3-crawler 2025-03-28 16:05:16 +05:30
Eric Ciarla
56d23cc6ac Merge pull request #1380 from aparupganguly/feature/gemini-2.5-crawler
Add examples/gemini-2.5-pro crawler
2025-03-27 09:45:40 -04:00