Jakob Stadlhuber
895e80caa4
Add liveness and readiness probes to Kubernetes configs
Introduced liveness and readiness probes for the Playwright service, API, and worker components. Kubernetes can now periodically check their endpoints, restart containers that fail the liveness check, and withhold traffic from pods that are not yet ready, which makes the deployed services more robust.
2024-07-24 19:00:23 +02:00
Jakob Stadlhuber
be9e7f9edf
Update Kubernetes configs for playwright-service, api, and worker
Added new ConfigMap for playwright-service and adjusted existing references.
Applied imagePullPolicy: Always to ensure all images are updated promptly.
Updated README to include --no-cache for Docker build instructions.
2024-07-24 18:54:16 +02:00
Gergo Moricz
60c74357df
feat(ScrapeEvents): log queue events
2024-07-24 18:44:14 +02:00
Jakob Stadlhuber
497aa5d25e
Update Kubernetes configs for playwright-service, api, and worker
Added new ConfigMap for playwright-service and adjusted existing references.
Applied imagePullPolicy: Always to ensure all images are updated promptly.
Updated README to include --no-cache for Docker build instructions.
2024-07-24 17:55:45 +02:00
rafaelsideguide
4eca6bd301
fix/check-for-auth-on-scrape-log
2024-07-24 12:54:14 -03:00
Nicolas
4ead89f983
Merge pull request #453 from mendableai/nsc/notion-fix
Notion Website Fixes
2024-07-24 11:40:19 -04:00
Nicolas
3a1b8a9797
Update website_params.ts
2024-07-24 11:04:47 -04:00
Nicolas
8b48ec8d30
Update website_params.ts
2024-07-24 11:02:20 -04:00
Gergo Moricz
4d35ad073c
feat(monitoring/scrape): include url, worker, response_size
2024-07-24 16:43:39 +02:00
Gergo Moricz
64bcedeefc
fix(monitoring): bad success check on scrape
2024-07-24 16:21:59 +02:00
Gergo Moricz
d57dbbd0c6
fix: add jobId for scrape
2024-07-24 15:18:12 +02:00
Gergo Moricz
71072fef3b
fix(scrape-events): bad logic
2024-07-24 14:46:41 +02:00
Gergo Moricz
7cd9bf92e3
feat: scrape event logging to DB
2024-07-24 14:31:25 +02:00
Rafael Miller
5e728c1a4d
Update apps/api/src/scraper/WebScraper/crawler.ts
no need for regex
Co-authored-by: Gergő Móricz <mo.geryy@gmail.com>
2024-07-24 08:33:00 -03:00
Eric Ciarla
1b7a00624d
Delete old comp
2024-07-23 21:51:08 -04:00
Eric Ciarla
565bc09439
Basic react app
2024-07-23 21:48:11 -04:00
rafaelsideguide
6208ecdbc0
added logger
2024-07-23 17:30:46 -03:00
Eric Ciarla
a0d89169ed
init
2024-07-23 15:48:12 -04:00
Nicolas
f0b07b509b
Update index.ts
2024-07-23 15:15:56 -04:00
rafaelsideguide
a684bd3c5d
added regex for links in sitemap
2024-07-23 09:07:23 -03:00
Nicolas
252bc09ee2
Merge pull request #447 from mendableai/nsc/speed-improvements
/scrape should now be 600ms-900ms faster
2024-07-22 19:18:24 -04:00
Nicolas
ac692ef09c
Update CONTRIBUTING.md
2024-07-22 19:17:53 -04:00
Nicolas
30e706b43f
Update scrape.ts
2024-07-22 19:15:24 -04:00
Nicolas
8916fec66c
Update index.ts
2024-07-22 19:14:53 -04:00
Nicolas
575ddc9e6e
Update scrape.ts
2024-07-22 19:12:51 -04:00
Nicolas
e31a5007d5
Nick: speed improvements
2024-07-22 18:30:58 -04:00
Nicolas
1bc36e1a56
Update fly-direct.yml
2024-07-22 14:12:55 -04:00
Nicolas
b229fbebd8
Update scrape_log.ts
2024-07-19 12:53:26 -04:00
rafaelsideguide
5c02dbe20c
fix(isFile): added .tiff extension
2024-07-18 17:07:21 -03:00
Gergo Moricz
f0e95ce399
fix(WebCrawler): filter out file URLs when taking URLs from sitemap
2024-07-18 21:49:37 +02:00
Gergo Moricz
95c6c63b85
fix(fly): raise heap limit to 4G per process
2024-07-18 20:56:54 +02:00
Nicolas
5f14f4f788
Update blocklist.ts
2024-07-18 14:20:19 -04:00
Nicolas
6161b83890
Update scrape_log.ts
2024-07-18 14:17:08 -04:00
Nicolas
c402c85346
Merge branch 'main' of https://github.com/mendableai/firecrawl
2024-07-18 14:16:51 -04:00
Nicolas
2dd7398aad
Update scrape_log.ts
2024-07-18 14:16:46 -04:00
Gergo Moricz
791e6b2047
fix action
2024-07-18 19:59:33 +02:00
Nicolas
f10f3f886b
Merge pull request #410 from mendableai/feat/fire-engine-chrome-cdp
Support chrome-cdp and restructure sitemap fire-engine support.
2024-07-18 13:52:08 -04:00
Nicolas
9a1a227797
Update crawl-cancel.ts
2024-07-18 13:49:51 -04:00
Nicolas
11768571ed
Update crawl-cancel.ts
2024-07-18 13:43:03 -04:00
Nicolas
ce804d3c20
Update crawl-cancel.ts
2024-07-18 13:40:24 -04:00
Nicolas
d338b05446
Merge pull request #436 from mendableai/mog/fix-infinite-regex
fix(WebScraper): infinite regex leading to fly.io instance hangs
2024-07-18 13:32:44 -04:00
Nicolas
d2de01d342
Nick: fixes
2024-07-18 13:19:44 -04:00
Gergo Moricz
0b8047c7a0
fix(WebScraper): infinite regex leading to fly.io instance hangs
2024-07-18 19:13:43 +02:00
Nicolas
f11137352c
Merge branch 'main' into feat/fire-engine-chrome-cdp
2024-07-18 12:48:42 -04:00
Nicolas
6d1d46a987
Merge pull request #433 from mendableai/mog/js-sdk-tests-fix
fix(js-sdk): transform tests with ts-jest and configure node
2024-07-18 12:40:59 -04:00
Nicolas
01b5e8fc73
Merge pull request #429 from mendableai/mog/fix-job-stuck-2
Fix queue stuck bug via lock settings changes
2024-07-18 12:39:21 -04:00
Nicolas
b134ba92bc
Merge pull request #427 from mendableai/docs/update-docs
[Docs] Updating docs
2024-07-18 11:49:08 -04:00
rafaelsideguide
f13ef02a08
Update openapi.json
2024-07-18 10:34:03 -03:00
Gergo Moricz
a23b125471
fix(js-sdk): transform tests with ts-jest and configure node
2024-07-18 14:20:51 +02:00
Nicolas
12ec519f9b
Update docker-compose.yaml
2024-07-17 22:44:23 -04:00