| Author | Commit | Message | Date |
|---|---|---|---|
| Gergő Móricz | 0d3d18be65 | feat(selfhost): deploy a playwright image (#1625) | 2025-06-03 21:17:09 +02:00 |
| Ademílson Tonato | e89ecc4e4a | feat: enhance metadata extraction by including 'itemprop' attribute in HTML (#1624) | 2025-06-03 21:17:09 +02:00 |
| Gergő Móricz | 60525220a2 | async saving to index | 2025-06-03 21:16:13 +02:00 |
| Gergő Móricz | d1b5e2ef47 | revert | 2025-06-03 17:02:27 +02:00 |
| Gergő Móricz | 1b3f037a26 | wth | 2025-06-03 16:55:39 +02:00 |
| Gergő Móricz | ede7aec1f9 | ok fixed | 2025-06-03 16:47:32 +02:00 |
| Gergő Móricz | 4e5feca3dd | wow i'm an idiot | 2025-06-03 16:41:28 +02:00 |
| Gergő Móricz | 71271cc4b8 | try again | 2025-06-03 16:38:40 +02:00 |
| Gergő Móricz | c75fad5e79 | improve fns | 2025-06-03 16:33:14 +02:00 |
| Gergő Móricz | 6ba57306c3 | asd | 2025-06-03 16:26:43 +02:00 |
| Gergő Móricz | 37d1de09f3 | workflow test run | 2025-06-03 16:25:02 +02:00 |
| Gergő Móricz | 2fe35a4e3d | remove extraneous log | 2025-06-03 16:24:02 +02:00 |
| Gergő Móricz | 39dd721781 | clean up on map | 2025-06-03 16:22:07 +02:00 |
| Gergő Móricz | 7426e54e6c | further fixes | 2025-06-03 16:12:31 +02:00 |
| Gergő Móricz | d7fef33224 | Merge branch 'main' into mog/index | 2025-06-03 16:09:57 +02:00 |
| Gergő Móricz | da9a9b0d19 | cleanup | 2025-06-03 16:07:59 +02:00 |
| Nicolas | e108ff3525 | Update search.ts | 2025-06-02 23:46:55 -03:00 |
| Nicolas | 9347de6a41 | Update scrape.ts | 2025-06-02 23:15:59 -03:00 |
| Nicolas | 86a9d3525b | Update queue-jobs.ts | 2025-06-02 23:09:09 -03:00 |
| Nicolas | cbc47305cc | Update search.ts | 2025-06-02 23:09:02 -03:00 |
| Nicolas | ce425d966f | Merge branch 'nsc/bypass-billing-internal' | 2025-06-02 22:37:56 -03:00 |
| Nicolas | 8c661f5329 | Update scrape.ts | 2025-06-02 22:37:49 -03:00 |
| Nicolas | dc8cc99b1d | Nick: bypass billing (#1622) | 2025-06-02 21:57:28 -03:00 |
| Nicolas | 8967b31465 | Nick: bypass billing | 2025-06-02 21:51:46 -03:00 |
| Nicolas | bf919ceb82 | Nick: __searchPreviewToken | 2025-06-02 21:16:34 -03:00 |
| Nicolas | ef789ce8d7 | Nick: __experimental | 2025-06-02 19:58:56 -03:00 |
| Gergő Móricz | 72be73473f | feat(api/scrape): credits_billed column + handle billing for /scrape calls on worker side with stricter timeout enforcement (FIR-2162) (#1607)<br>* feat(api/scrape): stricten timeout and handle billing and logging on queue-worker<br>* fix: abortsignal pre-check<br>* fix: proper level<br>* add comment to clarify is_scrape<br>* reenable billing tests<br>* Revert "reenable billing tests" (This reverts commit 98236fdfa03dde8cecdd6b763fcf86810e468a28.)<br>* oof<br>* fix searxng logging<br>Co-authored-by: Nicolas <nicolascamara29@gmail.com> | 2025-06-02 17:56:27 -03:00 |
| Gergő Móricz | 4167ec53eb | fix(scrapeURL): only allow disabling the adblock on playwright (FIR-2200) (#1616)<br>* fix(scrapeURL): only allow disabling the adblock on playwright<br>* feat(api/tests/scrape): re-enable ad blocking tests | 2025-06-02 22:48:16 +02:00 |
| Gergő Móricz | 7a8be13220 | remove indexes that are no longer used | 2025-06-02 22:09:55 +02:00 |
| Gergő Móricz | 98ceda9bd5 | feat(search): ignore concurrency limit for search (FIR-2187) (#1617)<br>* feat(search): ignore concurrency limit for search (temp)<br>* feat(search): only for low tier users for good DX | 2025-06-02 17:07:44 -03:00 |
| rafaelmmiller | 014a99ef91 | map benchmarks | 2025-06-02 13:38:43 -03:00 |
| Gergő Móricz | 1396451d31 | bump rust version pt.2 | 2025-06-02 18:10:14 +02:00 |
| Gergő Móricz | 07fb651a91 | bump rust version | 2025-06-02 18:09:12 +02:00 |
| Supasin Liulak | 6a76ccfacb | webhook param for crawl (#1609) | 2025-06-02 18:08:32 +02:00 |
| Gergő Móricz | 8b864345e3 | feat(api/test): index envs | 2025-06-02 18:07:38 +02:00 |
| Gergő Móricz | b9dc3e738e | feat(index): FIRECRAWL_INDEX_WRITE_ONLY | 2025-06-02 18:00:47 +02:00 |
| Gergő Móricz | b3eecdc81b | chore(js-sdk): bump | 2025-06-02 17:57:47 +02:00 |
| Gergő Móricz | 297d783585 | feat(js-sdk): dontStoreInCache | 2025-06-02 17:52:46 +02:00 |
| Gergő Móricz | b2aeb99dd4 | disable cacheable lookup for self hosting tests | 2025-06-02 17:45:24 +02:00 |
| Gergő Móricz | dceca07837 | fix(api/tests/scrape): fix index test to work with batching | 2025-06-02 17:41:45 +02:00 |
| Gergő Móricz | 18a7462fea | feat(index): batch insert | 2025-06-02 17:07:25 +02:00 |
| Gergő Móricz | 369a8f6050 | feat(map): ignoreIndex | 2025-06-01 11:51:36 +02:00 |
| rafaelmmiller | 22c7685239 | feat/added benchmark for scrapes | 2025-05-30 18:38:20 -03:00 |
| Gergő Móricz | 99d3db743d | feat(scrapeURL/index): behaviour on non-200 index entries | 2025-05-30 15:14:16 +02:00 |
| Gergő Móricz | 8c250426b3 | feat(queue-worker/kickoff): use index links to kickoff crawl | 2025-05-30 14:16:49 +02:00 |
| Gergő Móricz | 96c753f9a9 | feat: use url split columns | 2025-05-30 13:56:28 +02:00 |
| Nicolas | 9297afd1ff | Nick: search | 2025-05-29 17:00:13 -03:00 |
| Gergő Móricz | a8e0482718 | feat(search): bill for PDFs properly | 2025-05-29 20:59:15 +02:00 |
| Gergő Móricz | a2f41fb650 | feat(api/server): wait 60s for GCE load balancer drain timeout<br>To minimize 502s. | 2025-05-29 20:08:52 +02:00 |
| Gergő Móricz | 3ea221b093 | fix(api/queue): tighten expiries on indexQueue jobs | 2025-05-29 16:36:55 +02:00 |