Aaron Ji
|
cdace054d8
|
chore: cleanup
|
2025-03-10 18:08:49 +08:00 |
|
Aaron Ji
|
a15681cba5
|
feat: support getting update time of websites for s.jina.ai
|
2025-03-10 18:03:06 +08:00 |
|
yanlong.wang
|
5bbd75a6d6
|
fix
|
2025-03-10 17:52:19 +08:00 |
|
yanlong.wang
|
f0560c6949
|
fix: caching condition
|
2025-03-10 17:29:17 +08:00 |
|
yanlong.wang
|
0da71cad34
|
fix: robots-txt not loaded error conditions
|
2025-03-10 17:19:25 +08:00 |
|
yanlong.wang
|
4e5abd345e
|
docs: minor fix
|
2025-03-10 17:02:16 +08:00 |
|
yanlong.wang
|
ce11f44b92
|
docs: be neutral on brand
|
2025-03-10 16:58:20 +08:00 |
|
yanlong.wang
|
d71c89a79c
|
fix: block suspicious requests before sideload
|
2025-03-10 16:46:26 +08:00 |
|
yanlong.wang
|
3b3a0265df
|
feat: control concurrent request per page instead of server bucket
|
2025-03-10 16:45:56 +08:00 |
|
yanlong.wang
|
c064fcf77e
|
fix: unhandledRejection log level
|
2025-03-10 15:31:38 +08:00 |
|
yanlong.wang
|
a9855dcd3b
|
chore: prefer ctx.URL
|
2025-03-10 15:25:58 +08:00 |
|
yanlong.wang
|
cf01a2c504
|
chore: add comments to clarify
|
2025-03-10 15:17:42 +08:00 |
|
yanlong.wang
|
531c660a5d
|
fix: missing url query param
|
2025-03-10 15:15:59 +08:00 |
|
yanlong.wang
|
df127d0207
|
fix: finalizer and unhandled promise rejection
|
2025-03-10 15:05:55 +08:00 |
|
yanlong.wang
|
eba1f9c0ec
|
fix: provide our own robots.txt
|
2025-03-10 14:18:31 +08:00 |
|
yanlong.wang
|
0d6cf2b1d1
|
fix: robots-txt location
|
2025-03-10 14:07:52 +08:00 |
|
yanlong.wang
|
101cb19dde
|
fix: robots-txt cache location
|
2025-03-10 14:07:19 +08:00 |
|
yanlong.wang
|
a7a41250d4
|
fix: curl redirections
|
2025-03-10 13:46:18 +08:00 |
|
Aaron Ji
|
8ec8123ff4
|
chore: fix search result amount (#1163)
|
2025-03-10 13:38:16 +08:00 |
|
yanlong.wang
|
8a8ae10919
|
fix: curl error category
|
2025-03-10 12:24:54 +08:00 |
|
yanlong.wang
|
5f6cfdf280
|
deps: cleanup
|
2025-03-10 12:21:23 +08:00 |
|
Yanlong Wang
|
19a0bbe924
|
fix: bad snapshot in sideload should not throw directly
|
2025-03-10 09:48:22 +08:00 |
|
Yanlong Wang
|
ead906e603
|
fix: runtime NODE_COMPILE_CACHE dir
|
2025-03-10 09:32:49 +08:00 |
|
Yanlong Wang
|
6e78e38e95
|
feat: leveraging NODE_COMPILE_CACHE (#1162)
* wip: try to leverage NODE_COMPILE_CACHE
* fix
* fix
* fix
* fix
* fix: black hole detector
* bhd: also tracking curl requests
|
2025-03-10 09:23:25 +08:00 |
|
Yanlong Wang
|
d0e20cc086
|
fix: several crash cases
|
2025-03-09 12:01:52 +08:00 |
|
Yanlong Wang
|
6b9e14de62
|
feat: md options pass through to turndown
|
2025-03-09 10:31:39 +08:00 |
|
Yanlong Wang
|
2720b69e60
|
deps: bump
|
2025-03-08 23:59:02 +08:00 |
|
Yanlong Wang
|
3020d589b6
|
fix: catch jsdom errors
|
2025-03-08 23:17:53 +08:00 |
|
Yanlong Wang
|
da48d0e4a7
|
deps: bump
|
2025-03-08 22:27:56 +08:00 |
|
Yanlong Wang
|
4ca627c0c5
|
fix: guard invalid domain names
|
2025-03-08 22:21:25 +08:00 |
|
Yanlong Wang
|
4830ff5fda
|
fix: potential fix for firestore grpc connection reset
|
2025-03-08 21:37:55 +08:00 |
|
Yanlong Wang
|
fd328cbcc2
|
fix
|
2025-03-08 20:52:35 +08:00 |
|
Yanlong Wang
|
8456fcecbd
|
fix: somehow side-loading chromewebstore would 100% crash the browser
|
2025-03-08 20:44:02 +08:00 |
|
Yanlong Wang
|
440ff4d729
|
fix: expect failure while loading pdf
|
2025-03-08 20:25:18 +08:00 |
|
Yanlong Wang
|
4bc6394692
|
fix: potential invalid pdf issue
|
2025-03-08 20:19:50 +08:00 |
|
Yanlong Wang
|
4ab28fe971
|
deps: bump
|
2025-03-08 20:04:24 +08:00 |
|
Yanlong Wang
|
66db31788e
|
cleanup: use local project code as much as possible
|
2025-03-08 19:32:30 +08:00 |
|
Yanlong Wang
|
512f225692
|
fix: sideload redirections
|
2025-03-08 18:59:36 +08:00 |
|
Yanlong Wang
|
c19ca2147c
|
fix: bug in pptr injections
|
2025-03-08 18:43:39 +08:00 |
|
Yanlong Wang
|
e551695d17
|
fix: fail early on special cookie redirects
|
2025-03-08 18:43:10 +08:00 |
|
Yanlong Wang
|
26f6202f79
|
fix: curl cookie mimicking
|
2025-03-08 18:14:43 +08:00 |
|
Yanlong Wang
|
89e5dbbe9c
|
fix: curl cookie behavior
|
2025-03-08 18:01:38 +08:00 |
|
Yanlong Wang
|
3b1978fd1d
|
fix: implement DNT in alt-gen and pdf-extract
|
2025-03-08 17:52:49 +08:00 |
|
Yanlong Wang
|
1a2754c674
|
fix: sideLoad header detection
|
2025-03-08 17:46:09 +08:00 |
|
Yanlong Wang
|
63a2e15f4d
|
fix: curl redirection location
|
2025-03-08 17:20:38 +08:00 |
|
Yanlong Wang
|
fb43578fdd
|
fix: curl implicit redirect
|
2025-03-08 17:18:53 +08:00 |
|
Yanlong Wang
|
8597daa96b
|
fix: side load context bridging
|
2025-03-08 16:49:14 +08:00 |
|
Yanlong Wang
|
e92ff33ad0
|
fix
|
2025-03-08 15:49:22 +08:00 |
|
Yanlong Wang
|
b674d26f76
|
fix: clean HTML timer
|
2025-03-08 13:33:10 +08:00 |
|
Yanlong Wang
|
4e299bf8e2
|
fix: remove tailwind classes instead of the opposite
|
2025-03-08 13:30:06 +08:00 |
|