403 Commits

Author SHA1 Message Date
yanlong.wang
16cabcaf22
feat: opt out gfm/table 2024-11-21 18:26:21 +08:00
yanlong.wang
2b29679801
fix: img turndown rules 2024-11-20 17:29:28 +08:00
Yanlong Wang
4400bef95b
fix: tricks applied by puppeteer-extra-plugin-stealth 2024-11-18 16:43:40 +08:00
Yanlong Wang
1f4620deef
fix: img with srcset only 2024-11-18 16:37:42 +08:00
Yanlong Wang
6fa8ce309e
fix: poorly transformed detection 2024-11-16 13:59:31 +08:00
Yanlong Wang
706de20e5c
fix : deps 2024-11-15 11:00:39 +08:00
Yanlong Wang
59dcc2db94
feat: image retention config 2024-11-14 22:36:53 +08:00
Yanlong Wang
ccb4b8a49d
fix: potential invalid html 2024-11-13 00:39:07 +08:00
Yanlong Wang
be993c2cb1
fix: there may be invalid root doc 2024-11-13 00:32:48 +08:00
Yanlong Wang
68c4df2df3
fix: deps and bugs 2024-11-13 00:27:39 +08:00
yanlong.wang
7ae2545a30
chore: tweak deployment 2024-11-12 17:33:23 +08:00
yanlong.wang
e2a187d126
fix: crawling IP url 2024-11-11 15:30:48 +08:00
yanlong.wang
67d4a9f45a
fix: expect cookie encoding issue 2024-11-11 14:58:00 +08:00
yanlong.wang
53bc91c31a
feat: compound response 2024-11-11 12:40:40 +08:00
Yanlong Wang
22647a0617
feat: script injecting and tools 2024-11-08 14:19:54 +08:00
Yanlong Wang
bd629a836b
search now requires authentication 2024-11-01 14:15:03 +08:00
Yanlong Wang
5d865651b1
chore: bump deps 2024-11-01 09:20:23 +08:00
yanlong.wang
b10931b8ed
fix: turndown rules 2024-10-31 17:22:51 +08:00
yanlong.wang
340fb517d8
chore: add internal slack report 2024-10-30 17:42:06 +08:00
yanlong.wang
a488bb8921
fix: headers in overridden request 2024-10-29 15:20:58 +08:00
yanlong.wang
3303763345
fix: salvaging with google cache does not work anymore 2024-10-29 15:09:50 +08:00
yanlong.wang
ebc09003d1
fix: walk around locale setting bug 2024-10-29 15:09:20 +08:00
yanlong.wang
9242bb393a
fix: detect poorly transformed contents 2024-10-28 14:52:13 +08:00
yanlong.wang
a8793114bb
fix 2024-10-23 18:50:39 +08:00
yanlong.wang
e38c5514e1
fix 2024-10-23 18:12:43 +08:00
yanlong.wang
fb97410e99
fix: bump deps 2024-10-23 18:03:59 +08:00
yanlong.wang
d538726bdd
revert: domain cannot be un-doomed due to google function wrapper
acdfd93097/src/function_wrappers.ts (L109-L116)
2024-10-23 17:27:23 +08:00
yanlong.wang
fedffe3dd2
fix: force process quit on firebase issue 2024-10-23 16:08:02 +08:00
yanlong.wang
102a1686b0
feat: expand shadow dom 2024-10-23 14:58:46 +08:00
Yanlong Wang
00a1278385
chore: tweak deployment 2024-10-21 21:34:08 +08:00
yanlong.wang
d6ad9e75d6
chore: suspend data crunching 2024-10-21 12:07:14 +08:00
Yanlong Wang
cf32ab4fa7
bump: deps 2024-10-18 12:59:44 +08:00
Yanlong Wang
74eac2fc18
fix: remove link url escaping 2024-10-18 12:59:36 +08:00
yanlong.wang
a54816d12d
fix 2024-10-14 17:33:24 +08:00
yanlong.wang
6a97f0bfa6
fix: uri encoding 2024-10-14 17:27:29 +08:00
Zhaofeng Miao
f82504540b
fix(adaptive-crawler): fix cache problem 2024-10-10 16:37:12 +08:00
Zhaofeng Miao
db432645c3
feat: change deployment machine type to improve cpu utilization 2024-10-10 11:21:42 +08:00
Zhaofeng Miao
b9124a2ec1
chore 2024-10-10 11:20:31 +08:00
Zhaofeng Miao
b3ca557f6e
chore: security 2024-10-10 11:18:38 +08:00
Zhaofeng Miao
86d69eebd1
chore: fix security dependencies 2024-10-10 11:17:18 +08:00
Zhaofeng Miao
14322140ba
docs: readme changelog 2024-10-10 10:34:25 +08:00
yanlong.wang
e9258af742
fix: pdf mode and google web cache 2024-10-09 17:47:53 +08:00
yanlong.wang
f6bbddcb48
fix: pageshot missing in cache 2024-10-09 15:07:30 +08:00
Zhaofeng Miao
a44d9a2d2a
feat(adaptive-crawler): optimize relevance detection 2024-10-08 15:19:03 +08:00
Zhaofeng Miao
af282eec43
fix(adaptive-crawler): useSitemap should be rewritten in certain condition 2024-10-08 14:18:13 +08:00
yanlong.wang
339af19192
fix: request to unknown domain 2024-10-08 12:02:27 +08:00
Zhaofeng Miao
5a4b35e4b9
fix(adaptive-crawler): if no sitemap, use recursive instead 2024-10-08 11:50:50 +08:00
Yanlong Wang
ee29be58f1
fix: gfm strikethrough 2024-10-01 18:57:12 +08:00
Yanlong Wang
f0c3a9b70e
fix 2024-10-01 12:55:06 +08:00
Yanlong Wang
a66791d85f
fix 2024-09-27 13:30:29 +08:00