7 Commits

Author SHA1 Message Date
Yanlong Wang
12ba1bcfad
feat: serp endpoint (#1180)
* wip

* wip

* fix

* wip

* fix: add jitter to user cache

* cd

* fix

* fix

* fix: user cache age comparison

* fix: try to partition apiroll query

* bump: deps

* wip

* cd

* feat: fallback for serp

* fix

* cd

* fix

* fix

* serp: stop hiding expense

* serp: enable fallback by default
2025-04-02 14:58:13 +08:00
Yanlong Wang
b4b99f0096
cd: eu 2025-03-14 16:26:22 +08:00
yanlong.wang
ff7612414c
chore: use env variable to switch browser headless mode 2025-03-13 16:00:55 +08:00
Yanlong Wang
23a3b807c9
restructure: no longer a firebase application (#1160)
* fix: fine allow redefining Function.prototype.toString

* wip

* wip

* wip

* wip

* wip

* wip

* wip

* fix: contentType encoding

* wip

* fix: error throwing

* wip

* fix

* wip

* fix

* fix

* fix: jsdom

* wip

* wip

* fix: links summary uniqueness

* wip

* wip

* robots-txt catch no robots.txt

* deps: remove puppeteer-extra-plugin-stealth

* fix: don't change warning type

* fix: curl

* fix: replace firebase-roundtrip-check with blackhole-detector

* fix: black hole detection

* searcher: black hole detecting

* fix: no h2c for searcher

* fix: bhd

* fix: search and crawl conflict

* fix: bhd

* fix

* fix: server script

* canvas: fixed avif issue

* logging: move some to debug

* fix

* fix: pptr declare ready only when page can be created without issues

* fix: bhd

* cd: cloud run deploy-health-check cannot complete pptr newPage

* cd: fix

* fix: curl body can be null

* fix

* fix

* fix: major fix regarding TC pdfs

* fix

* fix

* deps: fix civkit trie router issue

* fix

* boom: total restructure

* cd: fix docker ctx

* fix

* fix: switch to h2c

* cd: ensure http2
2025-03-08 00:46:52 +08:00
Han Xiao
b3fb4c5c57
feat: add image captioning (#6)
* Fix contentText assignment in CrawlerHost class

* fix: recover vscode configurations

* feat: add image captioning

* feat: add image captioning

* clean: vscode config

* chore: fix some ts warnings

* feat: auto alt text

* fix

* chore: improve prompt

* clean: unused config

* fix: failure condition

* fix: remove redundant code

* fix: catch parse error

* fix: catch parse error

---------

Co-authored-by: Yanlong Wang <yanlong.wang@naiver.org>
2024-04-15 20:51:31 -07:00
Han Xiao
8e241c7f5a
chore: rename url2text to reader 2024-04-13 11:42:15 -07:00
yanlong.wang
89d6d49f06
wip 2024-04-10 19:32:07 +08:00