mirror of
https://git.mirrors.martin98.com/https://github.com/mendableai/firecrawl
synced 2025-07-31 23:42:02 +08:00
fix(self-host): update docs and dockerignore
parent e4504b3236
commit 10d9b65f96
53
SELF_HOST.md
@@ -41,32 +41,45 @@ To start, we won't set up authentication or any optional subservices (pdf parsin
 `.env:`
 ```
 # ===== Required ENVS ======
-NUM_WORKERS_PER_QUEUE=8
 PORT=3002
 HOST=0.0.0.0
-REDIS_URL=redis://redis:6379
-REDIS_RATE_LIMIT_URL=redis://redis:6379
 
-## To turn on DB authentication, you need to set up Supabase.
+# To turn on DB authentication, you need to set up Supabase.
 USE_DB_AUTHENTICATION=false
 
 # ===== Optional ENVS ======
 
 # Supabase Setup (used to support DB authentication, advanced logging, etc.)
-SUPABASE_ANON_TOKEN=
-SUPABASE_URL=
-SUPABASE_SERVICE_TOKEN=
+# SUPABASE_ANON_TOKEN=
+# SUPABASE_URL=
+# SUPABASE_SERVICE_TOKEN=
 
-# Other Optionals
-TEST_API_KEY= # use if you've set up authentication and want to test with a real API key
-SCRAPING_BEE_API_KEY= # use if you'd like to use as a fallback scraper
-OPENAI_API_KEY= # add for LLM-dependent features (e.g., image alt generation)
-BULL_AUTH_KEY= @
-PLAYWRIGHT_MICROSERVICE_URL= # set if you'd like to run a playwright fallback
-LLAMAPARSE_API_KEY= #Set if you have a llamaparse key you'd like to use to parse pdfs
-SLACK_WEBHOOK_URL= # set if you'd like to send slack server health status messages
-POSTHOG_API_KEY= # set if you'd like to send posthog events like job logs
-POSTHOG_HOST= # set if you'd like to send posthog events like job logs
+# Use if you've set up authentication and want to test with a real API key
+# TEST_API_KEY=
+
+# You can add this to enable ScrapingBee as a fallback scraping engine.
+# SCRAPING_BEE_API_KEY=
+
+# Needed for JSON format on scrape and /extract endpoint
+# OPENAI_API_KEY=
+
+# This key lets you access the queue admin panel. Change this if your deployment is publicly accessible.
+BULL_AUTH_KEY=CHANGEME
+
+# This is now autoconfigured by the docker-compose.yaml. You shouldn't need to set it.
+# PLAYWRIGHT_MICROSERVICE_URL=http://playwright-service:3000/scrape
+# REDIS_URL=redis://redis:6379
+# REDIS_RATE_LIMIT_URL=redis://redis:6379
+
+# Set if you have a llamaparse key you'd like to use to parse pdfs
+# LLAMAPARSE_API_KEY=
+
+# Set if you'd like to send server health status messages to Slack
+# SLACK_WEBHOOK_URL=
+
+# Set if you'd like to send posthog events like job logs
+# POSTHOG_API_KEY=
+# POSTHOG_HOST=
 ```
 
 3. Build and run the Docker containers:
@@ -78,9 +91,9 @@ POSTHOG_HOST= # set if you'd like to send posthog events like job logs
 
 This will run a local instance of Firecrawl which can be accessed at `http://localhost:3002`.
 
-You should be able to see the Bull Queue Manager UI on `http://localhost:3002/admin/@/queues`.
+You should be able to see the Bull Queue Manager UI on `http://localhost:3002/admin/CHANGEME/queues`.
 
-5. *(Optional)* Test the API
+4. *(Optional)* Test the API
 
 If you’d like to test the crawl endpoint, you can run this:
 
@@ -88,7 +101,7 @@ If you’d like to test the crawl endpoint, you can run this:
 curl -X POST http://localhost:3002/v1/crawl \
 -H 'Content-Type: application/json' \
 -d '{
-"url": "https://mendable.ai"
+"url": "https://firecrawl.dev"
 }'
 ```
 
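A note on the `BULL_AUTH_KEY` change in SELF_HOST.md: the queue admin path embeds the key, so an instance left on the `CHANGEME` default has a guessable admin URL. The sketch below is illustrative only. It assumes a `.env` file in the format shown in the diff, and the helper names `read_env_value`, `queue_admin_url`, and `is_default_key` are hypothetical, not part of the Firecrawl repository.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of the Firecrawl repository.

# Print the value of a key from an env file, skipping commented-out lines.
# $1 = key name, $2 = path to the env file
read_env_value() {
  # Only lines that start with "KEY=" match; the last definition wins.
  sed -n "s/^$1=//p" "$2" | tail -n 1
}

# Build the queue admin URL described in SELF_HOST.md.
# $1 = BULL_AUTH_KEY value, $2 = base URL (defaults to the documented local address)
queue_admin_url() {
  printf '%s/admin/%s/queues\n' "${2:-http://localhost:3002}" "$1"
}

# Succeed (exit status 0) if the key is still the placeholder from the docs.
# $1 = BULL_AUTH_KEY value
is_default_key() {
  [ "$1" = "CHANGEME" ]
}
```

With these defined, something like `key=$(read_env_value BULL_AUTH_KEY .env); is_default_key "$key" && echo "warning: BULL_AUTH_KEY is still CHANGEME"` could run before exposing a deployment publicly, and `queue_admin_url "$key"` prints the URL to open.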
3
apps/playwright-service-ts/.dockerignore
Normal file
@@ -0,0 +1,3 @@
+/node_modules/
+/dist/
+.env