fix(self-host): update docs and dockerignore

This commit is contained in:
Gergő Móricz 2025-02-20 07:57:06 +01:00
parent e4504b3236
commit 10d9b65f96
2 changed files with 36 additions and 20 deletions

View File

@ -41,32 +41,45 @@ To start, we won't set up authentication or any optional subservices (pdf parsin
`.env:`
```
# ===== Required ENVS ======
NUM_WORKERS_PER_QUEUE=8
PORT=3002
HOST=0.0.0.0
REDIS_URL=redis://redis:6379
REDIS_RATE_LIMIT_URL=redis://redis:6379
## To turn on DB authentication, you need to set up Supabase.
# To turn on DB authentication, you need to set up Supabase.
USE_DB_AUTHENTICATION=false
# ===== Optional ENVS ======
# Supabase Setup (used to support DB authentication, advanced logging, etc.)
SUPABASE_ANON_TOKEN=
SUPABASE_URL=
SUPABASE_SERVICE_TOKEN=
# SUPABASE_ANON_TOKEN=
# SUPABASE_URL=
# SUPABASE_SERVICE_TOKEN=
# Other Optionals
TEST_API_KEY= # use if you've set up authentication and want to test with a real API key
SCRAPING_BEE_API_KEY= # use if you'd like to use as a fallback scraper
OPENAI_API_KEY= # add for LLM-dependent features (e.g., image alt generation)
BULL_AUTH_KEY= @
PLAYWRIGHT_MICROSERVICE_URL= # set if you'd like to run a playwright fallback
LLAMAPARSE_API_KEY= #Set if you have a llamaparse key you'd like to use to parse pdfs
SLACK_WEBHOOK_URL= # set if you'd like to send slack server health status messages
POSTHOG_API_KEY= # set if you'd like to send posthog events like job logs
POSTHOG_HOST= # set if you'd like to send posthog events like job logs
# Use if you've set up authentication and want to test with a real API key
# TEST_API_KEY=
# You can add this to enable ScrapingBee as a fallback scraping engine.
# SCRAPING_BEE_API_KEY=
# Needed for JSON format on scrape and /extract endpoint
# OPENAI_API_KEY=
# This key lets you access the queue admin panel. Change this if your deployment is publicly accessible.
BULL_AUTH_KEY=CHANGEME
# This is now autoconfigured by the docker-compose.yaml. You shouldn't need to set it.
# PLAYWRIGHT_MICROSERVICE_URL=http://playwright-service:3000/scrape
# REDIS_URL=redis://redis:6379
# REDIS_RATE_LIMIT_URL=redis://redis:6379
# Set if you have a llamaparse key you'd like to use to parse pdfs
# LLAMAPARSE_API_KEY=
# Set if you'd like to send server health status messages to Slack
# SLACK_WEBHOOK_URL=
# Set if you'd like to send posthog events like job logs
# POSTHOG_API_KEY=
# POSTHOG_HOST=
```
3. Build and run the Docker containers:
@ -78,9 +91,9 @@ POSTHOG_HOST= # set if you'd like to send posthog events like job logs
This will run a local instance of Firecrawl which can be accessed at `http://localhost:3002`.
You should be able to see the Bull Queue Manager UI on `http://localhost:3002/admin/@/queues`.
You should be able to see the Bull Queue Manager UI on `http://localhost:3002/admin/CHANGEME/queues`.
5. *(Optional)* Test the API
4. *(Optional)* Test the API
If youd like to test the crawl endpoint, you can run this:
@ -88,7 +101,7 @@ If youd like to test the crawl endpoint, you can run this:
curl -X POST http://localhost:3002/v1/crawl \
-H 'Content-Type: application/json' \
-d '{
"url": "https://mendable.ai"
"url": "https://firecrawl.dev"
}'
```

View File

@ -0,0 +1,3 @@
/node_modules/
/dist/
.env