fix(self-host): update docs and dockerignore

This commit is contained in:
Gergő Móricz 2025-02-20 07:57:06 +01:00
parent e4504b3236
commit 10d9b65f96
2 changed files with 36 additions and 20 deletions

View File

@ -41,32 +41,45 @@ To start, we won't set up authentication or any optional subservices (pdf parsin
`.env:` `.env:`
``` ```
# ===== Required ENVS ====== # ===== Required ENVS ======
NUM_WORKERS_PER_QUEUE=8
PORT=3002 PORT=3002
HOST=0.0.0.0 HOST=0.0.0.0
REDIS_URL=redis://redis:6379
REDIS_RATE_LIMIT_URL=redis://redis:6379
## To turn on DB authentication, you need to set up Supabase. # To turn on DB authentication, you need to set up Supabase.
USE_DB_AUTHENTICATION=false USE_DB_AUTHENTICATION=false
# ===== Optional ENVS ====== # ===== Optional ENVS ======
# Supabase Setup (used to support DB authentication, advanced logging, etc.) # Supabase Setup (used to support DB authentication, advanced logging, etc.)
SUPABASE_ANON_TOKEN= # SUPABASE_ANON_TOKEN=
SUPABASE_URL= # SUPABASE_URL=
SUPABASE_SERVICE_TOKEN= # SUPABASE_SERVICE_TOKEN=
# Other Optionals # Use if you've set up authentication and want to test with a real API key
TEST_API_KEY= # use if you've set up authentication and want to test with a real API key # TEST_API_KEY=
SCRAPING_BEE_API_KEY= # use if you'd like to use as a fallback scraper
OPENAI_API_KEY= # add for LLM-dependent features (e.g., image alt generation) # You can add this to enable ScrapingBee as a fallback scraping engine.
BULL_AUTH_KEY= @ # SCRAPING_BEE_API_KEY=
PLAYWRIGHT_MICROSERVICE_URL= # set if you'd like to run a playwright fallback
LLAMAPARSE_API_KEY= #Set if you have a llamaparse key you'd like to use to parse pdfs # Needed for JSON format on scrape and /extract endpoint
SLACK_WEBHOOK_URL= # set if you'd like to send slack server health status messages # OPENAI_API_KEY=
POSTHOG_API_KEY= # set if you'd like to send posthog events like job logs
POSTHOG_HOST= # set if you'd like to send posthog events like job logs # This key lets you access the queue admin panel. Change this if your deployment is publicly accessible.
BULL_AUTH_KEY=CHANGEME
# This is now autoconfigured by the docker-compose.yaml. You shouldn't need to set it.
# PLAYWRIGHT_MICROSERVICE_URL=http://playwright-service:3000/scrape
# REDIS_URL=redis://redis:6379
# REDIS_RATE_LIMIT_URL=redis://redis:6379
# Set if you have a llamaparse key you'd like to use to parse pdfs
# LLAMAPARSE_API_KEY=
# Set if you'd like to send server health status messages to Slack
# SLACK_WEBHOOK_URL=
# Set if you'd like to send posthog events like job logs
# POSTHOG_API_KEY=
# POSTHOG_HOST=
``` ```
3. Build and run the Docker containers: 3. Build and run the Docker containers:
@ -78,9 +91,9 @@ POSTHOG_HOST= # set if you'd like to send posthog events like job logs
This will run a local instance of Firecrawl which can be accessed at `http://localhost:3002`. This will run a local instance of Firecrawl which can be accessed at `http://localhost:3002`.
You should be able to see the Bull Queue Manager UI on `http://localhost:3002/admin/@/queues`. You should be able to see the Bull Queue Manager UI on `http://localhost:3002/admin/CHANGEME/queues`.
5. *(Optional)* Test the API 4. *(Optional)* Test the API
If youd like to test the crawl endpoint, you can run this: If youd like to test the crawl endpoint, you can run this:
@ -88,7 +101,7 @@ If youd like to test the crawl endpoint, you can run this:
curl -X POST http://localhost:3002/v1/crawl \ curl -X POST http://localhost:3002/v1/crawl \
-H 'Content-Type: application/json' \ -H 'Content-Type: application/json' \
-d '{ -d '{
"url": "https://mendable.ai" "url": "https://firecrawl.dev"
}' }'
``` ```

View File

@ -0,0 +1,3 @@
/node_modules/
/dist/
.env