Merge branch 'mendableai:main' into main

Stijn Smits 2024-10-07 12:18:26 +02:00 committed by GitHub
commit 34b5d4b619

@ -53,24 +53,11 @@ _Pst. hey, you, join our stargazers :)_
We provide an easy-to-use API with our hosted version. You can find the playground and documentation [here](https://firecrawl.dev/playground). You can also self-host the backend if you'd like.
Check out the following resources to get started:
- [x] [API](https://docs.firecrawl.dev/api-reference/introduction)
- [x] [Python SDK](https://docs.firecrawl.dev/sdks/python)
- [x] [Node SDK](https://docs.firecrawl.dev/sdks/node)
- [x] [Go SDK](https://docs.firecrawl.dev/sdks/go)
- [x] [Rust SDK](https://docs.firecrawl.dev/sdks/rust)
- [x] [Langchain Integration 🦜🔗](https://python.langchain.com/docs/integrations/document_loaders/firecrawl/)
- [x] [Langchain JS Integration 🦜🔗](https://js.langchain.com/docs/integrations/document_loaders/web_loaders/firecrawl)
- [x] [Llama Index Integration 🦙](https://docs.llamaindex.ai/en/latest/examples/data_connectors/WebPageDemo/#using-firecrawl-reader)
- [x] [Dify Integration](https://dify.ai/blog/dify-ai-blog-integrated-with-firecrawl)
- [x] [Langflow Integration](https://docs.langflow.org/)
- [x] [Crew.ai Integration](https://docs.crewai.com/)
- [x] [Flowise AI Integration](https://docs.flowiseai.com/integrations/langchain/document-loaders/firecrawl)
- [x] [Composio Integration](https://composio.dev/tools/firecrawl/all)
- [x] [PraisonAI Integration](https://docs.praison.ai/firecrawl/)
- [x] [Zapier Integration](https://zapier.com/apps/firecrawl/integrations)
- [x] [Cargo Integration](https://docs.getcargo.io/integration/firecrawl)
- [x] [Pipedream Integration](https://pipedream.com/apps/firecrawl/)
- [x] [Pabbly Connect Integration](https://www.pabbly.com/connect/integrations/firecrawl/)
- [x] **API**: [Documentation](https://docs.firecrawl.dev/api-reference/introduction)
- [x] **SDKs**: [Python](https://docs.firecrawl.dev/sdks/python), [Node](https://docs.firecrawl.dev/sdks/node), [Go](https://docs.firecrawl.dev/sdks/go), [Rust](https://docs.firecrawl.dev/sdks/rust)
- [x] **LLM Frameworks**: [Langchain (python)](https://python.langchain.com/docs/integrations/document_loaders/firecrawl/), [Langchain (js)](https://js.langchain.com/docs/integrations/document_loaders/web_loaders/firecrawl), [Llama Index](https://docs.llamaindex.ai/en/latest/examples/data_connectors/WebPageDemo/#using-firecrawl-reader), [Crew.ai](https://docs.crewai.com/), [Composio](https://composio.dev/tools/firecrawl/all), [PraisonAI](https://docs.praison.ai/firecrawl/)
- [x] **Low-code Frameworks**: [Dify](https://dify.ai/blog/dify-ai-blog-integrated-with-firecrawl), [Langflow](https://docs.langflow.org/), [Flowise AI](https://docs.flowiseai.com/integrations/langchain/document-loaders/firecrawl), [Cargo](https://docs.getcargo.io/integration/firecrawl), [Pipedream](https://pipedream.com/apps/firecrawl/)
- [x] **Others**: [Zapier](https://zapier.com/apps/firecrawl/integrations), [Pabbly Connect](https://www.pabbly.com/connect/integrations/firecrawl/)
- [ ] Want an SDK or Integration? Let us know by opening an issue.
To run locally, refer to the guide [here](https://github.com/mendableai/firecrawl/blob/main/CONTRIBUTING.md).
@ -79,6 +66,22 @@ To run locally, refer to guide [here](https://github.com/mendableai/firecrawl/bl
To use the API, you need to sign up on [Firecrawl](https://firecrawl.dev) and get an API key.
### Features
- [**Scrape**](#scraping): scrapes a URL and returns its content in LLM-ready formats (markdown, structured data via [LLM Extract](#llm-extraction-beta), screenshot, HTML)
- [**Crawl**](#crawling): scrapes all the URLs of a web page and returns their content in an LLM-ready format
- [**Map**](#map-alpha): takes a website as input and returns all of its URLs, extremely fast
### Powerful Capabilities
- **LLM-ready formats**: markdown, structured data, screenshot, HTML, links, metadata
- **The hard stuff**: proxies, anti-bot mechanisms, dynamic content (js-rendered), output parsing, orchestration
- **Customizability**: exclude tags, crawl behind auth walls with custom headers, set a max crawl depth, and more
- **Media parsing**: PDFs, DOCX files, images
- **Reliability first**: designed to get the data you need, no matter how hard it is
- **Actions**: click, scroll, input, wait and more before extracting data
You can find all of Firecrawl's capabilities and how to use them in our [documentation](https://docs.firecrawl.dev).
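
As a rough sketch of how the output formats and customization options listed above might be combined in a single scrape request, the Python snippet below only builds a JSON-serializable body and sends nothing. The helper name `build_scrape_payload` is hypothetical, and the field names (`formats`, `excludeTags`, `headers`) are assumptions that should be checked against the documentation.

```python
import json


def build_scrape_payload(url, formats=("markdown",), exclude_tags=None, headers=None):
    """Build a request body for a scrape call (hypothetical helper).

    Field names are assumptions, not confirmed by this README excerpt.
    """
    payload = {"url": url, "formats": list(formats)}
    if exclude_tags:
        # Tags to drop from the page before conversion (assumed field name).
        payload["excludeTags"] = list(exclude_tags)
    if headers:
        # Custom headers, e.g. for pages behind an auth wall (assumed field name).
        payload["headers"] = dict(headers)
    return payload


if __name__ == "__main__":
    body = build_scrape_payload(
        "https://example.com",
        formats=("markdown", "html"),
        exclude_tags=["nav", "footer"],
    )
    print(json.dumps(body, indent=2))
```

Optional fields are left out of the body unless they are set, so the service's own defaults would apply to anything not specified.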
### Crawling
Used to crawl a URL and all accessible subpages. This submits a crawl job and returns a job ID to check the status of the crawl.
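
The submit-then-poll flow described above could look roughly like the following in Python, using only the standard library. This is a sketch under stated assumptions: the base URL `https://api.firecrawl.dev/v1`, the `/crawl` and `/crawl/{id}` paths, the `limit` field, and the helper names are not taken from this excerpt and should be verified against the API reference. The request is built but not sent.

```python
import json
import urllib.request

# Assumed base URL of the hosted API; adjust for a self-hosted backend.
API_BASE = "https://api.firecrawl.dev/v1"


def build_crawl_request(api_key: str, url: str, limit: int = 10) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits a crawl job."""
    body = json.dumps({"url": url, "limit": limit}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/crawl",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def status_url(job_id: str) -> str:
    """Return the URL to poll for the status of a submitted crawl job."""
    return f"{API_BASE}/crawl/{job_id}"


if __name__ == "__main__":
    req = build_crawl_request("fc-YOUR_API_KEY", "https://example.com")
    print(req.get_method(), req.full_url)
    print(status_url("JOB_ID_FROM_RESPONSE"))
```

Passing the request to `urllib.request.urlopen` would submit the job; the job ID would then be read from the JSON response and passed to `status_url` to check on the crawl.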
@ -548,4 +551,4 @@ Please note:
- The SDKs and some UI components are licensed under the MIT License. Refer to the LICENSE files in these specific directories for details.
- When using or contributing to this project, ensure you comply with the appropriate license terms for the specific component you are working with.
For more details on the licensing of specific components, please refer to the LICENSE files in the respective directories or contact the project maintainers.