AnyCrawl
What it does
Integrates with the AnyCrawl API to provide web scraping and crawling capabilities with configurable depth limits, multiple scraping engines, and structured data extraction in various formats including markdown and JSON.
This MCP server wraps the AnyCrawl API and exposes tools for single-page scraping, multi-page crawling with configurable depth and limits, and web search with optional content extraction. It is built in TypeScript on FastMCP and runs in two deployment modes: STDIO for local use and HTTP/SSE for cloud services. It supports three scraping engines (Cheerio, Playwright, Puppeteer) and several output formats, including markdown and JSON with AI-powered structured extraction. Crawls can be restricted with path filters and managed as asynchronous jobs.

The implementation includes a Docker deployment with an Nginx proxy for API key-based routing, tests that run against both mocked and real API calls, and error handling with detailed logging. Typical uses are web content extraction for AI assistants, research workflows that analyze multi-page sites, and applications that build knowledge bases from web sources.
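To make the crawl controls above concrete, here is a minimal sketch of how a depth limit, a page limit, and path filters can interact during a breadth-first crawl. It is not the server's code and does not call the AnyCrawl API: the option names (`maxDepth`, `limit`, `includePaths`, `excludePaths`), the `planCrawl` function, and the in-memory link graph are all hypothetical, chosen only for illustration.

```typescript
// Hypothetical option names for illustration; not taken from the AnyCrawl API.
interface CrawlOptions {
  maxDepth: number;       // link hops to follow from the start URL
  limit: number;          // maximum number of pages to visit
  includePaths: string[]; // allowed path prefixes (empty means allow all)
  excludePaths: string[]; // path prefixes that are always skipped
}

// Stand-in for real HTTP fetching: maps a URL to the links found on that page.
type LinkGraph = Record<string, string[]>;

// Strip the scheme and host so filters apply to the path only.
function pathOf(url: string): string {
  return url.replace(/^[a-z]+:\/\/[^/]+/i, "") || "/";
}

function pathAllowed(url: string, opts: CrawlOptions): boolean {
  const path = pathOf(url);
  if (opts.excludePaths.some((p) => path.startsWith(p))) return false;
  return (
    opts.includePaths.length === 0 ||
    opts.includePaths.some((p) => path.startsWith(p))
  );
}

// Breadth-first traversal that stops at maxDepth hops or limit pages.
// The start URL is always visited; filters apply to discovered links.
function planCrawl(start: string, graph: LinkGraph, opts: CrawlOptions): string[] {
  const visited: string[] = [];
  const seen = new Set<string>([start]);
  const queue: Array<{ url: string; depth: number }> = [{ url: start, depth: 0 }];

  for (let i = 0; i < queue.length && visited.length < opts.limit; i++) {
    const { url, depth } = queue[i];
    visited.push(url);
    if (depth >= opts.maxDepth) continue; // do not expand links past the depth limit
    for (const next of graph[url] ?? []) {
      if (seen.has(next) || !pathAllowed(next, opts)) continue;
      seen.add(next);
      queue.push({ url: next, depth: depth + 1 });
    }
  }
  return visited;
}

const site: LinkGraph = {
  "https://example.com/": ["https://example.com/docs/", "https://example.com/blog/"],
  "https://example.com/docs/": [
    "https://example.com/docs/api",
    "https://example.com/docs/private/keys",
  ],
  "https://example.com/docs/api": ["https://example.com/docs/api/v2"],
};

const pages = planCrawl("https://example.com/", site, {
  maxDepth: 2,
  limit: 10,
  includePaths: ["/docs"],
  excludePaths: ["/docs/private"],
});
console.log(pages);
```

In this sketch the blog page is skipped by the include filter, the private page by the exclude filter, and `/docs/api/v2` by the depth limit, so only the start page, `/docs/`, and `/docs/api` are visited.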
Quality
Deterministic score 0.61 from registry signals: indexed on pulsemcp · has source repo · 6 GitHub stars · registry-generated description present