Scraper
What it does
Extracts content from websites through four core tools: HTML scraping, markdown conversion, text extraction, and link discovery. Batch processing, caching, proxy support, and CSS selector filtering support web content extraction and research automation.
A web scraping MCP server built by Carrotly AI. Its four tools return a page as raw HTML, as markdown, as plain text, or as a list of discovered links. It is built with FastMCP and has an extensible provider architecture; the current provider uses requests with exponential backoff retry logic. Each tool accepts either a single URL or a batch of URLs with configurable concurrency. Responses are cached on disk with TTL-based expiry, CSS selectors can restrict extraction to targeted parts of a page, and an optional ScrapeOps proxy integration adds JavaScript rendering and anti-bot bypass. The implementation also includes a web dashboard for monitoring and testing, error handling with graceful degradation, and Docker deployment support. Typical uses are research automation, content analysis workflows, and web scraping pipelines that need retries and caching.
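The retry and caching behaviour described above can be illustrated with a minimal Python sketch. This is not the server's actual code: the names `fetch_with_backoff` and `DiskCache`, the JSON-file cache layout, and the default delays are all assumptions made for illustration. The fetch function is passed in as a callable, so the sketch runs without a network connection or the requests library.

```python
import hashlib
import json
import time
from pathlib import Path
from typing import Callable, Optional


def fetch_with_backoff(
    fetch: Callable[[str], str],
    url: str,
    retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(url); on failure wait base_delay * 2**attempt and try again.

    Hypothetical helper: the real server's retry parameters are not known.
    """
    for attempt in range(retries + 1):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("unreachable")


class DiskCache:
    """Hypothetical disk cache: one JSON file per key, expired after a TTL."""

    def __init__(
        self,
        directory: Path,
        ttl_seconds: float,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.ttl_seconds = ttl_seconds
        self.clock = clock

    def _path(self, key: str) -> Path:
        # Hash the key so any URL maps to a safe file name.
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def get(self, key: str) -> Optional[str]:
        path = self._path(key)
        if not path.exists():
            return None
        entry = json.loads(path.read_text(encoding="utf-8"))
        if self.clock() - entry["stored_at"] > self.ttl_seconds:
            path.unlink()  # entry is older than the TTL: drop it
            return None
        return entry["value"]

    def set(self, key: str, value: str) -> None:
        entry = {"stored_at": self.clock(), "value": value}
        self._path(key).write_text(json.dumps(entry), encoding="utf-8")
```

A caller would typically check `cache.get(url)` first and fall back to `fetch_with_backoff(...)` followed by `cache.set(url, body)` on a miss. Injecting `sleep` and `clock` is a design choice for testability only.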
Quality
Deterministic score 0.56 from registry signals: indexed on pulsemcp · has source repo · 6 GitHub stars · registry-generated description present