Crawlee Web Crawling and Scraping Library by Apify
Crawlee is Apify’s open source crawling and scraping framework for Node.js. It unifies HTTP scraping and browser automation, adds queues, storage, retries, and proxy support, and lets developers switch between Playwright, Puppeteer, Cheerio, and JSDOM without rebuilding the whole pipeline.
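To make the "queue, retries, swappable fetcher" idea concrete, here is a minimal conceptual sketch in plain JavaScript. It does not use or reproduce Crawlee's API. The names runCrawl, fetchPage, handlePage, and demo are hypothetical, and the in-memory fakeSite stands in for real HTTP or browser fetching. The point it illustrates is that the pipeline only depends on the fetchPage function, so the fetching strategy can change without touching the queue or retry logic.

```javascript
// Conceptual sketch only: this is NOT Crawlee's API or implementation.
// runCrawl, fetchPage, handlePage, and demo are hypothetical names.

// Drain a request queue, retrying failed requests and skipping URLs already seen.
async function runCrawl({ startUrls, fetchPage, handlePage, maxRetries = 2 }) {
    const queue = startUrls.map((url) => ({ url, retries: 0 }));
    const seen = new Set(startUrls); // deduplicates URLs across the whole crawl
    const results = [];              // stands in for persistent storage
    const failed = [];               // requests that exhausted their retries

    while (queue.length > 0) {
        const request = queue.shift();
        try {
            // The fetcher is injected, so HTTP or browser fetching is swappable.
            const page = await fetchPage(request.url);
            const { data, links = [] } = await handlePage(request.url, page);
            results.push(data);
            for (const link of links) {
                if (!seen.has(link)) {
                    seen.add(link);
                    queue.push({ url: link, retries: 0 });
                }
            }
        } catch (error) {
            if (request.retries < maxRetries) {
                // Re-enqueue at the back of the queue with an incremented counter.
                queue.push({ url: request.url, retries: request.retries + 1 });
            } else {
                failed.push({ url: request.url, message: error.message });
            }
        }
    }
    return { results, failed };
}

// Demo against an in-memory "site"; page /b fails once, then succeeds on retry.
function demo() {
    const fakeSite = {
        'https://example.com/': { title: 'Home', links: ['https://example.com/a', 'https://example.com/b'] },
        'https://example.com/a': { title: 'A', links: ['https://example.com/'] },
        'https://example.com/b': { title: 'B', links: [] },
    };
    let failOnce = true;
    const fetchPage = async (url) => {
        if (url === 'https://example.com/b' && failOnce) {
            failOnce = false;
            throw new Error('temporary failure');
        }
        if (!(url in fakeSite)) throw new Error(`not found: ${url}`);
        return fakeSite[url];
    };
    const handlePage = async (url, page) => ({
        data: { url, title: page.title },
        links: page.links,
    });
    return runCrawl({ startUrls: ['https://example.com/'], fetchPage, handlePage });
}
```

Calling demo() resolves with three results (Home, A, B) and an empty failed list: the link back to the home page is skipped as already seen, and the single failure on /b is absorbed by the retry. A real crawler such as Crawlee adds much more on top of this (concurrency, persistence, proxy rotation), which this sketch deliberately leaves out.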
Installation
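Use the upstream install path that matches your environment:
- npx crawlee create my-crawler (scaffolds a new project with the Crawlee CLI)
- npm start (runs the scaffolded project from inside its directory)
- npm install crawlee playwright (adds Crawlee and Playwright to an existing project)
- npm install crawlee@next (installs the release published under the next tag instead of the stable one)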
Requirements and caveats from upstream:
- Prefer Python to JavaScript? Check out Crawlee for Python.
- Crawlee requires Node.js 16 or higher.
Basic usage or getting-started notes:
- We recommend visiting the Introduction tutorial in the Crawlee documentation for more information.
- With the Crawlee CLI: the fastest way to try Crawlee out is to use the Crawlee CLI and choose the Getting started example. The CLI will install all the necessary dependencies and add boilerplate code for you to play with.
- Source: https://github.com/apify/crawlee
- Extracted from upstream docs: https://raw.githubusercontent.com/apify/crawlee/HEAD/README.md
Quality
Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,461 chars)