Crawlee Web Crawling and Scraping SDK
Crawlee is Apify's open source web crawling and scraping library for Node.js. It combines request queueing, browser automation, proxy support, and storage primitives so agents can build reliable Playwright, Puppeteer, Cheerio, or HTTP crawlers from one toolkit.
What it does
Crawlee is Apify's open source web crawling and scraping library for Node.js. It combines request queueing, browser automation, proxy support, and storage primitives so agents can build reliable Playwright, Puppeteer, Cheerio, or HTTP crawlers from one toolkit.
Prerequisites
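Node.js 16 or higher, with npm. The registry also lists Bun, Python, Docker, and Java, but the upstream notes below state only the Node.js requirement.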
Installation
Use the upstream install or setup path that matches your environment:
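- npx crawlee create my-crawler (scaffolds a new project with the Crawlee CLI)
- npm start (runs the scaffolded project)
- npm install crawlee playwright (adds Crawlee and Playwright to an existing project)
- npm install crawlee@next (installs the build published under the next tag)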
Requirements and caveats from upstream:
- Do you prefer Python to JavaScript? Check out Crawlee for Python.
- Crawlee requires Node.js 16 or higher.
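The Node.js requirement above can be checked before a crawler starts. The following is a minimal sketch, not part of Crawlee itself: `meetsMinimumNode` is a hypothetical helper name, and the default of 16 is taken from the requirement stated above.

```typescript
// Hypothetical helper (not a Crawlee API): returns true when a Node.js
// version string meets the minimum major version.
function meetsMinimumNode(version: string, minimumMajor = 16): boolean {
    // Accept both "18.17.0" (process.versions.node) and "v18.17.0" (process.version).
    const [majorPart = ''] = version.replace(/^v/, '').split('.');
    const major = Number.parseInt(majorPart, 10);
    // Unparseable input yields NaN, which is treated as "not supported".
    return Number.isInteger(major) && major >= minimumMajor;
}

const modern = meetsMinimumNode('18.17.0'); // true
const legacy = meetsMinimumNode('v14.21.3'); // false
```

In a real project, passing `process.versions.node` to the helper at startup and exiting with a clear message when it returns false tends to be easier to diagnose than a syntax error raised from inside a dependency.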
Basic usage or getting-started notes:
- We recommend visiting the Introduction tutorial in Crawlee documentation for more information.
- With Crawlee CLI: The fastest way to try Crawlee out is to use the Crawlee CLI and choose the Getting started example. The CLI will install all the necessary dependencies and add boilerplate code for you to play with.
- Source: https://github.com/apify/crawlee
- Extracted from upstream docs: https://raw.githubusercontent.com/apify/crawlee/HEAD/README.md
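To give a rough idea of what a getting-started crawler can look like, here is a minimal sketch using `CheerioCrawler`, which needs only the `crawlee` package and no browser. The function name `crawlTitles`, the `maxPages` parameter, and the choice to collect page titles are illustrative assumptions. The boilerplate generated by the CLI may differ.

```typescript
import { CheerioCrawler, Dataset } from 'crawlee';

// Hypothetical wrapper (not part of Crawlee): crawls from the given start URLs
// and stores each page's URL and title in the default dataset.
async function crawlTitles(startUrls: string[], maxPages = 20): Promise<void> {
    const crawler = new CheerioCrawler({
        // Safety cap so an exploratory run does not crawl a whole site.
        maxRequestsPerCrawl: maxPages,
        async requestHandler({ request, $, enqueueLinks, log }) {
            // `$` is the Cheerio handle for the downloaded HTML.
            const title = $('title').text().trim();
            log.info(`${request.loadedUrl}: ${title}`);
            // Store one record per visited page.
            await Dataset.pushData({ url: request.loadedUrl, title });
            // Add links found on the page to the request queue.
            await enqueueLinks();
        },
    });
    await crawler.run(startUrls);
}
```

Calling `await crawlTitles(['https://crawlee.dev'])` from an ES module or an async function starts the crawl. Swapping in `PlaywrightCrawler` gives the handler a browser `page` instead of `$`, at the cost of installing Playwright.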
Quality
Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,503 chars)