{"id":"9aedabc1-bb93-4aa6-b880-3858ecad00d7","shortId":"YtZVXp","kind":"skill","title":"Scrapy Distributed Crawler Framework","tagline":"Orchestrates large-scale web crawling using Scrapy with scrapy-redis for distributed job queuing. Integrates Splash for JavaScript rendering, stores results in MongoDB via scrapy-mongodb pipeline, and respects robots.txt with AutoThrottle.","description":"# Scrapy Distributed Crawler Framework\n\nOrchestrates large-scale web crawling using Scrapy with scrapy-redis for distributed job queuing. Integrates Splash for JavaScript rendering, stores results in MongoDB via scrapy-mongodb pipeline, and respects robots.txt with AutoThrottle.\n\n## Installation\n\nUse the upstream install or setup path that matches your environment:\n- pip install scrapy\n\nRequirements and caveats from upstream:\n- :alt: Supported Python Versions\n- It is cross-platform, and requires Python 3.10+. It is maintained by Zyte_\n\nBasic usage or getting-started notes:\n- .. code:: bash\n- And follow the documentation_ to learn how to use it.\n- .. _documentation: https://docs.scrapy.org/en/latest/\n\n- Source: https://github.com/scrapy/scrapy\n- Extracted from upstream docs: https://raw.githubusercontent.com/scrapy/scrapy/HEAD/README.rst\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/scrapy-distributed-crawler-framework/)","tags":["scrapy","distributed","crawler","framework","skills","agentskillexchange","agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex"],"capabilities":["skill","source-agentskillexchange","skill-scrapy-distributed-crawler-framework","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/scrapy-distributed-crawler-framework","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (962 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:12:20.144Z","embedding":null,"createdAt":"2026-05-18T13:19:11.892Z","updatedAt":"2026-05-18T19:12:20.144Z","lastSeenAt":"2026-05-18T19:12:20.144Z","tsv":"'/en/latest/':139 '/scrapy/scrapy':143 '/scrapy/scrapy/head/readme.rst':150 '/skills/scrapy-distributed-crawler-framework/)':157 '3.10':111 'agent':152 'agentskillexchange.com':156 'agentskillexchange.com/skills/scrapy-distributed-crawler-framework/)':155 'alt':99 'autothrottl':39,78 'bash':125 'basic':117 'caveat':96 'code':124 'crawl':10,49 'crawler':3,42 'cross':106 'cross-platform':105 'distribut':2,18,41,57 'doc':147 'docs.scrapy.org':138 'docs.scrapy.org/en/latest/':137 'document':129,136 'environ':90 'exchang':154 'extract':144 'follow':127 'framework':4,43 'get':121 'getting-start':120 'github.com':142 'github.com/scrapy/scrapy':141 'instal':79,83,92 'integr':21,60 'javascript':24,63 'job':19,58 'larg':7,46 'large-scal':6,45 'learn':131 'maintain':114 'match':88 'mongodb':29,33,68,72 'note':123 'orchestr':5,44 'path':86 'pip':91 'pipelin':34,73 'platform':107 'python':101,110 'queu':20,59 'raw.githubusercontent.com':149 'raw.githubusercontent.com/scrapy/scrapy/head/readme.rst':148 'redi':16,55 'render':25,64 'requir':94,109 'respect':36,75 'result':27,66 'robots.txt':37,76 'scale':8,47 'scrapi':1,12,15,32,40,51,54,71,93 'scrapy-mongodb':31,70 'scrapy-redi':14,53 'setup':85 'skill':153 'skill-scrapy-distributed-crawler-framework' 'sourc':140,151 'source-agentskillexchange' 'splash':22,61 'start':122 'store':26,65 'support':100 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'upstream':82,98,146 'usag':118 'use':11,50,80,134 'version':102 'via':30,69 'web':9,48 'zyte':116","prices":[{"id":"1b4f9c56-992b-4ae3-8886-b43197933be4","listingId":"9aedabc1-bb93-4aa6-b880-3858ecad00d7","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:19:11.892Z"}],"sources":[{"listingId":"9aedabc1-bb93-4aa6-b880-3858ecad00d7","source":"github","sourceId":"agentskillexchange/skills/scrapy-distributed-crawler-framework","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/scrapy-distributed-crawler-framework","isPrimary":false,"firstSeenAt":"2026-05-18T13:19:11.892Z","lastSeenAt":"2026-05-18T19:12:20.144Z"}],"details":{"listingId":"9aedabc1-bb93-4aa6-b880-3858ecad00d7","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"scrapy-distributed-crawler-framework","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"21f481e21b5ba50f54411384b9609743db1d4a92","skill_md_path":"skills/scrapy-distributed-crawler-framework/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/scrapy-distributed-crawler-framework"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Scrapy Distributed Crawler Framework","description":"Orchestrates large-scale web crawling using Scrapy with scrapy-redis for distributed job queuing. Integrates Splash for JavaScript rendering, stores results in MongoDB via scrapy-mongodb pipeline, and respects robots.txt with AutoThrottle."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/scrapy-distributed-crawler-framework"},"updatedAt":"2026-05-18T19:12:20.144Z"}}