seo-firecrawl
Ad-hoc web scraping, site mapping, and full-site crawling via Firecrawl MCP. Returns raw HTML, parsed metadata (og:*, twitter:*, JSON-LD, canonical, robots), JS-rendered DOM, and screenshots that WebFetch cannot. Distinct from the SE Ranking skills (which give keyword/traffic/SER
What it does
Firecrawl Orchestrator
A direct interface to Firecrawl MCP for tasks that fall outside the data-driven SE Ranking skills. Use when:
- You need raw HTML, `<head>` metadata, JSON-LD, or post-JS DOM that WebFetch's markdown conversion strips.
- You need a list of all URLs on a domain without pulling each one.
- You need to crawl a site and audit each page's metadata.
- You need to search within a known domain.
- A higher-level skill (`seo-page`, `seo-schema`, `seo-content-audit`, etc.) called you as a sub-step.
Prerequisites
- Required: the `firecrawl-mcp` MCP server. If `mcp__firecrawl-mcp__firecrawl_scrape` is unavailable, abort with the install command (`bash extensions/firecrawl/install.sh` from this plugin repo) plus the firecrawl.dev signup URL (free tier 500 credits/month). Don't attempt fallbacks; this skill exists for the cases WebFetch can't cover.
- User provides: a target URL or domain, plus optionally a mode (`scrape` / `map` / `crawl` / `search`). If mode unspecified, infer from input shape (single URL → `scrape`, single domain → `map`).
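The mode-inference rule above can be sketched as a small function. This is a minimal illustration, not part of the skill itself: `infer_mode` is a hypothetical name, and it assumes that a target with a non-root path or a query string counts as a "single URL" while a bare host (or its root) counts as a "single domain".

```python
from typing import Optional
from urllib.parse import urlparse

MODES = ("scrape", "map", "crawl", "search")


def infer_mode(target: str, mode: Optional[str] = None) -> str:
    """Return the explicit mode if given, else infer it from the input shape."""
    if mode is not None:
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode!r}")
        return mode
    # Assumption: inputs without a scheme are treated as https.
    parsed = urlparse(target if "://" in target else f"https://{target}")
    # Assumption: a non-root path or a query string marks a single page.
    is_single_page = parsed.path not in ("", "/") or bool(parsed.query)
    return "scrape" if is_single_page else "map"
```

An explicit mode always wins, so `crawl` and `search` are never chosen by inference alone.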
Process
- Preflight. Confirm `firecrawl-mcp` is connected. If not, surface the install command and stop.
- Mode selection. Resolve user intent into one of:
  - `scrape`: single URL, full data (default if user supplies one URL).
  - `map`: single domain, list of URLs only (cheap reconnaissance).
  - `crawl`: single domain, fetch each discovered page (expensive; require explicit confirm).
  - `search`: query within a domain.
- Cost estimation + confirmation.
  - `scrape` (1 credit), `map` (~0.5 credit per discovered URL), `crawl` (1 credit per page crawled; estimate using a `map` first if scope unclear), `search` (1 credit per result returned).
  - For `crawl` and for `map` of >50 expected URLs, surface the estimate and require explicit go-ahead before calling.
  - Always read remaining credits implicitly via Firecrawl's response metadata (`creditsUsed` / `creditsRemaining` in `metadata`).
- Execute. Call the matching `mcp__firecrawl-mcp__firecrawl_*` tool.
  - `scrape`: pass `formats: ["markdown", "html"]` by default (markdown for prose, html for `<head>` + JSON-LD). Add `"screenshot"` to `formats` only if the deliverable visibly uses one. SPAs: pass `waitFor: 2000` (or a CSS selector) so the JS-rendered DOM is captured. Default `onlyMainContent: true` to drop nav/footer noise; override only on explicit request.
  - `map`: default `limit: 500` (hard cap). Pass `excludePaths: ["/admin/*", "/api/*", "/wp-admin/*", "/feed/*"]` as a sane default.
  - `crawl`: default `limit: 50`, hard cap `limit: 200`. Always pass `excludePaths` to prune. Poll `firecrawl_check_crawl_status` if the job returns asynchronously.
  - `search`: default `limit: 20`.
- Parse + structure output. Don't dump the raw API response. Per-mode:
  - `scrape` → `RAW.md` (markdown body), `META.md` (og / twitter / canonical / robots / headers + parsed JSON-LD `@type` list with hashes), `links.csv`, optional `screenshot.png`.
  - `map` → `URLS.md` with a pattern-grouped list showing count and share per pattern (e.g., `/blog/*`: 128 (37%), `/products/*`: 84 (24%)), plus `urls.csv`.
  - `crawl` → folder per page under `pages/{slugified-url}/` with `RAW.md` + `META.md`, plus a top-level `INDEX.md` summarising every page (URL, status, key signals).
  - `search` → `MATCHES.md` with hit excerpts + URLs ranked by relevance.
- Synthesise `FIRECRAWL.md` at the root: target, mode, credits used, key findings (5 bullets max), open loops, recommended next skill.
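The cost-estimation and confirmation step can be expressed as plain arithmetic. The sketch below uses the per-unit credit figures quoted in this document (not independently verified against Firecrawl's pricing); the function names and the `units` parameter are hypothetical.

```python
# Per-unit credit costs as stated in this skill's Process section.
CREDITS_PER_UNIT = {"scrape": 1.0, "map": 0.5, "crawl": 1.0, "search": 1.0}
CONFIRM_MAP_ABOVE = 50    # map needs a go-ahead above this many expected URLs
WARN_ABOVE_CREDITS = 100  # from Tips: warn when one run uses more than this


def estimate_credits(mode: str, units: int = 1) -> float:
    """Estimate credits for a run; `units` is URLs, pages, or results."""
    if mode not in CREDITS_PER_UNIT:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "scrape":
        units = 1  # scrape always targets a single URL
    return CREDITS_PER_UNIT[mode] * units


def needs_confirmation(mode: str, units: int) -> bool:
    """True when the run must wait for an explicit go-ahead."""
    return mode == "crawl" or (mode == "map" and units > CONFIRM_MAP_ABOVE)


def should_warn(mode: str, units: int) -> bool:
    """True when a single run is expected to exceed the warning threshold."""
    return estimate_credits(mode, units) > WARN_ABOVE_CREDITS
```

For example, a `map` expected to discover 500 URLs comes to 250 credits under these figures, which is half of the free tier's monthly allowance and triggers both the confirmation and the warning.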
Output format
Folder seo-firecrawl-{slug}-{YYYYMMDD}/:
Mode = scrape
seo-firecrawl-{slug}-{YYYYMMDD}/
├── RAW.md (markdown body)
├── META.md (og / twitter / canonical / robots / headers + parsed JSON-LD)
├── links.csv (every <a href> on the page)
├── screenshot.png (optional; only if requested)
└── FIRECRAWL.md (synthesis + handoff payload)
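As an illustration of how the fields in `META.md` might be pulled out of the `html` format returned by a scrape, here is a minimal sketch using only Python's standard library. The class name `HeadMetaParser` is hypothetical, the skill does not prescribe a parser, and this version only reads top-level JSON-LD nodes (it does not descend into `@graph`).

```python
import json
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collect og:*, twitter:*, robots, canonical, and JSON-LD @type values."""

    def __init__(self):
        super().__init__()
        self.meta = {}          # e.g. {"og:title": "...", "robots": "..."}
        self.canonical = None   # href of <link rel="canonical">
        self.jsonld_types = []  # @type values from top-level JSON-LD nodes
        self._in_jsonld = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "meta":
            key = a.get("property") or a.get("name") or ""
            if key.startswith(("og:", "twitter:")) or key == "robots":
                self.meta[key] = a.get("content", "")
        elif tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.canonical = a.get("href")
        elif tag == "script" and a.get("type", "").lower() == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != "script" or not self._in_jsonld:
            return
        self._in_jsonld = False
        try:
            doc = json.loads("".join(self._buffer))
        except json.JSONDecodeError:
            return  # malformed JSON-LD is skipped rather than treated as fatal
        for node in doc if isinstance(doc, list) else [doc]:
            types = node.get("@type") if isinstance(node, dict) else None
            if isinstance(types, str):
                self.jsonld_types.append(types)
            elif isinstance(types, list):
                self.jsonld_types.extend(t for t in types if isinstance(t, str))
```

Feeding the scraped HTML through `parser.feed(html)` fills `meta`, `canonical`, and `jsonld_types`, which can then be written out as the sections of `META.md`.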
Mode = map
seo-firecrawl-{slug}-{YYYYMMDD}/
├── URLS.md (pattern-grouped URL list)
├── urls.csv (every URL with discovery depth, if available)
└── FIRECRAWL.md
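The pattern grouping behind `URLS.md` could be computed along these lines. This is a sketch under an assumption the skill does not state: URLs are grouped by their first path segment, and pages directly under the root fall into a `/` group. The function name is hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse


def group_by_first_segment(urls):
    """Return (pattern, count, percent) tuples, largest group first."""
    counts = Counter()
    for url in urls:
        path = urlparse(url).path
        segments = [s for s in path.split("/") if s]
        # A URL belongs to a section when it sits below the first segment,
        # or when the first segment is itself a directory-style URL.
        in_section = len(segments) > 1 or (bool(segments) and path.endswith("/"))
        counts[f"/{segments[0]}/*" if in_section else "/"] += 1
    total = sum(counts.values())
    return [(p, n, round(100 * n / total)) for p, n in counts.most_common()]
```

Each tuple maps directly onto one line of the grouped list, and the same URLs can be written unchanged to `urls.csv`.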
Mode = crawl
seo-firecrawl-{slug}-{YYYYMMDD}/
├── INDEX.md (every page + status code + key signals)
├── pages/
│ ├── {slug-1}/RAW.md
│ ├── {slug-1}/META.md
│ ├── {slug-2}/RAW.md
│ └── ...
└── FIRECRAWL.md
Mode = search
seo-firecrawl-{slug}-{YYYYMMDD}/
├── MATCHES.md (hit excerpts + URLs ranked by relevance)
└── FIRECRAWL.md
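The folder naming and per-mode file layout above can be captured in a few lines. The slug rule here is an assumption for illustration (the skill does not define how `{slug}` is derived), and `slugify`, `run_folder`, and `EXPECTED_FILES` are hypothetical names.

```python
import re
from datetime import date


def slugify(target: str) -> str:
    """Hypothetical slug rule: drop the scheme, collapse non-alphanumerics to '-'."""
    without_scheme = re.sub(r"^[a-z][a-z0-9+.-]*://", "", target.strip().lower())
    return re.sub(r"[^a-z0-9]+", "-", without_scheme).strip("-")


def run_folder(target: str, run_date: date) -> str:
    """Build the output folder name seo-firecrawl-{slug}-{YYYYMMDD}."""
    return f"seo-firecrawl-{slugify(target)}-{run_date:%Y%m%d}"


# Top-level entries each mode is expected to produce (screenshot.png is optional).
EXPECTED_FILES = {
    "scrape": ["RAW.md", "META.md", "links.csv", "FIRECRAWL.md"],
    "map": ["URLS.md", "urls.csv", "FIRECRAWL.md"],
    "crawl": ["INDEX.md", "pages/", "FIRECRAWL.md"],
    "search": ["MATCHES.md", "FIRECRAWL.md"],
}
```

A lookup table like `EXPECTED_FILES` also gives a cheap post-run check that nothing in the deliverable is missing.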
FIRECRAWL.md follows this shape:
# Firecrawl: {target}
> Run dated {YYYY-MM-DD} · Mode: {scrape | map | crawl | search} · Credits used: {n}
## Summary
{One-paragraph what-came-back. Example: "Scraped https://example.com/article. og:title and og:image present, JSON-LD Article schema with author + datePublished. 12 outbound links. Page is server-rendered (no JS-render divergence). Robots: index,follow."}
## Key findings
1. {Finding anchored in concrete data}
2. ...
5. ...
## Open loops
- {What this run did NOT answer}
- ...
## Recommended next step
{One of: `seo-page` (when a single URL was scraped and now wants performance analysis) | `seo-schema` (when JSON-LD audit needs follow-up generation) | `seo-technical-audit` (when crawl revealed broken pages) | `seo-content-audit` (when crawl produced a corpus to audit) | `seo-drift baseline` (when the user wants to track this URL over time) | "this completes the user's ask".}
## Handoff payload
- **Produced by:** seo-firecrawl
- **Target:** {url or domain}
- **Mode:** {scrape | map | crawl | search}
- **Credits used:** {n}
- **Key findings:** {5 bullets — e.g., "twitter:card present (summary_large_image)", "JSON-LD types: Article + Organization + BreadcrumbList", "robots: index,follow", "canonical self-referencing", "404s: 0 of 50 pages crawled"}
- **Open loops:** {what this didn't answer}
- **Recommended next skill:** {seo-page | seo-schema | seo-technical-audit | seo-content-audit | …} — {one-line why}
Tips
- Free tier 500 cr/month. `map` is 0.5 cr/URL; `scrape` is 1 cr each; `crawl` is 1 cr/page. Surface cost up front; warn when a single run will eat >100 credits.
- Default `onlyMainContent: true` for `scrape` to drop nav/footer noise. Override only if the user explicitly asks for full-page DOM.
- Use `waitFor` (CSS selector or ms) for SPAs that lazy-load content. 2000ms is a sensible default; selectors are more reliable than time waits.
- `firecrawl_map` before `firecrawl_crawl` when crawl scope is unclear: discover first, decide what to crawl, then crawl. Saves credits.
- `includePaths` / `excludePaths` dramatically cut crawl cost. Always pass `excludePaths: ["/admin/*", "/api/*", "/wp-admin/*", "/feed/*"]` as a default.
- Don't request `formats: ["screenshot"]` unless the deliverable visibly uses it. It doubles per-page cost.
- Don't use `firecrawl_extract` or `firecrawl_deep_research`. Both overlap with our own LLM analysis; `firecrawl_extract` has opaque pricing on the free tier; both are explicitly out of scope for `seo-skills`.
- Cloudflare / anti-bot: some sites (especially e-commerce, banking) block Firecrawl's scraper. Surface the error cleanly; defeating WAFs is not a goal of this skill.
- Sub-step usage. When invoked from another skill (`seo-page`, `seo-schema`, etc.), drop the `FIRECRAWL.md` synthesis; the caller wants the raw `META.md` / `RAW.md`. In `map` mode, skip the `URLS.md` summary too if the caller wants the raw `urls.csv`.
- This is the entry point when you need raw HTML and don't have a more specific skill in mind. If you do have one (`seo-page` for keyword/traffic verdicts on one URL, `seo-schema` for JSON-LD work, `seo-technical-audit` for crawl-wide issues), use it instead. Those skills orchestrate Firecrawl plus SE Ranking data automatically.
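The defaults and caps scattered through these tips can be gathered into one argument builder. This is a sketch, not the skill's implementation: `build_args` and its keyword flags are hypothetical, and the `url` and `query` keys are assumptions about the tool input shape, while `formats`, `waitFor`, `onlyMainContent`, `limit`, and `excludePaths` are the parameter names used in this document.

```python
DEFAULT_EXCLUDE_PATHS = ["/admin/*", "/api/*", "/wp-admin/*", "/feed/*"]
DEFAULT_LIMITS = {"map": 500, "crawl": 50, "search": 20}
HARD_CAPS = {"map": 500, "crawl": 200}


def build_args(mode, target, *, limit=None, screenshot=False, spa=False,
               full_page=False):
    """Build tool arguments for one mode using this skill's stated defaults."""
    if mode == "scrape":
        formats = ["markdown", "html"]
        if screenshot:  # only when the deliverable visibly uses one
            formats.append("screenshot")
        # Assumption: the tool takes the page address under the key "url".
        args = {"url": target, "formats": formats,
                "onlyMainContent": not full_page}
        if spa:
            args["waitFor"] = 2000  # ms; a CSS selector is the alternative
        return args
    if mode == "search":
        # Assumption: the search text is passed under the key "query".
        return {"query": target, "limit": limit or DEFAULT_LIMITS["search"]}
    if mode in ("map", "crawl"):
        requested = limit or DEFAULT_LIMITS[mode]
        return {"url": target,
                "limit": min(requested, HARD_CAPS[mode]),  # enforce hard cap
                "excludePaths": list(DEFAULT_EXCLUDE_PATHS)}
    raise ValueError(f"unknown mode: {mode!r}")
```

Clamping with `min` means an over-large request is silently reduced to the hard cap; raising an error instead would be an equally reasonable design.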
Works well with
- Predecessors: none (entry point) or invoked as a sub-step from another skill.
- Successors:
  - `seo-page`: when a single URL was scraped and now wants keyword/traffic verdicts.
  - `seo-schema`: when JSON-LD audit produced gaps that need generation.
  - `seo-technical-audit`: when a crawl revealed broken pages or noindex issues at scale.
  - `seo-content-audit`: when a crawl produced a corpus to E-E-A-T-audit.
  - `seo-drift baseline`: when the user wants to track this URL or domain over time.
Capabilities
skill, source-seranking, skill-seo-firecrawl, topic-agent-skills, topic-ai-search, topic-anthropic, topic-backlinks, topic-claude, topic-claude-code, topic-claude-plugin, topic-claude-skills, topic-content-brief, topic-ga4, topic-keyword-research, topic-mcp
Install
Install: `npx skills add seranking/seo-skills`
Transport: skills-sh
Protocol: skill
Quality
0.45 / 1.00
Deterministic score 0.45 from registry signals: indexed on github topic:agent-skills · 9 github stars · SKILL.md body (8,393 chars)
Provenance
Indexed from: github
Enriched: 2026-05-18 19:08:36Z · deterministic:skill-github:v1 · v1
First seen: 2026-05-18
Last seen: 2026-05-18