
summarize

Content extraction skill for AI coding agents: YouTube transcripts, podcasts, PDFs, images (OCR), and audio/video via the summarize CLI.

Price: free
Protocol: skill
Verified: no

What it does

Summarize

Extract clean text and media transcripts from URLs, files, and streams so your AI workflow can reason over reliable source content without hand-coding brittle scraper logic.

Use this skill when you need deterministic extraction for YouTube, podcast feeds, PDFs, scanned images, or local media files.

Terminology used in this file:

  • DOM: Document Object Model, the page element structure used by browser-based extractors.
  • OCR: Optical character recognition (extracting text from images/scans).
  • ANSI codes: Terminal color/control sequences; --plain removes them for machine parsing.

Setup

brew tap steipete/tap
brew install summarize
  • Claude Code: copy this skill folder into .claude/skills/summarize/
  • Codex CLI: append this SKILL.md content to your project's root AGENTS.md

For the full installation walkthrough (prerequisites, optional dependencies, verification, troubleshooting), see references/installation-guide.md.

Staying Updated

This skill ships with an UPDATES.md changelog and UPDATE-GUIDE.md for your AI agent.

After installing, tell your agent: "Check UPDATES.md in the summarize skill for any new features or changes."

When updating, tell your agent: "Read UPDATE-GUIDE.md and apply the latest changes from UPDATES.md."

Follow UPDATE-GUIDE.md so customized local files are diffed before any overwrite.


Quick Start

Run one extraction flow end-to-end:

summarize --version
summarize --extract "https://www.youtube.com/watch?v=VIDEO_ID" --plain
summarize --extract "/path/to/document.pdf" --plain

Use --extract --plain as the default pattern for deterministic, non-ANSI output.

Decision Tree: summarize vs Other Tools

Need content from the web?
  |
  +-- Static web page (article, docs, blog)?
  |     --> WebFetch (built-in, zero deps, faster)
  |     --> Jina r.jina.ai (zero install alternative)
  |     --> summarize ONLY if above tools fail or return garbage
  |
  +-- JS-heavy SPA / dynamic content?
  |     --> Crawl4AI crwl (full browser rendering)
  |     --> summarize will NOT help here (no JS rendering)
  |
  +-- Anti-bot / paywalled / Cloudflare-protected?
  |     --> summarize --firecrawl always (requires FIRECRAWL_API_KEY)
  |     --> browser-based workflow as fallback
  |
  +-- YouTube video?
  |     --> summarize --extract (ONLY option for transcript)
  |     --> Add --youtube web for captions-only (faster)
  |     --> Add --slides for visual slide extraction
  |
  +-- Podcast / RSS feed?
  |     --> summarize --extract (ONLY option)
  |     --> Supports Apple Podcasts, Spotify, RSS feeds, Podbean, etc.
  |
  +-- PDF (URL or local file)?
  |     --> summarize --extract (ONLY CLI option)
  |     --> Requires: uvx/markitdown (brew install uv)
  |
  +-- Image (OCR)?
  |     --> summarize --extract (ONLY CLI option)
  |     --> Requires: tesseract
  |
  +-- Audio / video file?
        --> summarize --extract (ONLY CLI option)
        --> Requires: whisper-cli (local) or OPENAI_API_KEY (cloud)

Rule of thumb: summarize is the default for media extraction (YouTube, podcasts, audio, video, images). For web pages, prefer WebFetch/Jina/Crawl4AI depending on DOM complexity (how hard the page structure is to parse). Use summarize for web only when other tools fail.
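The fallback order above can be scripted. A minimal POSIX sh sketch (the extractor commands passed in are placeholders for whatever tools your agent can shell out to; only summarize itself is from this skill):

```shell
# try_extract URL CMD... -- run extractor commands in order until one
# succeeds and produces non-empty output, then print that output.
try_extract() {
  url=$1
  shift
  for cmd in "$@"; do
    # Intentional word splitting of $cmd into command + flags.
    if out=$($cmd "$url" 2>/dev/null) && [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
  done
  echo "all extractors failed for $url" >&2
  return 1
}

# Example order: plain extraction first, Firecrawl mode only as a fallback.
# try_extract "https://example.com" \
#   "summarize --extract --plain" \
#   "summarize --extract --firecrawl always --plain"
```

The URL is appended as the last argument of each candidate command, so list commands with their flags already in place.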

Extraction Mode (Primary)

--extract prints raw extracted content and exits. No LLM involved. Use this first. You can handle any downstream synthesis in your own workflow.

# Web page extraction (plain text, default)
summarize --extract "https://example.com" --plain

# Web page extraction (markdown format)
summarize --extract "https://example.com" --format md --plain

# YouTube transcript
summarize --extract "https://www.youtube.com/watch?v=VIDEO_ID" --plain

# YouTube transcript with timestamps
summarize --extract "https://www.youtube.com/watch?v=VIDEO_ID" --timestamps --plain

# YouTube transcript formatted as markdown (requires LLM -- uses API key)
summarize --extract "https://www.youtube.com/watch?v=VIDEO_ID" --format md --markdown-mode llm --plain

# YouTube slides + transcript
summarize --extract "https://www.youtube.com/watch?v=VIDEO_ID" --slides --plain

# Podcast (RSS feed)
summarize --extract "https://feeds.example.com/podcast.xml" --plain

# Apple Podcasts episode
summarize --extract "https://podcasts.apple.com/us/podcast/EPISODE_ID" --plain

# PDF from URL
summarize --extract "https://example.com/document.pdf" --plain

# PDF from local file
summarize --extract "/path/to/document.pdf" --plain

# Image OCR
summarize --extract "/path/to/image.png" --plain

# Audio transcription
summarize --extract "/path/to/audio.mp3" --plain

# Video transcription
summarize --extract "/path/to/video.mp4" --plain

# Stdin (pipe content)
pbpaste | summarize --extract - --plain
cat document.pdf | summarize --extract - --plain

Always use --plain when extracting for agent consumption. It suppresses ANSI/OSC rendering.

Extraction defaults:

  • URLs default to --format md in extract mode
  • Files default to --format text
  • PDF requires uvx/markitdown (via --preprocess auto, the default)

LLM Summarization Mode (Secondary)

Use this mode only when you explicitly want summarize to perform synthesis itself.

# Summarize a URL (requires API key for the chosen model)
summarize "https://example.com" --model anthropic/claude-sonnet-4-5 --length long

# Summarize with a custom prompt
summarize "https://example.com" --prompt "Extract key technical decisions and their rationale"

# Summarize YouTube video
summarize "https://www.youtube.com/watch?v=VIDEO_ID" --length xl

# JSON output with metrics
summarize "https://example.com" --json --model openai/gpt-5-mini

API keys for LLM mode (set in ~/.summarize/config.json or env vars):

  • ANTHROPIC_API_KEY -- for anthropic/ models
  • OPENAI_API_KEY -- for openai/ models
  • GEMINI_API_KEY -- for google/ models
  • XAI_API_KEY -- for xai/ models

Dependency Matrix

| Feature | Required Deps |
| --- | --- |
| Web page extraction | None |
| YouTube transcript (captions) | None (web mode) |
| YouTube transcript (no captions) | yt-dlp + whisper or API key |
| YouTube slides | yt-dlp + ffmpeg |
| Podcast transcription | yt-dlp + whisper or API key |
| PDF extraction | uvx/markitdown |
| Image OCR | tesseract |
| Audio/video transcription | whisper-cli (local) or OPENAI_API_KEY |
| Anti-bot sites (Firecrawl) | FIRECRAWL_API_KEY |
| Slide OCR | tesseract |

What is not installed (by design):

  • whisper-cli / whisper.cpp -- heavy binary, install when audio transcription is needed
  • Firecrawl API key -- paid service, configure when anti-bot extraction is needed
  • LLM API keys in summarize config -- only add if you use LLM Summarization Mode
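A quick preflight check for the optional dependencies above can save a failed run. A POSIX sh sketch (the binary names come from the dependency matrix; extend the list as your workflow requires):

```shell
# check_dep NAME -- report whether an optional dependency is on PATH.
check_dep() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

# Optional binaries from the dependency matrix.
for dep in yt-dlp ffmpeg tesseract uvx whisper-cli; do
  check_dep "$dep"
done
```

A "missing" line is not an error by itself; install the binary only when the matching feature is actually needed.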

Key Flags Quick Reference

| Flag | Purpose | Example |
| --- | --- | --- |
| --extract | Raw content extraction, no LLM | summarize --extract URL |
| --plain | No ANSI rendering (agent-safe output) | Always use for agents |
| --format md\|text | Output format (md default for URLs in extract) | --format md |
| --youtube auto\|web\|yt-dlp | YouTube transcript source | --youtube web (captions only) |
| --slides | Extract video slides with ffmpeg | --slides --slides-ocr |
| --timestamps | Include timestamps in transcripts | --timestamps |
| --firecrawl off\|auto\|always | Firecrawl for anti-bot sites | --firecrawl always |
| --preprocess off\|auto\|always | Preprocessing (markitdown for PDFs) | Default: auto |
| --markdown-mode | HTML-to-MD conversion mode | --markdown-mode readability |
| --timeout | Fetch/LLM timeout | --timeout 2m |
| --verbose | Debug output to stderr | Troubleshooting |
| --json | Structured JSON output with metrics | --json |
| --length | Summary length (LLM mode only) | --length xl |
| --model | LLM model (LLM mode only) | --model anthropic/claude-sonnet-4-5 |
| --max-extract-characters | Limit extract output length | --max-extract-characters 50000 |
| --language\|--lang | Output language | --lang en |
| --video-mode | Video handling mode | --video-mode transcript |
| --transcriber | Audio backend | --transcriber whisper |

Verified Services (YouTube/Podcasts)

YouTube: All public videos with captions. Falls back to yt-dlp audio download + transcription for videos without captions.

Podcasts (verified):

  • Apple Podcasts
  • Spotify (best-effort; may fail for exclusives)
  • Amazon Music / Audible podcast pages
  • Podbean
  • Podchaser
  • RSS feeds (Podcasting 2.0 transcripts when available)
  • Embedded YouTube podcast pages

Common Patterns

1. YouTube Transcript for Analysis

# Quick: captions only (fastest, no deps beyond summarize)
summarize --extract "https://www.youtube.com/watch?v=VIDEO_ID" --youtube web --plain

# Full: with timestamps
summarize --extract "https://www.youtube.com/watch?v=VIDEO_ID" --timestamps --plain

# Formatted as clean markdown (requires LLM API key)
summarize --extract "https://www.youtube.com/watch?v=VIDEO_ID" --format md --markdown-mode llm --plain

2. Podcast Episode Transcript

# From RSS feed (transcribes latest episode)
summarize --extract "https://feeds.example.com/podcast.xml" --plain

# From Apple Podcasts link
summarize --extract "https://podcasts.apple.com/us/podcast/SHOW/EPISODE" --plain

3. PDF Content Extraction

# From URL
summarize --extract "https://example.com/report.pdf" --plain

# From local file
summarize --extract "/path/to/file.pdf" --plain

# Limit output length
summarize --extract "/path/to/huge.pdf" --max-extract-characters 50000 --plain

4. Image OCR

summarize --extract "/path/to/screenshot.png" --plain
summarize --extract "/path/to/scanned-doc.jpg" --plain

5. Anti-Bot Website (Firecrawl Fallback)

# Requires FIRECRAWL_API_KEY in env or config
summarize --extract "https://paywalled-site.com/article" --firecrawl always --plain

6. Batch Extraction (Shell Loop)

# Extract multiple YouTube videos
for url in "URL1" "URL2" "URL3"; do
  echo "=== $url ==="
  summarize --extract "$url" --plain
done
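A variant of the loop above that writes each transcript to its own file. The watch?v= id extraction is an assumption about URL shape; adjust it for short links or extra query parameters:

```shell
# extract_id URL -- derive a filename-safe id from a watch?v= YouTube URL.
extract_id() {
  printf '%s\n' "${1##*v=}"
}

for url in "https://www.youtube.com/watch?v=ID1" "https://www.youtube.com/watch?v=ID2"; do
  id=$(extract_id "$url")
  # Guarded so the loop is a no-op on machines without summarize installed.
  if command -v summarize >/dev/null 2>&1; then
    summarize --extract "$url" --plain > "transcript-$id.txt"
  fi
done
```

Writing one file per source keeps failed extractions isolated: an empty transcript-ID.txt flags exactly which URL needs a retry.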

Error Handling

| Symptom | Cause | Fix |
| --- | --- | --- |
| Missing uvx/markitdown | PDF preprocessing not available | brew install uv |
| "does not support extracting binary files" | Preprocessing disabled for PDF | Use --preprocess auto (default) with uvx installed |
| YouTube returns empty transcript | No captions available, no yt-dlp/whisper | Install yt-dlp; for the whisper fallback, install whisper-cli or set OPENAI_API_KEY |
| FIRECRAWL_API_KEY not set | Anti-bot mode requires Firecrawl | Set the key in env or ~/.summarize/config.json |
| Timeout on large content | Default 2m timeout too short | Use --timeout 5m |
| Audio transcription fails | No whisper backend available | Install whisper-cli locally or set OPENAI_API_KEY/FAL_KEY |
| Podcast extraction fails | Audio download failed | Check that yt-dlp is installed and updated: brew upgrade yt-dlp |
| Garbled web extraction | JS-rendered content | summarize has no JS engine; use Crawl4AI instead |

Configuration

Config file: ~/.summarize/config.json

{
  "model": "auto",
  "env": {
    "FIRECRAWL_API_KEY": "fc-..."
  },
  "ui": {
    "theme": "mono"
  }
}

Configure only what your workflow needs. If you use LLM Summarization Mode, add the required API keys.
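To bootstrap the config non-interactively, a small sh helper (the JSON keys mirror the example above; the directory argument is there only so you can dry-run against a temp path instead of your real home):

```shell
# write_config [DIR] -- create a starter config.json (defaults to ~/.summarize).
write_config() {
  dir=${1:-$HOME/.summarize}
  mkdir -p "$dir"
  cat > "$dir/config.json" <<'EOF'
{
  "model": "auto",
  "ui": {
    "theme": "mono"
  }
}
EOF
}

# write_config             # real location
# write_config /tmp/check  # dry run elsewhere
```

Add API keys under an "env" object (as in the example above) only when your workflow actually uses Firecrawl or LLM Summarization Mode.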

Anti-Patterns

| Do NOT | Do instead |
| --- | --- |
| Use summarize for static web pages | WebFetch or Jina (faster, zero deps) |
| Use summarize for JS-heavy SPAs | Crawl4AI crwl (has browser rendering) |
| Use summarize's LLM mode as the default | Use --extract and run synthesis in your own workflow unless explicitly required |
| Skip --plain for any non-interactive run | Always use --plain to avoid ANSI escape codes |
| Install whisper.cpp preemptively | Install it only when an audio transcription use case arises |
| Forget --timeout for large media | Podcasts/videos can take minutes; set --timeout 5m |
| Use summarize when WebFetch works | summarize is heavier; reserve it for media and fallback |
| Use summarize for local repo/codebase search | Use your local knowledge-search tools |

Bundled Resources Index

| Path | What | When to load |
| --- | --- | --- |
| ./UPDATES.md | Structured changelog for AI agents | When checking for new features or updates |
| ./UPDATE-GUIDE.md | Instructions for AI agents performing updates | When updating this skill |
| ./references/installation-guide.md | Detailed install walkthrough for Claude Code and Codex CLI | First-time setup or environment repair |
| ./references/commands.md | Full CLI flag reference with all options | When you need exact flag syntax or env var names |

Capabilities

skill · source-buildoak · skill-summarize · topic-agent-skills · topic-ai-agents · topic-ai-tools · topic-automation · topic-browser-automation · topic-claude-code · topic-claude-skills · topic-codex

Install

Install: npx skills add buildoak/fieldwork-skills
Transport: skills-sh
Protocol: skill

Quality

0.46 / 1.00

Deterministic score 0.46 from registry signals: indexed on GitHub topic:agent-skills · 15 GitHub stars · SKILL.md body (12,908 chars)

Provenance

Indexed from: github
Enriched: 2026-04-22 19:06:33Z · deterministic:skill-github:v1 · v1
First seen: 2026-04-18
Last seen: 2026-04-22
