{"id":"4989daa8-19ae-4180-a68b-2f1c246907df","shortId":"FZ5HVz","kind":"skill","title":"ingest-youtube","tagline":"Pull a YouTube video transcript into a queryable markdown vault with yt-dlp subtitle discovery, VTT cleanup, metadata frontmatter, and capture-seed stubs.","description":"# ingest-youtube — YouTube-to-vault connector\n\nPulls YouTube transcripts into a markdown vault as queryable typed-memory entries that downstream skills (knowledge graph extraction, voice-fingerprint training, content repurposing, action-item extraction) can act on.\n\nSame pattern as ingest-slack, ingest-whatsapp, ingest-notion, ingest-linear, ingest-github, ingest-gmail. Adding YouTube means a new normalizer, not a new architecture.\n\n## When to use\n\n- User pastes a YouTube URL and asks for a transcript or summary\n- User says `/ingest-youtube <url>` for a single video\n- User asks to capture, sync, ingest, transcribe, or pull a talk/podcast/keynote into the vault\n\nDo NOT use for:\n- Downloading the actual video file (use `yt-dlp` directly with `-f best`)\n- Channel-wide ingestion or `--days` windows; this script ingests one video URL at a time\n- Live streams (transcripts are not stable)\n- Non-YouTube sources (Vimeo, Twitch, Twitter Spaces have their own connectors)\n- One-off transcript reads where the user does not want a vault file (run `yt-dlp --write-auto-sub` directly and pipe to stdout)\n\n## How it works\n\n1. Parse the input as one YouTube video URL.\n2. Verify `yt-dlp` is installed. If not, the script exits with install instructions: `brew install yt-dlp` (macOS) or `pip3 install --user yt-dlp`.\n3. Call `yt-dlp --list-subs <url>` to enumerate available subtitles.\n4. Subtitle priority: manual subs > auto-generated captions. Manual subs preserve creator-provided punctuation and speaker labels; auto-gen is uppercase + no punctuation.\n5. Download the highest-priority subtitle as VTT via `yt-dlp --write-sub --sub-lang <lang> --skip-download`. 
Default language preference: `en,es` (English first, Spanish second).\n6. Strip VTT timing markers and merge into clean prose paragraphs. Deduplicate repeated lines (auto-generated VTTs are line-doubled). Preserve speaker labels if the source had them.\n7. Pull video metadata (title, channel, upload date, duration, video_id, URL) via `yt-dlp --print-json --skip-download`.\n8. Slugify the channel name and video title. Write to `External Inputs/YouTube/<channel-slug>/<YYYY-MM-DD>-<video-slug>.md`.\n9. Scan transcript for trigger keywords (decision, framework, model, principle, \"the lesson is\", playbook, anti-pattern, case study). For each match, create a writing-seed stub at `Meta/Captures/<YYYY-MM-DD>-youtube-<channel-slug>-<video-id>.md` so the seed lands in the captures aggregator.\n10. Print summary: file path, transcript word count, language, seeds detected.\n\n## Invocation\n\n```bash\npython3 ingest.py <youtube-url> [--vault <path>] [--lang <code>]\n```\n\nDefaults:\n- `--vault`: `$VAULT_ROOT` env var or current directory\n- `--lang`: `en,es` (English first, Spanish second; matches a common bilingual default)\n- `--whisper`: accepted as a future fallback flag, but this version writes a stub when no subtitles are available\n\n## Output contract\n\nThe vault file at `External Inputs/YouTube/<channel-slug>/<YYYY-MM-DD>-<video-slug>.md` has frontmatter:\n\n```yaml\n---\ntype: external-input\nsource: youtube\nvideo_id: <11-char ID>\nurl: https://www.youtube.com/watch?v=<id>\nchannel: <channel-name>\nchannel_url: https://www.youtube.com/<handle>\ntitle: <video title>\nupload_date: <YYYY-MM-DD>\nduration_seconds: <int>\nlanguage: <ISO code>\nsubtitle_source: manual | auto | whisper\nword_count: <int>\ningested_at: <ISO 8601 timestamp>\n---\n```\n\nBody is the cleaned transcript as paragraph prose. If the source had speaker labels, format as `**<speaker>:** <text>` per turn.\n\n## Idempotency\n\nRe-ingesting the same video URL overwrites the same vault file. 
The seed stub filenames are derived from the video_id, so the same source video produces the same stub filename across re-runs. Re-runs refresh, never duplicate.\n\n## Missing subtitles\n\nIf `yt-dlp --list-subs` returns no manual or auto subtitles, the script writes a stub vault note with the video metadata and source URL instead of failing silently. The `--whisper` flag is reserved for a future local transcription fallback and currently reports that the fallback is not implemented.\n\nFor a manual fallback today, download audio with `yt-dlp`, transcribe it with your local Whisper workflow, and add captions or transcript text before rerunning the ingest.\n\n## Limitations\n\n- Ingests one YouTube video URL per run; channel handles, playlists, and `--days` windows are out of scope.\n- Depends on subtitles returned by `yt-dlp`; videos without subtitles produce a metadata stub, not a transcript.\n- Does not download video files or perform built-in Whisper transcription in this version.\n- Network availability, YouTube subtitle access, and local `yt-dlp` behavior determine whether ingest succeeds.\n\n## Acceptance test\n\nRun against the first YouTube video ever uploaded:\n\n```bash\npython3 ingest.py \"https://www.youtube.com/watch?v=jNQXAC9IVRw\" --vault /tmp/test\n```\n\nExpected output:\n```\nWrote 39 words to /tmp/test/External Inputs/YouTube/jawed/2005-04-24-me-at-the-zoo.md. Language: en. Subtitle source: manual.\n```\n\nThe output file contains valid frontmatter and a clean prose body.\n\n## Dependencies\n\n- `yt-dlp` (required): install via `brew install yt-dlp` or `pip3 install --user yt-dlp`\n- `whisper-cpp` (optional for a manual fallback outside this script)\n\n## Source\n\nBundled in [adelaidasofia/ai-brain-starter](https://github.com/adelaidasofia/ai-brain-starter), a verification harness around an AI agent so memory compounds instead of corrupting. 
The skill is part of the ingest-* family of vault connectors.","tags":["ingest","youtube","antigravity","awesome","skills","sickn33","agent-skills","agentic-skills","ai-agent-skills","ai-agents","ai-coding","ai-workflows"],"capabilities":["skill","source-sickn33","skill-ingest-youtube","topic-agent-skills","topic-agentic-skills","topic-ai-agent-skills","topic-ai-agents","topic-ai-coding","topic-ai-workflows","topic-antigravity","topic-antigravity-skills","topic-claude-code","topic-claude-code-skills","topic-codex-cli","topic-codex-skills"],"categories":["antigravity-awesome-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/sickn33/antigravity-awesome-skills/ingest-youtube","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add sickn33/antigravity-awesome-skills","source_repo":"https://github.com/sickn33/antigravity-awesome-skills","install_from":"skills.sh"}},"qualityScore":"0.700","qualityRationale":"deterministic score 0.70 from registry signals: · indexed on github topic:agent-skills · 37911 github stars · SKILL.md body (5,512 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T18:51:13.421Z","embedding":null,"createdAt":"2026-05-11T18:51:24.498Z","updatedAt":"2026-05-18T18:51:13.421Z","lastSeenAt":"2026-05-18T18:51:13.421Z","tsv":"'/adelaidasofia/ai-brain-starter),':814 '/ingest-youtube':117 '/tmp/test':753 '/tmp/test/external':760 '/watch?v=':510 '/watch?v=jnqxac9ivrw':751 '1':217 '10':428 '11':504 '2':226 '3':254 '39':757 '4':266 '5':292 '6':323 '7':353 '8':375 '9':388 'accept':467,736 'access':725 'across':579 'act':67 'action':63 'action-item':62 'actual':142 'ad':90 'add':661 'adelaidasofia/ai-brain-starter':811 'agent':821 
'aggreg':427 'ai':820 'anti':403 'anti-pattern':402 'architectur':99 'around':818 'ask':109,123 'audio':648 'auto':207,272,286,338,524,602 'auto-gen':285 'auto-gener':271,337 'avail':264,483,722 'bash':440,746 'behavior':731 'best':152 'bilingu':464 'bodi':530,777 'brew':241,785 'built':714 'built-in':713 'bundl':809 'call':255 'caption':274,662 'captur':26,125,426 'capture-se':25 'case':405 'channel':154,358,378,511,512,678 'channel-wid':153 'char':505 'clean':331,533,775 'cleanup':21 'common':463 'compound':824 'connector':36,186,838 'contain':770 'content':60 'contract':485 'corrupt':827 'count':435,527 'cpp':799 'creat':410 'creator':279 'creator-provid':278 'current':452,634 'date':360,517 'day':158,682 'decis':394 'dedupl':334 'default':314,445,465 'depend':688,778 'detect':438 'determin':732 'direct':149,209 'directori':453 'discoveri':19 'dlp':17,148,204,230,245,253,258,304,368,594,652,695,730,781,789,796 'doubl':344 'download':140,293,313,374,647,708 'downstream':51 'duplic':588 'durat':361,518 'en':317,455,763 'english':319,457 'entri':49 'enumer':263 'env':449 'es':318,456 'ever':744 'exit':237 'expect':754 'extern':385,490,498 'external-input':497 'extract':55,65 'f':151 'fail':620 'fallback':471,632,638,645,804 'famili':835 'file':144,200,431,488,560,710,769 'filenam':564,578 'fingerprint':58 'first':320,458,741 'flag':472,624 'format':544 'framework':395 'frontmatt':23,494,772 'futur':470,629 'gen':287 'generat':273,339 'github':86 'github.com':813 'github.com/adelaidasofia/ai-brain-starter),':812 'gmail':89 'graph':54 'handl':679 'har':817 'hash':565 'highest':296 'highest-prior':295 'id':363,503,506,568 'idempot':548 'implement':641 'ingest':2,30,73,76,79,82,85,88,127,156,162,528,551,669,671,734,834 'ingest-github':84 'ingest-gmail':87 'ingest-linear':81 'ingest-not':78 'ingest-slack':72 'ingest-whatsapp':75 'ingest-youtub':1,29 'ingest.py':442,748 'input':220,499 'inputs/youtube':386,491 'inputs/youtube/jawed/2005-04-24-me-at-the-zoo.md':761 
'instal':232,239,242,249,783,786,792 'instead':618,825 'instruct':240 'invoc':439 'item':64 'json':371 'keyword':393 'knowledg':53 'label':284,347,543 'land':423 'lang':310,444,454 'languag':315,436,520,762 'lesson':399 'limit':670 'line':336,343 'line-doubl':342 'linear':83 'list':260,596 'list-sub':259,595 'live':169 'local':630,657,727 'maco':246 'manual':269,275,523,600,644,766,803 'markdown':12,42 'marker':327 'match':409,461 'md':387,419,492 'mean':92 'memori':48,823 'merg':329 'meta/captures':417 'metadata':22,356,614,701 'miss':589 'model':396 'name':379 'network':721 'never':587 'new':94,98 'non':176 'non-youtub':175 'normal':95 'note':610 'notion':80 'one':163,188,222,672 'one-off':187 'option':800 'output':484,755,768 'outsid':805 'overwrit':556 'paragraph':333,536 'pars':218 'part':831 'past':104 'path':432 'pattern':70,404 'per':546,676 'perform':712 'pip3':248,791 'pipe':211 'playbook':401 'playlist':680 'prefer':316 'preserv':277,345 'principl':397 'print':370,429 'print-json':369 'prioriti':268,297 'produc':574,699 'prose':332,537,776 'provid':280 'pull':4,37,130,354 'punctuat':281,291 'python3':441,747 'queryabl':11,45 're':550,581,584 're-ingest':549 're-run':580,583 'read':191 'refresh':586 'repeat':335 'report':635 'repurpos':61 'requir':782 'rerun':667 'reserv':626 'return':598,691 'root':448 'run':201,582,585,677,738 'say':116 'scan':389 'scope':687 'script':161,236,605,807 'second':322,460,519 'seed':27,414,422,437,562 'silent':621 'singl':120 'skill':52,829 'skill-ingest-youtube' 'skip':312,373 'skip-download':311,372 'slack':74 'slugifi':376 'sourc':178,350,500,522,540,572,616,765,808 'source-sickn33' 'space':182 'spanish':321,459 'speaker':283,346,542 'stabl':174 'stdout':213 'stream':170 'strip':324 'stub':28,415,478,563,577,608,702 'studi':406 'sub':208,261,270,276,307,309,597 'sub-lang':308 'subtitl':18,265,267,298,481,521,590,603,690,698,724,764 'succeed':735 'summari':114,430 'sync':126 'talk/podcast/keynote':132 'test':737 'text':665 
'time':168,326 'titl':357,382,515 'today':646 'topic-agent-skills' 'topic-agentic-skills' 'topic-ai-agent-skills' 'topic-ai-agents' 'topic-ai-coding' 'topic-ai-workflows' 'topic-antigravity' 'topic-antigravity-skills' 'topic-claude-code' 'topic-claude-code-skills' 'topic-codex-cli' 'topic-codex-skills' 'train':59 'transcrib':128,653 'transcript':8,39,112,171,190,390,433,534,631,664,705,717 'trigger':392 'turn':547 'twitch':180 'twitter':181 'type':47,496 'typed-memori':46 'upload':359,516,745 'uppercas':289 'url':107,165,225,364,507,513,555,617,675 'use':102,138,145 'user':103,115,122,194,250,793 'valid':771 'var':450 'vault':13,35,43,135,199,443,446,447,487,559,609,752,837 'verif':816 'verifi':227 'version':475,720 'via':301,365,784 'video':7,121,143,164,224,355,362,381,502,554,567,573,613,674,696,709,743 'vimeo':179 'voic':57 'voice-fingerprint':56 'vtt':20,300,325 'vtts':340 'want':197 'whatsapp':77 'whether':733 'whisper':466,525,623,658,716,798 'whisper-cpp':797 'wide':155 'window':159,683 'without':697 'word':434,526,758 'work':216 'workflow':659 'write':206,306,383,413,476,606 'write-auto-sub':205 'write-sub':305 'writing-se':412 'wrote':756 'www.youtube.com':509,514,750 'www.youtube.com/watch?v=':508 'www.youtube.com/watch?v=jnqxac9ivrw':749 'yaml':495 'youtub':3,6,31,33,38,91,106,177,223,418,501,673,723,742 'youtube-to-vault':32 'yt':16,147,203,229,244,252,257,303,367,593,651,694,729,780,788,795 
'yt-dlp':15,146,202,228,243,251,256,302,366,592,650,693,728,779,787,794","prices":[{"id":"918f1c73-dd7e-41ed-af2a-99a08ba889e3","listingId":"4989daa8-19ae-4180-a68b-2f1c246907df","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"sickn33","category":"antigravity-awesome-skills","install_from":"skills.sh"},"createdAt":"2026-05-11T18:51:24.498Z"}],"sources":[{"listingId":"4989daa8-19ae-4180-a68b-2f1c246907df","source":"github","sourceId":"sickn33/antigravity-awesome-skills/ingest-youtube","sourceUrl":"https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/ingest-youtube","isPrimary":false,"firstSeenAt":"2026-05-11T18:51:24.498Z","lastSeenAt":"2026-05-18T18:51:13.421Z"}],"details":{"listingId":"4989daa8-19ae-4180-a68b-2f1c246907df","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"sickn33","slug":"ingest-youtube","github":{"repo":"sickn33/antigravity-awesome-skills","stars":37911,"topics":["agent-skills","agentic-skills","ai-agent-skills","ai-agents","ai-coding","ai-workflows","antigravity","antigravity-skills","claude-code","claude-code-skills","codex-cli","codex-skills","cursor","cursor-skills","developer-tools","gemini-cli","gemini-skills","kiro","mcp","skill-library"],"license":"mit","html_url":"https://github.com/sickn33/antigravity-awesome-skills","pushed_at":"2026-05-18T08:24:49Z","description":"Installable GitHub library of 1,400+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. 
Includes installer CLI, bundles, workflows, and official/community skill collections.","skill_md_sha":"3a3a4e069fc40a272d5c27bb821a8e8d293cf9ef","skill_md_path":"skills/ingest-youtube/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/ingest-youtube"},"layout":"multi","source":"github","category":"antigravity-awesome-skills","frontmatter":{"name":"ingest-youtube","license":"MIT","description":"Pull a YouTube video transcript into a queryable markdown vault with yt-dlp subtitle discovery, VTT cleanup, metadata frontmatter, and capture-seed stubs."},"skills_sh_url":"https://skills.sh/sickn33/antigravity-awesome-skills/ingest-youtube"},"updatedAt":"2026-05-18T18:51:13.421Z"}}