{"id":"4d911098-3c52-46c6-a8fd-51e83d396ead","shortId":"HBEGKN","kind":"skill","title":"meta:media","tagline":"Multimodal memory — ingest, embed, and search media (images, video, audio, files) with Gemini Embedding 2 + ChromaDB","description":"# /media-memory — Multimodal Memory System\n\nYou have access to a persistent multimodal memory system at `~/.claude/media-memory/`. It stores every piece of media (images, video, audio, files) with rich metadata and Gemini Embedding 2 vectors in ChromaDB.\n\n## Directory Layout\n```\n~/.claude/media-memory/\n  assets/          # stored media files\n  chroma/          # ChromaDB vector store\n  metadata.db      # SQLite structured metadata\n  scripts/\n    ingest.py      # ingestion + embedding\n    search.py      # search with filters\n    schema.py      # metadata models\n```\n\n## Commands\n\nAll commands run from `~/.claude/media-memory/` using `uv run`.\n\n### Ingest (store + embed)\n```bash\ncd ~/.claude/media-memory && uv run scripts/ingest.py \"<file_path>\" \\\n  --source \"user|generated|url|ingested\" \\\n  --description \"Natural language description of the media\" \\\n  --tags \"tag1,tag2,tag3\" \\\n  --type \"image|video|audio|document|file\" \\\n  --text \"Extracted text or transcript content\"\n```\n\n### Search (hybrid: semantic + metadata)\n```bash\ncd ~/.claude/media-memory && uv run scripts/search.py \"search query\" \\\n  --type image \\\n  --source user \\\n  --tags \"architecture,diagram\" \\\n  --from \"2026-03-01\" \\\n  --to \"2026-03-28\" \\\n  --limit 10 \\\n  --mode hybrid|semantic|metadata \\\n  --json\n```\n\n### Recent items\n```bash\ncd ~/.claude/media-memory && uv run scripts/search.py --recent --limit 10\n```\n\n### Stats\n```bash\ncd ~/.claude/media-memory && uv run scripts/search.py --stats\n```\n\n## Behavior Rules\n\n### On Ingest (when user sends or generates media)\n1. Copy the file to `assets/` via `ingest.py`\n2. 
ALWAYS provide `--description` with a rich natural language description of the content\n3. ALWAYS provide relevant `--tags` for semantic categorization\n4. Set `--source` accurately: `user` (user sent it), `generated` (Claude/AI created it), `url` (downloaded), `ingested` (bulk import)\n5. For screenshots: describe what's visible (UI elements, text, code, diagrams)\n6. For documents: extract key text into `--text`\n7. Report the result to the user: \"Saved to media memory: {description}\"\n\n### On Search (when user asks about past media)\n1. Use `--mode hybrid` by default (combines semantic + metadata)\n2. Add `--type` filter when user specifies media kind\n3. Add `--tags` filter when user mentions categories\n4. Add date filters when user references timeframes (\"last week\", \"this month\")\n5. Show results with descriptions and asset paths\n6. Offer to open/display the asset if it's an image\n\n### Proactive Recall\nWhen a conversation topic overlaps with stored media:\n1. Run a quick semantic search with the current topic\n2. If relevant results are found (similarity > 0.7), mention: \"I found a related {type} in media memory: {description}\"\n3. 
Don't be noisy — only surface genuinely relevant assets\n\n## Environment\n- **No API key needed** — uses ChromaDB's built-in local embeddings (all-MiniLM-L6-v2 via onnxruntime)\n- Everything runs locally, zero external calls\n- ChromaDB: local persistent storage, cosine similarity\n- Model cached at `~/.cache/chroma/onnx_models/` (downloaded once on first use)\n\n## Metadata Schema\n| Field | Type | Description |\n|-------|------|-------------|\n| id | string | Auto-generated: `{type}_{hash}_{stem}` |\n| filename | string | Original filename |\n| type | string | image, video, audio, document, file |\n| timestamp | ISO 8601 | When ingested |\n| source | string | user, generated, url, ingested |\n| description | string | Natural language description |\n| extracted_text | string | OCR / transcript / content |\n| tags | JSON array | Semantic tags |\n| original_path | string | Where it came from |\n| asset_path | string | Path in assets/ |\n| embedded | boolean | Whether vector is in ChromaDB |","tags":["media","memory","coco","rkz91","agent-skills","agents-md","ai-agents","claude-code","codex","cursor","developer-tools","llm-tools"],"capabilities":["skill","source-rkz91","skill-media-memory","topic-agent-skills","topic-agents-md","topic-ai-agents","topic-claude-code","topic-codex","topic-cursor","topic-developer-tools","topic-llm-tools","topic-mcp","topic-pm-tools","topic-product-management","topic-productivity"],"categories":["coco"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/rkz91/coco/media-memory","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add rkz91/coco","source_repo":"https://github.com/rkz91/coco","install_from":"skills.sh"}},"qualityScore":"0.453","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 7 github stars · SKILL.md body (3,717 
chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:14:07.521Z","embedding":null,"createdAt":"2026-05-18T13:21:40.008Z","updatedAt":"2026-05-18T19:14:07.521Z","lastSeenAt":"2026-05-18T19:14:07.521Z","tsv":"'-01':148 '-03':147,151 '-28':152 '/.cache/chroma/onnx_models':414 '/.claude/media-memory':33,56,85,94,132,164,174 '/media-memory':19 '0.7':358 '1':189,275,342 '10':154,170 '2':17,50,197,284,352 '2026':146,150 '3':210,293,369 '4':218,301 '5':235,313 '6':247,321 '7':255 '8601':446 'access':25 'accur':221 'add':285,294,302 'all-minilm-l6-v2':392 'alway':198,211 'api':381 'architectur':143 'array':468 'ask':271 'asset':57,194,319,326,378,478,483 'audio':12,42,117,441 'auto':428 'auto-gener':427 'bash':92,130,162,172 'behavior':179 'boolean':485 'built':388 'built-in':387 'bulk':233 'cach':412 'call':404 'came':476 'categor':217 'categori':300 'cd':93,131,163,173 'chroma':61 'chromadb':18,53,62,385,405,490 'claude/ai':227 'code':245 'combin':281 'command':80,82 'content':125,209,465 'convers':336 'copi':190 'cosin':409 'creat':228 'current':350 'date':303 'default':280 'describ':238 'descript':103,106,200,206,266,317,368,424,455,459 'diagram':144,246 'directori':54 'document':118,249,442 'download':231,415 'element':243 'emb':6,91 'embed':16,49,72,391,484 'environ':379 'everi':36 'everyth':399 'extern':403 'extract':121,250,460 'field':422 'file':13,43,60,119,192,443 'filenam':433,436 'filter':76,287,296,304 'first':418 'found':356,361 'gemini':15,48 'generat':100,187,226,429,452 'genuin':376 'hash':431 'hybrid':127,156,278 'id':425 'imag':10,40,115,139,331,439 'import':234 'ingest':5,71,89,102,182,232,448,454 'ingest.py':70,196 'iso':445 'item':161 'json':159,467 'key':251,382 'kind':292 
'l6':395 'languag':105,205,458 'last':309 'layout':55 'limit':153,169 'local':390,401,406 'media':2,9,39,59,109,188,264,274,291,341,366 'memori':4,21,30,265,367 'mention':299,359 'meta':1 'metadata':46,68,78,129,158,283,420 'metadata.db':65 'minilm':394 'mode':155,277 'model':79,411 'month':312 'multimod':3,20,29 'natur':104,204,457 'need':383 'noisi':373 'ocr':463 'offer':322 'onnxruntim':398 'open/display':324 'origin':435,471 'overlap':338 'past':273 'path':320,472,479,481 'persist':28,407 'piec':37 'proactiv':332 'provid':199,212 'queri':137 'quick':345 'recal':333 'recent':160,168 'refer':307 'relat':363 'relev':213,354,377 'report':256 'result':258,315,355 'rich':45,203 'rule':180 'run':83,88,96,134,166,176,343,400 'save':262 'schema':421 'schema.py':77 'screenshot':237 'script':69 'scripts/ingest.py':97 'scripts/search.py':135,167,177 'search':8,74,126,136,268,347 'search.py':73 'semant':128,157,216,282,346,469 'send':185 'sent':224 'set':219 'show':314 'similar':357,410 'skill' 'skill-media-memory' 'sourc':98,140,220,449 'source-rkz91' 'specifi':290 'sqlite':66 'stat':171,178 'stem':432 'storag':408 'store':35,58,64,90,340 'string':426,434,438,450,456,462,473,480 'structur':67 'surfac':375 'system':22,31 'tag':110,142,214,295,466,470 'tag1':111 'tag2':112 'tag3':113 'text':120,122,244,252,254,461 'timefram':308 'timestamp':444 'topic':337,351 'topic-agent-skills' 'topic-agents-md' 'topic-ai-agents' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-developer-tools' 'topic-llm-tools' 'topic-mcp' 'topic-pm-tools' 'topic-product-management' 'topic-productivity' 'transcript':124,464 'type':114,138,286,364,423,430,437 'ui':242 'url':101,230,453 'use':86,276,384,419 'user':99,141,184,222,223,261,270,289,298,306,451 'uv':87,95,133,165,175 'v2':396 'vector':51,63,487 'via':195,397 'video':11,41,116,440 'visibl':241 'week':310 'whether':486 
'zero':402","prices":[{"id":"95982371-2d2d-4b2d-9153-084c07208ffa","listingId":"4d911098-3c52-46c6-a8fd-51e83d396ead","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"rkz91","category":"coco","install_from":"skills.sh"},"createdAt":"2026-05-18T13:21:40.008Z"}],"sources":[{"listingId":"4d911098-3c52-46c6-a8fd-51e83d396ead","source":"github","sourceId":"rkz91/coco/media-memory","sourceUrl":"https://github.com/rkz91/coco/tree/main/skills/media-memory","isPrimary":false,"firstSeenAt":"2026-05-18T13:21:40.008Z","lastSeenAt":"2026-05-18T19:14:07.521Z"}],"details":{"listingId":"4d911098-3c52-46c6-a8fd-51e83d396ead","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"rkz91","slug":"media-memory","github":{"repo":"rkz91/coco","stars":7,"topics":["agent-skills","agents-md","ai","ai-agents","claude-code","codex","cursor","developer-tools","llm-tools","mcp","pm-tools","product-management","productivity","prompt-engineering","workflow-automation"],"license":"mit","html_url":"https://github.com/rkz91/coco","pushed_at":"2026-04-26T01:51:27Z","description":"Open-source library of AI superpowers — 59 skills, 34 commands, 10 agents + 24 GSD subagents, 3 system bundles. An entire team, wherever your AI lives. 
Vendor-neutral across Claude Code, Cursor, Codex, and any AGENTS.md tool.","skill_md_sha":"8a75849d06c3a2dec7a23c9f817ddf2d50e52f72","skill_md_path":"skills/media-memory/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/rkz91/coco/tree/main/skills/media-memory"},"layout":"multi","source":"github","category":"coco","frontmatter":{"name":"meta:media","description":"Multimodal memory — ingest, embed, and search media (images, video, audio, files) with Gemini Embedding 2 + ChromaDB"},"skills_sh_url":"https://skills.sh/rkz91/coco/media-memory"},"updatedAt":"2026-05-18T19:14:07.521Z"}}