Skill · quality 0.45

meta:media

Multimodal memory — ingest, embed, and search media (images, video, audio, files) with Gemini Embedding 2 + ChromaDB

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/rkz91/coco/media-memory

What it does

/media-memory — Multimodal Memory System

You have access to a persistent multimodal memory system at ~/.claude/media-memory/. It stores every piece of media (images, video, audio, files) with rich metadata and Gemini Embedding 2 vectors in ChromaDB.

Directory Layout

~/.claude/media-memory/
  assets/          # stored media files
  chroma/          # ChromaDB vector store
  metadata.db      # SQLite structured metadata
  scripts/
    ingest.py      # ingestion + embedding
    search.py      # search with filters
    schema.py      # metadata models

Commands

All commands run from ~/.claude/media-memory/ using uv run.

Ingest (store + embed)

cd ~/.claude/media-memory && uv run scripts/ingest.py "<file_path>" \
  --source "user|generated|url|ingested" \
  --description "Natural language description of the media" \
  --tags "tag1,tag2,tag3" \
  --type "image|video|audio|document|file" \
  --text "Extracted text or transcript content"
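The ingest invocation above can also be assembled programmatically, which avoids shell-quoting problems in descriptions. This is a minimal sketch, not part of the skill: `build_ingest_command` and `run_ingest` are hypothetical helper names. Only the flags and allowed values are taken from the synopsis above.

```python
import subprocess
from pathlib import Path

# Allowed values copied from the ingest synopsis above.
SOURCES = {"user", "generated", "url", "ingested"}
TYPES = {"image", "video", "audio", "document", "file"}

# Working directory the commands are documented to run from.
MEMORY_ROOT = Path.home() / ".claude" / "media-memory"


def build_ingest_command(file_path, source, description, tags, media_type, text=None):
    """Return the argv list for scripts/ingest.py (hypothetical helper)."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    if media_type not in TYPES:
        raise ValueError(f"unknown type: {media_type!r}")
    if not description.strip():
        raise ValueError("description must not be empty")
    argv = [
        "uv", "run", "scripts/ingest.py", str(file_path),
        "--source", source,
        "--description", description,
        "--tags", ",".join(tags),
        "--type", media_type,
    ]
    if text:
        # --text is only passed when extracted text or a transcript exists.
        argv += ["--text", text]
    return argv


def run_ingest(argv):
    """Execute the command from the media-memory directory."""
    return subprocess.run(argv, cwd=MEMORY_ROOT, check=True)
```

Passing an argv list to `subprocess.run` means the description and text reach the script verbatim, with no shell interpretation of quotes or pipes.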

Search (hybrid: semantic + metadata)

cd ~/.claude/media-memory && uv run scripts/search.py "search query" \
  --type image \
  --source user \
  --tags "architecture,diagram" \
  --from "2026-03-01" \
  --to "2026-03-28" \
  --limit 10 \
  --mode "hybrid|semantic|metadata" \
  --json

Recent items

cd ~/.claude/media-memory && uv run scripts/search.py --recent --limit 10

Stats

cd ~/.claude/media-memory && uv run scripts/search.py --stats

Behavior Rules

On Ingest (when user sends or generates media)

Copy the file to assets/ via ingest.py
ALWAYS provide --description with a rich natural language description of the content
ALWAYS provide relevant --tags for semantic categorization
Set --source accurately: user (user sent it), generated (Claude/AI created it), url (downloaded), ingested (bulk import)
For screenshots: describe what's visible (UI elements, text, code, diagrams)
For documents: extract key text into --text
Report the result to the user: "Saved to media memory: {description}"

On Search (when user asks about past media)

Use --mode hybrid by default (combines semantic + metadata)
Add --type filter when user specifies media kind
Add --tags filter when user mentions categories
Add date filters when user references timeframes ("last week", "this month")
Show results with descriptions and asset paths
Offer to open/display the asset if it's an image
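The search rules above can be sketched as a small helper that turns a relative timeframe into `--from` / `--to` values and adds only the filters the user actually asked for. This is an illustration under stated assumptions: `resolve_timeframe` and `build_search_command` are hypothetical names, and treating "last week" as the previous Monday-to-Sunday week is one possible interpretation.

```python
from datetime import date, timedelta


def resolve_timeframe(phrase, today):
    """Map a relative phrase to (from_date, to_date) as ISO 8601 strings."""
    phrase = phrase.strip().lower()
    if phrase == "today":
        start = end = today
    elif phrase == "last week":
        # Assumption: the previous calendar week, Monday through Sunday.
        this_monday = today - timedelta(days=today.weekday())
        start = this_monday - timedelta(days=7)
        end = this_monday - timedelta(days=1)
    elif phrase == "this month":
        start = today.replace(day=1)
        end = today
    else:
        raise ValueError(f"unsupported timeframe: {phrase!r}")
    return start.isoformat(), end.isoformat()


def build_search_command(query, media_type=None, tags=None,
                         timeframe=None, today=None, limit=10):
    """Return the argv list for scripts/search.py, hybrid mode by default."""
    argv = ["uv", "run", "scripts/search.py", query, "--mode", "hybrid"]
    if media_type:
        argv += ["--type", media_type]
    if tags:
        argv += ["--tags", ",".join(tags)]
    if timeframe:
        start, end = resolve_timeframe(timeframe, today or date.today())
        argv += ["--from", start, "--to", end]
    argv += ["--limit", str(limit), "--json"]
    return argv
```

With `today` set to 2026-03-28, "this month" resolves to the same `--from "2026-03-01"` / `--to "2026-03-28"` range shown in the search synopsis.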

Proactive Recall

When a conversation topic overlaps with stored media:

Run a quick semantic search with the current topic
If relevant results are found (similarity > 0.7), mention: "I found a related {type} in media memory: {description}"
Don't be noisy — only surface genuinely relevant assets
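The threshold check above can be sketched as follows. ChromaDB reports distances rather than similarities, and for a collection configured with cosine space the distance is 1 minus the cosine similarity, so a similarity above 0.7 corresponds to a distance below 0.3. The input here is a dict shaped like the result of a single-query `Collection.query` call. Whether `search.py` exposes distances or similarities is not stated in this document, so treat the input shape as an assumption. `relevant_hits` and `recall_message` are hypothetical names.

```python
def relevant_hits(query_result, min_similarity=0.7):
    """Keep hits whose cosine similarity exceeds the threshold.

    `query_result` holds parallel lists nested one level deep,
    one inner list per query (only the first query is read here).
    """
    ids = query_result["ids"][0]
    distances = query_result["distances"][0]
    metadatas = query_result["metadatas"][0]
    hits = []
    for item_id, distance, meta in zip(ids, distances, metadatas):
        similarity = 1.0 - distance  # cosine distance -> cosine similarity
        if similarity > min_similarity:
            hits.append({"id": item_id, "similarity": similarity, **meta})
    return hits


def recall_message(hit):
    """Format the mention wording used in the rule above."""
    return f"I found a related {hit['type']} in media memory: {hit['description']}"
```

Returning an empty list when nothing clears the threshold keeps the "don't be noisy" rule simple to enforce: no hits, no mention.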

Environment

No API key needed — uses ChromaDB's built-in local embeddings (all-MiniLM-L6-v2 via onnxruntime)
Everything runs locally, zero external calls
ChromaDB: local persistent storage, cosine similarity
Model cached at ~/.cache/chroma/onnx_models/ (downloaded once on first use)

Metadata Schema

Field	Type	Description
id	string	Auto-generated: `{type}_{hash}_{stem}`
filename	string	Original filename
type	string	image, video, audio, document, file
timestamp	ISO 8601	When ingested
source	string	user, generated, url, ingested
description	string	Natural language description
extracted_text	string	OCR / transcript / content
tags	JSON array	Semantic tags
original_path	string	Where it came from
asset_path	string	Path in assets/
embedded	boolean	Whether vector is in ChromaDB
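As an illustration of the schema above, the sketch below builds a record and stores it in SQLite using only the standard library. Several details are assumptions, since this document does not specify them: the hash in the id is taken to be the first 8 hex characters of a SHA-256 over the file bytes, the table is named `media`, tags are serialized as a JSON string, and `embedded` is stored as 0/1. The actual `schema.py` may differ on all of these.

```python
import hashlib
import json
import sqlite3
from datetime import datetime, timezone
from pathlib import Path

# Column names follow the field table above; SQL types are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS media (
    id TEXT PRIMARY KEY,
    filename TEXT NOT NULL,
    type TEXT NOT NULL,
    timestamp TEXT NOT NULL,
    source TEXT NOT NULL,
    description TEXT,
    extracted_text TEXT,
    tags TEXT,
    original_path TEXT,
    asset_path TEXT,
    embedded INTEGER NOT NULL DEFAULT 0
)
"""


def make_id(media_type, data, filename):
    """Build an id of the form {type}_{hash}_{stem} (hash choice assumed)."""
    digest = hashlib.sha256(data).hexdigest()[:8]
    return f"{media_type}_{digest}_{Path(filename).stem}"


def make_record(media_type, data, original_path, source, description,
                tags, extracted_text=""):
    """Assemble one metadata row as a dict keyed by column name."""
    name = Path(original_path).name
    return {
        "id": make_id(media_type, data, name),
        "filename": name,
        "type": media_type,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "description": description,
        "extracted_text": extracted_text,
        "tags": json.dumps(tags),
        "original_path": str(original_path),
        "asset_path": f"assets/{name}",  # assumed layout inside assets/
        "embedded": 0,  # flipped to 1 once the vector is in ChromaDB
    }


def insert_record(conn, record):
    """Insert a record using named placeholders."""
    conn.execute(
        "INSERT INTO media VALUES (:id, :filename, :type, :timestamp, "
        ":source, :description, :extracted_text, :tags, :original_path, "
        ":asset_path, :embedded)",
        record,
    )
```

Including a content hash in the id means two different files with the same name get distinct ids, while re-ingesting identical bytes under the same name collides on the primary key.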

Capabilities

skill · source-rkz91 · skill-media-memory · topic-agent-skills · topic-agents-md · topic-ai-agents · topic-claude-code · topic-codex · topic-cursor · topic-developer-tools · topic-llm-tools · topic-mcp · topic-pm-tools · topic-product-management · topic-productivity

Install

npx skills add rkz91/coco

Source: https://github.com/rkz91/coco/tree/main/skills/media-memory

skills.sh: https://skills.sh/rkz91/coco/media-memory

Transport: skills-sh

Protocol: skill

Quality

0.45 / 1.00

Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 7 GitHub stars · SKILL.md body (3,717 chars)

Provenance

Indexed from: GitHub

Enriched: 2026-05-18 19:14:07Z · deterministic:skill-github:v1 · v1

First seen: 2026-05-18

Last seen: 2026-05-18

Agent access

JSON: https://clawmart.sh/api/listings/HBEGKN