Skillquality 0.49

monte-carlo-storage-cost-analysis

Analyze a warehouse for stale, unused, or redundant tables via the analyze_storage_costs MCP tool. Classifies waste patterns and table categories, computes safety tiers, and handles category drill-downs and lineage follow-ups.

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/monte-carlo-data/mc-agent-toolkit/storage-cost-analysis

What it does

Monte Carlo Storage Cost Analysis Skill

This skill analyzes a data warehouse for stale tables that can be removed to reduce storage costs. It delegates classification, safety scoring, and formatting to the analyze_storage_costs MCP tool, then presents the pre-formatted result verbatim and handles follow-up questions (category drill-downs, lineage checks).

Reference file (use the Read tool to access it):

Output contract and category keywords: references/output-structure.md

When to activate this skill

Activate when the user:

Asks about storage costs, waste, or cleanup opportunities
Wants to find unused, unread, or stale tables
Asks "which tables can I drop?" or "what's costing us money?"
Mentions storage optimization, cost reduction, or warehouse cleanup
Wants to identify zombie tables, dead-end pipelines, or temporary/archive tables

When NOT to activate this skill

Do not activate when the user is:

Just querying data or exploring table contents
Creating or modifying monitors (use the monitoring-advisor skill)
Investigating data quality incidents (use the prevent skill)
Looking at pipeline performance or query cost (use the performance-diagnosis skill)

Prerequisites

The following MCP tools must be available (connect to Monte Carlo's MCP server):

analyze_storage_costs -- runs the full analysis pipeline and returns pre-formatted output
get_asset_lineage -- used only for follow-up lineage checks

The analyze_storage_costs tool supports Snowflake, BigQuery, Redshift, and Databricks warehouses only. Other warehouse types are out of scope.

Workflow

Important: These steps are internal instructions for you. Do NOT expose step numbers, step names, or the procedural structure to the user. Just act naturally.

Step 1: Identify the warehouse

You need a warehouse to proceed.

If the user specified a warehouse (by name or UUID), use it.
If not: call analyze_storage_costs with no warehouse_id. The tool will either auto-pick when only one supported warehouse exists, or return a list of supported warehouses — let the user choose one, then call the tool again with the chosen warehouse_id.

Step 2: Run the analysis

Call analyze_storage_costs with:

warehouse_id: the warehouse UUID

The tool fetches candidates, classifies them into waste patterns (Unread, Write-only, Dead-end, Static waste, Zombie, Other stale) and table categories (Temporary, Archive/Snapshot, Production, Other), computes safety tiers, and returns a formatted analysis.

If the tool returns an error, report it to the user and stop.
If no candidates are found, tell the user and stop.

Step 3: Present the initial summary

The tool output contains two regions:

A  block with a condensed summary, a Top-N table, and a drill-down prompt.
A  block with per-category tables wrapped in  markers. Do NOT present these yet.

Present ONLY the  block — copy it verbatim, preserving every column, row, and value. Add a brief intro sentence if needed, then paste the block unchanged. The user will see the summary and top tables, then choose a category to drill into.

CRITICAL — do NOT call any other tool after analyze_storage_costs succeeds. No search, no get_table, no troubleshooting agents, no cross-checks. The analysis result IS the final answer; your only remaining job is to present the  block verbatim.

CRITICAL — preserve markdown-linked MCONs verbatim. The pre-formatted tables already contain properly linked MCONs (e.g., [`db:schema.table`](https://getmontecarlo.com/assets/MCON++...)). Never output bare MCON strings as plain text.

Step 4: Handle follow-up requests

Category drill-downs. When the user asks about a specific category ("show me temporary tables", "what about production?", "tell me more about archive"):

Find the matching  section in the analyze_storage_costs result already in the conversation. Do NOT re-invoke analyze_storage_costs — the data is already there.
Present that section's content verbatim — every column, row, and value.
After presenting, remind the user of remaining categories they haven't explored yet.

Category keywords (see references/output-structure.md for the full list):

"temporary", "staging", "tmp", "stg" → CATEGORY:temporary
"archive", "snapshot", "backup", "old" → CATEGORY:archive_snapshot
"uncategorized", "other", "unknown" → CATEGORY:other
"production", "prod", "critical", "important" → CATEGORY:production

If the user says "show me everything" or "all categories", present all category sections in order: temporary → archive → uncategorized → production.

Lineage checks. When the user asks what consumes a specific table ("check lineage for X", "is it safe to remove Y?", "what depends on this table?"):

Call get_asset_lineage with mcons: [<table mcon>] and direction: "DOWNSTREAM".
If has_relationships: false → the table's consumers are likely BI dashboards or tools (not other tables). Mention this — it may still be safe to remove, but the user should verify with dashboard owners.
If downstream tables exist AND are also stale → recommend removing both.
If downstream tables are active → flag as risky, do NOT recommend removal.

Note: The N consumers flag in the Usage & Risk column counts ALL consumers, including BI dashboards (Looker, Tableau, Power BI) and other non-table assets. The lineage tool only returns table-to-table edges, so lineage results may show fewer consumers than the count. When that happens, explain the gap to the user.

Reading the Usage & Risk column

Each row's final Usage & Risk cell combines read-side activity with risk flags. Format:

{activity}                          # no flags fire
{activity}; {flag1, flag2, ...}     # one or more flags fire

Activity values (always present):

No reads -- no recorded reads
180d · 0 reads -- last read N days ago, zero total reads
2d · 580 reads / 14 users -- recent reads, total reads and distinct reading users

A low days since read is only meaningful when paired with the read count — a single backup job or security scanner can make a cold table look "1d". Always weigh staleness against reads + users.

Risk flags (appended after ; in this fixed order when any fire):

high criticality / medium criticality -- pre-computed criticality
N consumers -- has active consumers (tables, views, or BI dashboards); verify before removing
high importance score -- is_important is a thresholded importance_score ≥ 0.6 computed upstream in Databricks, not a user-applied tag
has monitors -- actively monitored by Monte Carlo

Table categories

Tables are automatically classified for prioritized review:

Temporary/Staging -- Short-lived ETL/test tables (safest to drop)
Archive/Snapshot -- Historical copies, date-suffixed tables (verify retention policies)
Production -- Monitored, critical, or lineage-important tables (highest risk)
Other -- No strong signal either way (needs manual review)

Scope limitations

Storage costs only -- not compute, query optimization, or billing
One warehouse per analysis
Snowflake, BigQuery, Redshift, and Databricks only
Recommendations only -- never execute DROP TABLE or destructive actions

Capabilities

skillsource-monte-carlo-dataskill-storage-cost-analysistopic-agent-observabilitytopic-agent-skillstopic-ai-agentstopic-claude-codetopic-codex-skillstopic-cursortopic-data-observabilitytopic-data-qualitytopic-mcptopic-monte-carlotopic-opencodetopic-skill-md

Install

Installnpx skills add monte-carlo-data/mc-agent-toolkit

Sourcehttps://github.com/monte-carlo-data/mc-agent-toolkit/tree/main/skills/storage-cost-analysis

skills.shhttps://skills.sh/monte-carlo-data/mc-agent-toolkit/storage-cost-analysis

Transportskills-sh

Protocolskill

Quality

0.49/ 1.00

deterministic score 0.49 from registry signals: · indexed on github topic:agent-skills · 78 github stars · SKILL.md body (7,531 chars)

Provenance

Indexed fromgithub

Enriched2026-05-02 12:55:22Z · deterministic:skill-github:v1 · v1

First seen2026-04-18

Last seen2026-05-02

Agent access

JSONhttps://clawmart.sh/api/listings/CjhHnC