{"id":"1485c946-bfe7-48b0-89b1-1361dbff5a35","shortId":"G8uMnH","kind":"skill","title":"tune-monitor","tagline":"Analyze a Monte Carlo monitor and recommend config changes to reduce alert noise. Supports metric, custom SQL, validation, and table monitors. Fetches the report, identifies patterns, and suggests tuning.","description":"# Tune Monitor: Noise Reduction Analysis\n\nYou are a Monte Carlo monitor tuning agent. Your job is to fetch a monitor's report, dump it to\na file for reference, analyze the alert patterns, and recommend concrete configuration changes to\nreduce noise without sacrificing real signal.\n\n**Arguments:** $ARGUMENTS\n\nReference files live next to this skill file. **Use the Read tool** (not MCP resources) to access\nthem:\n\n- Metric monitor tuning: `references/metric-monitor.md` (relative to this file)\n- Custom SQL monitor tuning: `references/custom-sql-monitor.md` (relative to this file)\n- Validation monitor tuning: `references/validation-monitor.md` (relative to this file)\n- Table monitor tuning: `references/table-monitor.md` (relative to this file)\n\n---\n\n## Prerequisites\n\n- **Required:** Monte Carlo MCP server (`monte-carlo-mcp`) must be configured and authenticated\n\n---\n\n## Available MCP tools\n\n| Tool | Purpose |\n|---|---|\n| `get_monitor_report` | Fetch a monitor's alert history, incident details, and troubleshooting summaries |\n| `get_monitors` | Fetch monitor configuration (type, thresholds, schedule, segments) |\n| `create_metric_monitor` | Update a metric monitor's configuration (used in Phase 5) |\n| `create_custom_sql_monitor` | Update a custom SQL monitor's configuration (used in Phase 5) |\n| `create_validation_monitor` | Update a validation monitor's configuration (used in Phase 5) |\n| `tune_freshness_table_monitor` | Tune freshness sensitivity/threshold for a table (used in Phase 5) |\n| `tune_volume_change_table_monitor` | Tune volume change sensitivity/threshold for a table (used in Phase 5) |\n| 
`tune_unchanged_size_table_monitor` | Tune unchanged size sensitivity/threshold for a table (used in Phase 5) |\n\n---\n\n## Phase 0: Validate Input\n\nExtract the monitor UUID from `$ARGUMENTS`. It must be a valid UUID (format:\n`xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`).\n\nIf no UUID is provided or it doesn't look like a UUID, stop and tell the user:\n\n> Please provide a monitor UUID. Example: `/tune-monitor 94c2dd3a-ef49-40f8-b1c1-741ba057cabf`\n\n---\n\n## Phase 1: Fetch Monitor Report\n\nCall `get_monitor_report` with:\n- `monitor_uuid`: the UUID from `$ARGUMENTS`\n- `max_incidents`: 50\n\nIn parallel, fetch the monitor's full config via `get_monitors` with:\n- `monitor_ids`: [`{monitor_uuid}`]\n- `include_fields`: [`config`]\n\nIf `get_monitor_report` returns an error or an empty result, tell the user the monitor was not found and stop.\n\n---\n\n## Phase 1.5: Determine Monitor Type and Load Reference\n\nFrom the `get_monitors` config response, determine the monitor type:\n\n| Config indicator | Type | Reference file |\n|---|---|---|\n| Monitor type is a metric monitor variant (e.g., metric, field health) | Metric | `references/metric-monitor.md` |\n| Monitor type is a custom SQL rule / custom monitor | Custom SQL | `references/custom-sql-monitor.md` |\n| Monitor type is a validation rule / validation monitor | Validation | `references/validation-monitor.md` |\n| Monitor type is a table monitor (freshness, volume, schema across tables) | Table | `references/table-monitor.md` |\n\n**Read** the appropriate reference file using the Read tool with the path relative to this skill\nfile. The reference contains type-specific config fields to extract, recommendation guidance, and\napply-changes instructions.\n\nIf the monitor type is not metric, custom SQL, validation, or table, stop and tell the user:\n\n> This skill supports tuning metric, custom SQL, validation, and table monitors. 
This monitor\n> is a {type} monitor, which is not supported.\n\n---\n\n## Phase 2: Analyze the Report\n\nAnalyze the monitor report and config together. Focus on:\n\n### 2a. Alert volume & frequency\n- How many incidents in the last 30 days? Last 7 days?\n- What is the firing cadence — multiple times per day? Daily? Sporadic?\n- Are incidents clustered in time (bursts) or spread evenly?\n\n### 2b. Anomaly patterns\n- Which segments (field values) are firing most? Are they the same segments repeatedly?\n- Are anomalies consistently marginal (just above threshold) or severe?\n- Are any anomalies from sparse/bursty event types that naturally spike?\n- Are anomalies caused by known operational events (deployments, batch jobs, bulk user actions)?\n- For validation monitors: how many invalid rows per incident? Is the count stable or growing?\n- For table monitors: which (table, metric) pairs are firing most? Are they the same repeatedly?\n\n### 2c. Current configuration\nExtract the current configuration. The specific fields to look for are documented in the per-type\nreference loaded in Phase 1.5. At minimum, extract:\n- Monitor type and what it measures\n- Schedule interval\n- Audiences / notification channels\n- Whether the monitor uses ML thresholds or explicit thresholds\n\n### 2d. Troubleshooting analysis (if available)\nLook at any troubleshooting TL;DRs in the report. Note:\n- Are most anomalies assessed as \"likely normal data variation\"?\n- Are there recurring root causes?\n- Is there a blind spot (e.g., no upstream metadata)?\n\n---\n\n## Phase 3: Generate Recommendations\n\nBased on the analysis, produce a prioritized list of recommendations. 
For each recommendation:\n- State the **problem** it solves\n- Give the **specific config change** (use exact field names from the MC config schema)\n- Explain the **trade-off** (what signal might be lost)\n\n### General recommendations (all monitor types)\n\n#### Sensitivity tuning (ML thresholds only)\nThis applies to any monitor that uses ML thresholds — both metric monitors and custom SQL monitors.\nSkip this section for validation monitors (they don't use ML thresholds), for table monitors\n(they have their own per-metric sensitivity — see the table monitor reference), and for monitors\nwith explicit thresholds (for custom SQL monitors, see threshold adjustment in the per-type\nreference instead).\n\n- If anomalies are consistently marginal (observed value just barely above threshold) AND assessed\n  as normal variation → recommend lowering sensitivity one step:\n  - If current sensitivity is `HIGH` → recommend `\"sensitivity\": \"medium\"`\n  - If current sensitivity is `MEDIUM` or `AUTO` → recommend `\"sensitivity\": \"low\"`\n- If current sensitivity is already `LOW` and still noisy → note this isn't a sensitivity issue\n\n#### Schedule / interval\n- If the monitor fires multiple times per day but anomalies always resolve within hours → recommend\n  increasing schedule interval (e.g., from 720 min to 1440 min) to reduce duplicate alerts\n- If anomalies are caused by data arriving late → recommend increasing `collection_lag`\n\n#### Snooze / training period\n- If the monitor was recently created (<30 days) and is still learning patterns → recommend\n  waiting for the model to stabilize before tuning\n\n#### Audience / notification routing\n- If the monitor has no audiences configured and is generating noise → recommend adding audiences\n  only for high-severity anomalies, or removing notifications entirely for known-noisy monitors\n\n### Type-specific recommendations\n\nFor type-specific recommendations (WHERE conditions, segment exclusion, aggregation 
changes,\nthreshold adjustment, SQL modifications, alert condition modifications, per-table-metric\nsensitivity tuning), follow the guidance in the per-type reference loaded in Phase 1.5.\n\n---\n\n## Phase 4: Present the Report\n\nOutput a structured analysis. **This is the primary output — include it in full.**\n\n````markdown\n## Monitor Tune Report: {monitor_uuid}\n\n**Monitor:** {display_name or mac_name}\n**Type:** {monitor type — metric, custom SQL, validation, or table}\n**Table:** {table}\n**What it monitors:** {metric and segments, SQL query summary, validation conditions, or table/metric coverage}\n**Current sensitivity:** {sensitivity or \"AUTO (default)\" or \"N/A (explicit thresholds)\"}\n**Schedule:** every {interval_minutes / 60}h\n\n### Alert Summary (last 30 days)\n- Total alerts: {count}\n- Firing frequency: {e.g., \"~twice daily\", \"daily\", \"sporadic\"}\n- Most noisy segments: {top 2-3 segment values by alert count, or N/A for custom SQL/validation}\n- Most noisy (table, metric) pairs: {for table monitors: top pairs by anomaly count}\n\n### Root Cause Pattern\n{1-3 sentence summary of what the alerts represent — operational events, bursty data, model\nmiscalibration, genuine issues, etc.}\n\n### Recommendations\n\n#### 1. {Highest-impact change} [RECOMMENDED]\n**Problem:** ...\n**Change:**\n```yaml\n{specific config field}: {new value}\n```\n**Trade-off:** ...\n\n#### 2. {Second change} [OPTIONAL]\n...\n\n#### 3. {Third change} [OPTIONAL]\n...\n\n### What NOT to change\n{Any configurations that look correct and should be left alone — avoid over-tuning.}\n\n### If these changes are made\n{Predict the expected outcome: estimated alert reduction, what genuine anomalies would still fire.}\n````\n\n**Next step:** \"Want me to apply any of these changes to the monitor config, or explore the alert\nhistory further?\"\n\n---\n\n## Phase 5: Apply Changes (if user requests)\n\nTo apply changes, follow the apply-changes instructions in the per-type reference loaded in\nPhase 1.5. Each reference specifies the correct tool and constraints for that monitor type.\n\nGeneral rules for all types:\n1. **Always preview first** — show the user what will change before applying.\n2. **Get explicit confirmation** before applying any change.\n\n---\n\n## Guidelines\n\n- **Be specific.** Generic advice like \"reduce sensitivity\" is less useful than exact config changes.\n- **Prefer surgical changes.** A targeted WHERE condition beats a blunt sensitivity reduction.\n- **Preserve signal.** Always explain what genuine anomalies would still be caught after tuning.\n- **Cite evidence.** Reference specific incident dates, segment values, and counts from the report.\n- **Degrade gracefully.** If troubleshooting runs are missing, note the limited context and\n  reason from alert patterns 
alone.","tags":["tune","monitor","agent","toolkit","monte-carlo-data","agent-observability","agent-skills","ai-agents","claude-code","codex-skills","cursor","data-observability"],"capabilities":["skill","source-monte-carlo-data","skill-tune-monitor","topic-agent-observability","topic-agent-skills","topic-ai-agents","topic-claude-code","topic-codex-skills","topic-cursor","topic-data-observability","topic-data-quality","topic-mcp","topic-monte-carlo","topic-opencode","topic-skill-md"],"categories":["mc-agent-toolkit"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/monte-carlo-data/mc-agent-toolkit/tune-monitor","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add monte-carlo-data/mc-agent-toolkit","source_repo":"https://github.com/monte-carlo-data/mc-agent-toolkit","install_from":"skills.sh"}},"qualityScore":"0.489","qualityRationale":"deterministic score 0.49 from registry signals: · indexed on github topic:agent-skills · 78 github stars · SKILL.md body (9,828 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-02T12:55:22.357Z","embedding":null,"createdAt":"2026-04-18T22:12:56.680Z","updatedAt":"2026-05-02T12:55:22.357Z","lastSeenAt":"2026-05-02T12:55:22.357Z","tsv":"'-3':1139,1167 '/tune-monitor':308 '0':262 '1':316,1166,1185,1309 '1.5':377,670,1047,1291 '1440':932 '2':520,1138,1202,1321 '2a':533 '2b':568 '2c':646 '2d':694 '3':733,1206 '30':543,959,1122 '4':1049 '40f8':312 '5':186,201,214,228,244,260,1267 '50':333 '60':1117 '7':546 '720':929 '741ba057cabf':314 '94c2dd3a':310 '94c2dd3a-ef49-40f8-b1c1-741ba057cabf':309 'access':96 'across':443 'action':615 'ad':990 'adjust':844,1023 'advic':1333 'agent':45 'aggreg':1020 
'alert':15,64,158,534,937,1026,1119,1125,1143,1173,1238,1263,1396 'alon':1223,1398 'alreadi':895 'also':353 'alway':919,1310,1358 'analysi':37,696,739,1056 'analyz':4,62,521,524 'anomali':569,585,595,604,711,853,918,939,997,1161,1242,1362 'appli':478,789,1251,1268,1274,1279,1320,1326 'apply-chang':477,1278 'appropri':449 'argument':78,79,270,330 'arriv':944 'assess':712,864 'audienc':682,975,983,991 'authent':145 'auto':887,1107 'avail':146,698 'avoid':1224 'b1c1':313 'bare':860 'base':736 'batch':611 'beat':1351 'blind':726 'blunt':1353 'bulk':613 'burst':564 'bursti':1177 'cadenc':552 'call':320,373 'carlo':7,42,134,139 'caught':1366 'caus':605,722,941,1164 'chang':12,70,231,236,479,758,1021,1189,1192,1204,1208,1213,1230,1255,1269,1275,1280,1318,1328,1343,1346 'channel':684 'cite':1369 'cluster':561 'collect':948 'concret':68 'condit':1017,1027,1099,1350 'config':11,359,370,388,394,470,529,757,766,1195,1259,1342 'configur':69,143,169,182,197,210,648,652,984,1215 'confirm':1324 'consist':586,855 'constraint':1299 'contain':466 'context':1392 'correct':1218,1296 'count':627,1126,1144,1162,1378 'coverag':1102 'creat':174,187,202,958 'current':647,651,874,882,892,1103 'custom':19,106,188,193,416,419,421,488,503,801,839,1082,1148 'daili':557,1131,1132 'data':716,943,1178 'date':1374 'day':544,547,556,916,960,1123 'default':1108 'degrad':1382 'deploy':610 'detail':161 'determin':378,390 'display':1073 'document':660 'doesn':291 'drs':704 'dump':55 'duplic':936 'e.g':406,728,927,1129 'ef49':311 'empti':341 'entir':1001 'error':339 'estim':1237 'etc':1183 'even':567 'event':598,609,1176 'everi':1114 'evid':1370 'exact':760,1341 'exampl':307 'exclus':1019 'expect':1235 'explain':768,1359 'explicit':692,836,1111,1323 'explor':1261 'extract':265,473,649,673 'fetch':25,50,154,167,317,354 'field':369,408,471,573,655,761,1196 'file':59,81,87,105,114,122,130,398,451,463 'fire':551,576,639,912,1127,1245 'first':1312 'focus':531 'follow':1035,1276 'format':277 'found':350 
'frequenc':536,1128 'fresh':216,220,440 'full':358,1065 'general':778,1304 'generat':734,987 'generic':1332 'genuin':1181,1241,1361 'get':151,165,321,361,386,1322 'give':754 'grace':1383 'grow':630 'guidanc':475,1037 'guidelin':1329 'h':1118 'health':409 'high':877,995 'high-sever':994 'highest':1187 'highest-impact':1186 'histori':159,1264 'hour':922 'id':365 'identifi':28 'impact':1188 'incid':160,332,539,560,624,1373 'includ':368,1062 'increas':924,947 'indic':395 'input':264 'instead':851 'instruct':480,1281 'interv':681,908,926,1115 'invalid':621 'isn':902 'issu':906,1182 'job':47,612 'known':607,1004 'known-noisi':1003 'lag':949 'last':542,545,1121 'late':945 'learn':964 'left':1222 'less':1338 'like':294,714,1334 'limit':1391 'list':743 'live':82 'load':382,667,1044,1288 'look':293,657,699,1217 'lost':777 'low':890,896 'lower':869 'mac':1076 'made':1232 'mani':538,620 'margin':587,856 'markdown':1066 'max':331 'mc':765 'mcp':93,135,140,147 'measur':679 'medium':880,885 'metadata':731 'metric':18,98,175,179,403,407,410,487,502,636,798,825,1032,1081,1092,1153 'might':775 'min':930,933 'minimum':672 'minut':1116 'miscalibr':1180 'miss':1388 'ml':689,785,795,814 'model':970,1179 'modif':1025,1028 'monitor':3,8,24,34,43,52,99,108,116,124,152,156,166,168,176,180,190,195,204,208,218,233,249,267,305,318,322,325,347,356,362,364,366,379,387,392,399,404,412,420,424,431,434,439,483,508,510,514,526,618,633,674,687,781,792,799,803,809,818,830,834,841,911,955,980,1006,1067,1070,1072,1079,1091,1157,1258,1302 'mont':6,41,133,138 'monte-carlo-mcp':137 'multipl':553,913 'must':141,272 'n/a':1110,1146 'name':762,1074,1077 'natur':601 'new':1197 'next':83,1246 'nois':16,35,73,988 'noisi':899,1005,1135,1151 'normal':715,866 'note':708,900,1389 'notif':683,976,1000 'observ':857 'one':871 'oper':608,1175 'option':1205,1209 'outcom':1236 'output':1053,1061 'over-tun':1225 'pair':637,1154,1159 'parallel':375 'path':458 'pattern':29,65,570,965,1165,1397 
'per':555,623,664,824,848,915,1030,1041,1285 'per-metr':823 'per-table-metr':1029 'per-typ':663,847,1040,1284 'period':952 'phase':185,200,213,227,243,259,261,315,376,519,669,732,1046,1048,1266,1290 'pleas':302 'predict':1233 'prefer':1344 'prerequisit':131 'present':1050 'preserv':1356 'preview':1311 'primari':1060 'priorit':742 'problem':751,1191 'produc':740 'provid':288,303 'purpos':150 'queri':1096 'read':90,447,454 'real':76 'reason':1394 'recent':957 'recommend':10,67,474,735,745,748,779,868,878,888,923,946,966,989,1010,1015,1184,1190 'recur':720 'reduc':14,72,935,1335 'reduct':36,1239,1355 'refer':61,80,383,397,450,465,666,831,850,1043,1287,1293,1371 'references/custom-sql-monitor.md':110,423 'references/metric-monitor.md':101,411 'references/table-monitor.md':126,446 'references/validation-monitor.md':118,433 'relat':102,111,119,127,459 'remov':999 'repeat':583,645 'report':27,54,153,319,323,523,527,707,1052,1069,1381 'repres':1174 'request':1272 'requir':132 'resolv':920 'resourc':94 'respons':389 'result':342 'return':337 'root':721,1163 'rout':977 'row':622 'rule':418,429,1305 'run':371,1386 'sacrif':75 'schedul':172,680,907,925,1113 'schema':442,767 'second':1203 'section':806 'see':827,842 'segment':173,572,582,1018,1094,1136,1140,1375 'sensit':783,826,870,875,879,883,889,893,905,1033,1104,1105,1336,1354 'sensitivity/threshold':221,237,253 'sentenc':1168 'server':136 'sever':592,996 'show':1313 'signal':77,774,1357 'size':247,252 'skill':86,462,499 'skill-tune-monitor' 'skip':804 'snooz':950 'solv':753 'source-monte-carlo-data' 'sparse/bursty':597 'specif':469,654,756,1009,1014,1194,1331,1372 'specifi':1294 'spike':602 'sporad':558,1133 'spot':727 'spread':566 'sql':20,107,189,194,417,422,489,504,802,840,1024,1083,1095 'sql/validation':1149 'stabil':972 'stabl':628 'state':749 'step':872,1247 'still':898,963,1244,1364 'stop':297,352,493 'structur':1055 'suggest':31 'summari':164,1097,1120,1169 'support':17,500,518 'surgic':1345 
'tabl':23,123,217,224,232,240,248,256,438,444,445,492,507,632,635,817,829,1031,1086,1087,1088,1152,1156 'table/metric':1101 'target':1348 'tell':299,343,495 'third':1207 'threshold':171,590,690,693,786,796,815,837,843,862,1022,1112 'time':554,563,914 'tl':703 'togeth':530 'tool':91,148,149,336,455,1297 'top':1137,1158 'topic-agent-observability' 'topic-agent-skills' 'topic-ai-agents' 'topic-claude-code' 'topic-codex-skills' 'topic-cursor' 'topic-data-observability' 'topic-data-quality' 'topic-mcp' 'topic-monte-carlo' 'topic-opencode' 'topic-skill-md' 'total':1124 'trade':771,1200 'trade-off':770,1199 'train':951 'troubleshoot':163,695,702,1385 'tune':2,32,33,44,100,109,117,125,215,219,229,234,245,250,501,784,974,1034,1068,1227,1368 'tune-monitor':1 'twice':1130 'type':170,380,393,396,400,413,425,435,468,484,513,599,665,675,782,849,1008,1013,1042,1078,1080,1286,1303,1308 'type-specif':467,1007,1012 'unchang':246,251 'updat':177,191,205 'upstream':730 'use':88,183,198,211,225,241,257,452,688,759,794,813,1339 'user':301,345,497,614,1271,1315 'uuid':268,276,286,296,306,326,328,367,1071 'valid':21,115,203,207,263,275,428,430,432,490,505,617,808,1084,1098 'valu':574,858,1141,1198,1376 'variant':405 'variat':717,867 'via':360 'volum':230,235,441,535 'wait':967 'want':1248 'whether':685 'within':921 'without':74 'would':1243,1363 'xxxx':280,281,282 'xxxxxxxx':279 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx':278 'xxxxxxxxxxxx':283 
'yaml':1193","prices":[{"id":"0f1b1ce1-2050-4f47-b032-e229309daff9","listingId":"1485c946-bfe7-48b0-89b1-1361dbff5a35","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"monte-carlo-data","category":"mc-agent-toolkit","install_from":"skills.sh"},"createdAt":"2026-04-18T22:12:56.680Z"}],"sources":[{"listingId":"1485c946-bfe7-48b0-89b1-1361dbff5a35","source":"github","sourceId":"monte-carlo-data/mc-agent-toolkit/tune-monitor","sourceUrl":"https://github.com/monte-carlo-data/mc-agent-toolkit/tree/main/skills/tune-monitor","isPrimary":false,"firstSeenAt":"2026-04-18T22:12:56.680Z","lastSeenAt":"2026-05-02T12:55:22.357Z"}],"details":{"listingId":"1485c946-bfe7-48b0-89b1-1361dbff5a35","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"monte-carlo-data","slug":"tune-monitor","github":{"repo":"monte-carlo-data/mc-agent-toolkit","stars":78,"topics":["agent-observability","agent-skills","ai-agents","claude-code","codex-skills","cursor","data-observability","data-quality","mcp","monte-carlo","opencode","skill-md","skillsmp","vscode"],"license":"apache-2.0","html_url":"https://github.com/monte-carlo-data/mc-agent-toolkit","pushed_at":"2026-04-30T23:25:43Z","description":"Official Monte Carlo toolkit for AI coding agents. 
Skills and plugins that bring data and agent observability — monitoring, triaging, troubleshooting, health checks  — into Claude Code, Cursor, and more.","skill_md_sha":"92aa8ece1f477e7bd7fc5a5808b29badda042368","skill_md_path":"skills/tune-monitor/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/monte-carlo-data/mc-agent-toolkit/tree/main/skills/tune-monitor"},"layout":"multi","source":"github","category":"mc-agent-toolkit","frontmatter":{"name":"tune-monitor","description":"Analyze a Monte Carlo monitor and recommend config changes to reduce alert noise. Supports metric, custom SQL, validation, and table monitors. Fetches the report, identifies patterns, and suggests tuning."},"skills_sh_url":"https://skills.sh/monte-carlo-data/mc-agent-toolkit/tune-monitor"},"updatedAt":"2026-05-02T12:55:22.357Z"}}