{"id":"624bc97d-67a3-4a67-9b83-6d15864ab518","shortId":"VSU4cb","kind":"skill","title":"Catch silent agent regressions by diffing outputs and tool traces in CI with eval-view","tagline":"Snapshot agent behavior, compare outputs and tool-call paths, and block releases when a model or prompt change quietly shifts behavior.","description":"# Catch silent agent regressions by diffing outputs and tool traces in CI with eval-view\n\nSnapshot agent behavior, compare outputs and tool-call paths, and block releases when a model or prompt change quietly shifts behavior.\n\n## Prerequisites\n\nPython environment, eval-view installation, repeatable agent scenarios or tests, CI runner or local shell, supported agent stack under test\n\n## Installation\n\nBasic usage or getting-started notes:\n- **The loop closes:** detection → investigation → graded verdict → quarantine governance → broadcast. You wake up, run progress, triage with drift, confirm with check --statistical, and the team sees the digest before...\n- 📉 **DRIFTING**: Trend sliding with graded confidence (low/med/high). Run `evalview drift <test>`.\n- 🔎 **INVESTIGATE**: Verdict layer wants statistical replay. Run `evalview check --statistical 5`.\n\n- Source: https://github.com/hidai25/eval-view\n- Extracted from upstream docs: https://raw.githubusercontent.com/hidai25/eval-view/HEAD/README.md\n\n## Documentation\n\n- https://github.com/hidai25/eval-view\n\n## Source\n\n- [Agent Skill 
Exchange](https://agentskillexchange.com/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view/)","tags":["catch","silent","agent","regressions","diffing","outputs","and","tool","traces","with","eval","view"],"capabilities":["skill","source-agentskillexchange","skill-catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,238 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:09:46.270Z","embedding":null,"createdAt":"2026-05-18T13:15:36.461Z","updatedAt":"2026-05-18T19:09:46.270Z","lastSeenAt":"2026-05-18T19:09:46.270Z","tsv":"'/hidai25/eval-view':160,171 '/hidai25/eval-view/head/readme.md':167 '/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view/)':178 '5':156 'agent':3,18,41,56,85,95,173 'agentskillexchange.com':177 
'agentskillexchange.com/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view/)':176 'basic':100 'behavior':19,38,57,76 'block':28,66 'broadcast':116 'call':25,63 'catch':1,39 'chang':35,73 'check':127,154 'ci':12,50,89 'close':109 'compar':20,58 'confid':141 'confirm':125 'detect':110 'dif':6,44 'digest':134 'doc':164 'document':168 'drift':124,136,145 'environ':79 'eval':15,53,81 'eval-view':14,52,80 'evalview':144,153 'exchang':175 'extract':161 'get':104 'getting-start':103 'github.com':159,170 'github.com/hidai25/eval-view':158,169 'govern':115 'grade':112,140 'instal':83,99 'investig':111,146 'layer':148 'local':92 'loop':108 'low/med/high':142 'model':32,70 'note':106 'output':7,21,45,59 'path':26,64 'prerequisit':77 'progress':121 'prompt':34,72 'python':78 'quarantin':114 'quiet':36,74 'raw.githubusercontent.com':166 'raw.githubusercontent.com/hidai25/eval-view/head/readme.md':165 'regress':4,42 'releas':29,67 'repeat':84 'replay':151 'run':120,143,152 'runner':90 'scenario':86 'see':132 'shell':93 'shift':37,75 'silent':2,40 'skill':174 'skill-catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view' 'slide':138 'snapshot':17,55 'sourc':157,172 'source-agentskillexchange' 'stack':96 'start':105 'statist':128,150,155 'support':94 'team':131 'test':88,98 'tool':9,24,47,62 'tool-cal':23,61 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'trace':10,48 'trend':137 'triag':122 'upstream':163 'usag':101 'verdict':113,147 'view':16,54,82 'wake':118 
'want':149","prices":[{"id":"bb443a83-8c4d-475c-b453-c51c4b15c402","listingId":"624bc97d-67a3-4a67-9b83-6d15864ab518","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:15:36.461Z"}],"sources":[{"listingId":"624bc97d-67a3-4a67-9b83-6d15864ab518","source":"github","sourceId":"agentskillexchange/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view","isPrimary":false,"firstSeenAt":"2026-05-18T13:15:36.461Z","lastSeenAt":"2026-05-18T19:09:46.270Z"}],"details":{"listingId":"624bc97d-67a3-4a67-9b83-6d15864ab518","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and 
more.","skill_md_sha":"289d2650a41cb1677a964fc70109694239d8ae4f","skill_md_path":"skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Catch silent agent regressions by diffing outputs and tool traces in CI with eval-view","description":"Snapshot agent behavior, compare outputs and tool-call paths, and block releases when a model or prompt change quietly shifts behavior."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view"},"updatedAt":"2026-05-18T19:09:46.270Z"}}