Skill quality: 0.45

Catch silent agent regressions by diffing outputs and tool traces in CI with eval-view

Snapshot agent behavior, compare outputs and tool-call paths, and block releases when a model or prompt change quietly shifts behavior.

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/agentskillexchange/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view

What it does

Snapshot agent behavior, compare outputs and tool-call paths, and block releases when a model or prompt change quietly shifts behavior.
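
The snapshot-and-compare idea above can be illustrated with a minimal sketch. This is not eval-view's API: the `diff_run` and `tool_path` helpers, the `{"output": ..., "trace": [...]}` snapshot shape, and the `min_similarity` threshold are all hypothetical and use only the Python standard library.

```python
from difflib import SequenceMatcher, unified_diff


def tool_path(trace):
    """Return the ordered list of tool names from a list of call records."""
    return [call["tool"] for call in trace]


def diff_run(baseline, candidate, min_similarity=0.8):
    """Compare a candidate run against a stored baseline snapshot.

    Both arguments are dicts with an "output" string and a "trace" list of
    {"tool": name} records. This shape is an assumption for illustration.
    """
    base_path = tool_path(baseline["trace"])
    cand_path = tool_path(candidate["trace"])
    # Line-oriented diff of the tool-call sequence (empty when identical).
    path_diff = list(
        unified_diff(base_path, cand_path, "baseline", "candidate", lineterm="")
    )
    # Character-level similarity of the final outputs, in the range [0, 1].
    similarity = SequenceMatcher(
        None, baseline["output"], candidate["output"]
    ).ratio()
    tools_changed = base_path != cand_path
    return {
        "tools_changed": tools_changed,
        "path_diff": path_diff,
        "output_similarity": similarity,
        "regressed": tools_changed or similarity < min_similarity,
    }


# Hypothetical snapshot and a new run whose text is unchanged but whose
# tool path silently gained an extra call.
baseline = {
    "output": "Refund issued for order 1042.",
    "trace": [{"tool": "lookup_order"}, {"tool": "issue_refund"}],
}
candidate = {
    "output": "Refund issued for order 1042.",
    "trace": [
        {"tool": "lookup_order"},
        {"tool": "check_policy"},
        {"tool": "issue_refund"},
    ],
}

result = diff_run(baseline, candidate)
print("\n".join(result["path_diff"]))
print("regressed:", result["regressed"])
```

In this example the final text is identical, so an output-only comparison would pass; the regression shows up only in the tool-call path. A CI job could exit non-zero when `regressed` is true to block the release.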

Prerequisites

Python environment, eval-view installation, repeatable agent scenarios or tests, CI runner or local shell, supported agent stack under test

Installation

Basic usage or getting-started notes:

The loop closes: detection → investigation → graded verdict → quarantine governance → broadcast. You wake up, run progress, triage with drift, confirm with check --statistical, and the team sees the digest before...
📉 DRIFTING: Trend sliding with graded confidence (low/med/high). Run evalview drift <test>
🔎 INVESTIGATE: Verdict layer wants statistical replay. Run evalview check --statistical 5
Source: https://github.com/hidai25/eval-view
Extracted from upstream docs: https://raw.githubusercontent.com/hidai25/eval-view/HEAD/README.md
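
The excerpt mentions confirming a suspected regression with a statistical replay. How eval-view implements this is not described in this listing, so the sketch below only illustrates the general idea of replaying a test several times and grading the result. The `replay_verdict` function, the `fail_threshold` parameter, and the verdict labels are hypothetical.

```python
def replay_verdict(run_test, replays=5, fail_threshold=0.6):
    """Replay a test several times and grade the outcome.

    run_test is a zero-argument callable returning True on pass; it stands
    in for whatever executes one agent scenario (an assumption).
    """
    results = [bool(run_test()) for _ in range(replays)]
    failure_rate = results.count(False) / replays
    if failure_rate == 0:
        verdict = "stable"      # never failed across the replays
    elif failure_rate >= fail_threshold:
        verdict = "regression"  # fails most of the time
    else:
        verdict = "flaky"       # fails intermittently
    return {"failure_rate": failure_rate, "verdict": verdict}


# Deterministic stand-in for an agent test: one failure in five replays.
outcomes = iter([True, False, True, True, True])
verdict = replay_verdict(lambda: next(outcomes))
print(verdict)
```

Replaying before acting separates a single noisy failure from a consistent behavior change, which is the distinction a graded verdict needs.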

Documentation

https://github.com/hidai25/eval-view

Source

Agent Skill Exchange

Capabilities

skill · source-agentskillexchange · skill-catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view · topic-agent-skills · topic-ai-agents · topic-ai-tools · topic-awesome-list · topic-claude-code · topic-codex · topic-cursor · topic-llm · topic-mcp · topic-npx-skills · topic-openclaw · topic-skills-catalog

Install

Install: npx skills add agentskillexchange/skills

Source: https://github.com/agentskillexchange/skills/tree/main/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view

skills.sh: https://skills.sh/agentskillexchange/skills/catch-silent-agent-regressions-by-diffing-outputs-and-tool-traces-in-ci-with-eval-view

Transport: skills-sh

Protocol: skill

Quality

0.45 / 1.00

Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,238 chars)

Provenance

Indexed from: github

Enriched: 2026-05-18 19:09:46Z · deterministic:skill-github:v1 · v1

First seen: 2026-05-18

Last seen: 2026-05-18

Agent access

JSON: https://clawmart.sh/api/listings/VSU4cb