Skill · quality 0.45

Run repeatable agent evaluation suites with trajectory and simulator coverage using Strands Evals

Build repeatable evaluation experiments for agents and LLM apps with output checks, trajectory scoring, simulators, and trace-based review.

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/agentskillexchange/skills/run-repeatable-agent-evaluation-suites-with-trajectory-and-simulator-coverage-using-strands-evals

What it does

Build repeatable evaluation experiments for agents and LLM apps with output checks, trajectory scoring, simulators, and trace-based review.
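As a rough illustration of the idea, the sketch below runs a fixed set of cases against a stand-in agent and scores both the final output and the tool trajectory. It is plain Python and does not use the Strands Evals API; Case, stub_agent, output_check, trajectory_score, and run_suite are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Case:
    """One evaluation case (hypothetical structure, not the Strands Evals class)."""
    name: str
    prompt: str
    expected_substring: str
    expected_tools: list[str] = field(default_factory=list)


def stub_agent(prompt: str) -> tuple[str, list[str]]:
    """Deterministic stand-in for a real agent: returns (answer, tools called)."""
    if "weather" in prompt.lower():
        return "It is sunny in Paris.", ["geocode", "weather_lookup"]
    return "I do not know.", []


def output_check(answer: str, case: Case) -> float:
    """Output check: 1.0 if the expected substring appears in the answer."""
    return 1.0 if case.expected_substring.lower() in answer.lower() else 0.0


def trajectory_score(tools: list[str], case: Case) -> float:
    """Trajectory score: share of expected tools found by a greedy in-order match."""
    if not case.expected_tools:
        # No tools expected: any tool call counts as a miss.
        return 1.0 if not tools else 0.0
    remaining = iter(tools)
    # `step in remaining` consumes the iterator, so call order is enforced.
    matched = sum(1 for step in case.expected_tools if step in remaining)
    return matched / len(case.expected_tools)


def run_suite(cases: list[Case], agent) -> list[dict]:
    """Run every case once and collect per-case scores."""
    results = []
    for case in cases:
        answer, tools = agent(case.prompt)
        results.append({
            "case": case.name,
            "output": output_check(answer, case),
            "trajectory": trajectory_score(tools, case),
        })
    return results


CASES = [
    Case("weather", "What is the weather in Paris?", "sunny",
         ["geocode", "weather_lookup"]),
    Case("unknown", "Who won the match?", "do not know"),
]

results = run_suite(CASES, stub_agent)
for row in results:
    print(row)
```

Because the stand-in agent is deterministic, rerunning the suite yields identical scores. With a real model behind the agent, repeatability would depend on how sampling and judge models are configured.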

Prerequisites

Python 3.10+, pip, optional judge-model access

Installation

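Use the upstream install path that matches your environment. The first command installs the released package from PyPI. The -e variants are editable installs and must be run from inside a checkout of the upstream repository; the test and dev extras presumably pull in test and development dependencies.

pip install strands-agents-evals
pip install -e .
pip install -e ".[test]"
pip install -e ".[test,dev]"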
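After installing, a quick environment check can confirm the prerequisites. The sketch below uses only the Python standard library; the distribution name strands-agents-evals is taken from the pip command above, and the helper names installed_version and python_is_supported are hypothetical.

```python
import sys
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def python_is_supported(version_info=sys.version_info) -> bool:
    """Check the interpreter against the listing's stated minimum, Python 3.10."""
    return tuple(version_info[:2]) >= (3, 10)


print("Python 3.10+:", python_is_supported())
print("strands-agents-evals:",
      installed_version("strands-agents-evals") or "not installed")
```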

Requirements and caveats from upstream:

Supported Python versions: see the PyPI listing for strands-agents-evals.
Related project: Strands Agents Python SDK, https://github.com/strands-agents/sdk-python

Basic usage or getting-started notes:

Multiple Evaluation Types: Output evaluation, trajectory analysis, tool usage assessment, and interaction evaluation
from strands import Agent
Source: https://github.com/strands-agents/evals
Extracted from upstream docs: https://raw.githubusercontent.com/strands-agents/evals/HEAD/README.md
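To make the interaction and tool-usage evaluation types more concrete, here is a small library-free sketch of a simulator-driven conversation. It is an assumption-laden illustration, not the Strands Evals simulator API: ScriptedUserSimulator, booking_agent, simulate, and interaction_report are hypothetical names, and the agent is a hard-coded stub.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    user: str
    agent: str
    tools: list[str]


class ScriptedUserSimulator:
    """Hypothetical simulator: replays fixed user messages until the goal is met."""

    def __init__(self, messages: list[str], goal_phrase: str):
        self._messages = list(messages)
        self._goal_phrase = goal_phrase.lower()

    def next_message(self, last_reply):
        """Return the next user message, or None when the conversation should end."""
        if last_reply is not None and self._goal_phrase in last_reply.lower():
            return None  # goal reached, stop talking
        if not self._messages:
            return None  # script exhausted
        return self._messages.pop(0)


def booking_agent(message: str, history: list[Turn]) -> tuple[str, list[str]]:
    """Stub agent (history is unused here): returns (reply, tools called)."""
    text = message.lower()
    if "book" in text and "table" in text:
        return "For how many people?", []
    if any(ch.isdigit() for ch in text):
        return "Your booking is confirmed.", ["reserve_table"]
    return "Could you rephrase that?", []


def simulate(simulator, agent, max_turns: int = 5) -> list[Turn]:
    """Alternate simulator and agent turns, capped at max_turns."""
    history: list[Turn] = []
    reply = None
    for _ in range(max_turns):
        message = simulator.next_message(reply)
        if message is None:
            break
        reply, tools = agent(message, history)
        history.append(Turn(message, reply, tools))
    return history


def interaction_report(history, goal_phrase: str, allowed_tools: list[str]) -> dict:
    """Summarize turn count, goal completion, and any tools outside the allow-list."""
    used = [tool for turn in history for tool in turn.tools]
    return {
        "turns": len(history),
        "goal_reached": any(goal_phrase.lower() in t.agent.lower() for t in history),
        "unexpected_tools": sorted(set(used) - set(allowed_tools)),
    }


simulator = ScriptedUserSimulator(
    ["I want to book a table", "For 4 people at 7pm"], goal_phrase="confirmed"
)
history = simulate(simulator, booking_agent)
report = interaction_report(history, "confirmed", allowed_tools=["reserve_table"])
print(report)
```

A scripted simulator keeps the run reproducible; a model-backed simulator would cover more varied user behavior at the cost of determinism.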

Documentation

https://github.com/strands-agents/evals

Source

Agent Skill Exchange

Capabilities

skill, source-agentskillexchange, skill-run-repeatable-agent-evaluation-suites-with-trajectory-and-simulator-coverage-using-strands-evals, topic-agent-skills, topic-ai-agents, topic-ai-tools, topic-awesome-list, topic-claude-code, topic-codex, topic-cursor, topic-llm, topic-mcp, topic-npx-skills, topic-openclaw, topic-skills-catalog

Install

Install: npx skills add agentskillexchange/skills

Source: https://github.com/agentskillexchange/skills/tree/main/skills/run-repeatable-agent-evaluation-suites-with-trajectory-and-simulator-coverage-using-strands-evals

skills.sh: https://skills.sh/agentskillexchange/skills/run-repeatable-agent-evaluation-suites-with-trajectory-and-simulator-coverage-using-strands-evals

Transport: skills-sh

Protocol: skill

Quality

0.45 / 1.00

Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,346 chars)

Provenance

Indexed from: github

Enriched: 2026-05-18 19:12:13Z · deterministic:skill-github:v1 · v1

First seen: 2026-05-18

Last seen: 2026-05-18

Agent access

JSON: https://clawmart.sh/api/listings/TfM8Ua