{"id":"f5403c93-e7ea-4cc3-be6a-cafa00cb54d7","shortId":"mV6K8p","kind":"skill","title":"Grade agent trajectories and tool-use decisions with AgentEvals","tagline":"Score whether an agent took a sensible intermediate path, called tools correctly, and reached the outcome without relying only on final-answer checks.","description":"# Grade agent trajectories and tool-use decisions with AgentEvals\n\nScore whether an agent took a sensible intermediate path, called tools correctly, and reached the outcome without relying only on final-answer checks.\n\n## Prerequisites\n\nPython or TypeScript runtime, agent run outputs or trajectories, optional LLM judge provider\n\n## Installation\n\nUse the upstream install or setup path that matches your environment:\n- pip install agentevals\n- npm install agentevals @langchain/core\n- pip install openai\n- npm install openai\n\nRequirements and caveats from upstream:\n- <summary>Python</summary>\n- python\n- [Python Async Support](#python-async-support)\n\nBasic usage or getting-started notes:\n- To get started, install agentevals:\n- <details open>\n- bash\n\n- Source: https://github.com/langchain-ai/agentevals\n- Extracted from upstream docs: https://raw.githubusercontent.com/langchain-ai/agentevals/HEAD/README.md\n\n## Documentation\n\n- https://github.com/langchain-ai/agentevals\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals/)","tags":["grade","agent","trajectories","and","tool","use","decisions","with","agentevals","skills","agentskillexchange","agent-skills"],"capabilities":["skill","source-agentskillexchange","skill-grade-agent-trajectories-and-tool-use-decisions-with-agentevals","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,116 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:10:44.874Z","embedding":null,"createdAt":"2026-05-18T13:16:55.029Z","updatedAt":"2026-05-18T19:10:44.874Z","lastSeenAt":"2026-05-18T19:10:44.874Z","tsv":"'/langchain-ai/agentevals':138,149 '/langchain-ai/agentevals/head/readme.md':145 '/skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals/)':156 'agent':2,14,36,48,74,151 'agentev':10,44,97,100,133 'agentskillexchange.com':155 'agentskillexchange.com/skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals/)':154 'answer':33,67 'async':116,120 'bash':134 'basic':122 'call':20,54 'caveat':110 'check':34,68 'correct':22,56 'decis':8,42 'doc':142 'document':146 'environ':94 'exchang':153 'extract':139 'final':32,66 'final-answ':31,65 'get':126,130 'getting-start':125 'github.com':137,148 'github.com/langchain-ai/agentevals':136,147 'grade':1,35 'instal':83,87,96,99,103,106,132 'intermedi':18,52 'judg':81 'langchain/core':101 'llm':80 'match':92 'note':128 'npm':98,105 'openai':104,107 'option':79 'outcom':26,60 'output':76 'path':19,53,90 'pip':95,102 'prerequisit':69 'provid':82 'python':70,113,114,115,119 'python-async-support':118 'raw.githubusercontent.com':144 'raw.githubusercontent.com/langchain-ai/agentevals/head/readme.md':143 'reach':24,58 'reli':28,62 'requir':108 'run':75 'runtim':73 'score':11,45 'sensibl':17,51 'setup':89 'skill':152 'skill-grade-agent-trajectories-and-tool-use-decisions-with-agentevals' 'sourc':135,150 'source-agentskillexchange' 'start':127,131 'support':117,121 'took':15,49 'tool':6,21,40,55 'tool-us':5,39 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'trajectori':3,37,78 'typescript':72 'upstream':86,112,141 'usag':123 'use':7,41,84 'whether':12,46 'without':27,61","prices":[{"id":"a64e66b1-9b72-459d-8b8d-58cd775d700c","listingId":"f5403c93-e7ea-4cc3-be6a-cafa00cb54d7","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:16:55.029Z"}],"sources":[{"listingId":"f5403c93-e7ea-4cc3-be6a-cafa00cb54d7","source":"github","sourceId":"agentskillexchange/skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals","isPrimary":false,"firstSeenAt":"2026-05-18T13:16:55.029Z","lastSeenAt":"2026-05-18T19:10:44.874Z"}],"details":{"listingId":"f5403c93-e7ea-4cc3-be6a-cafa00cb54d7","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"grade-agent-trajectories-and-tool-use-decisions-with-agentevals","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"4b6f03cdac077194b0771391f0a310a74257df9e","skill_md_path":"skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Grade agent trajectories and tool-use decisions with AgentEvals","description":"Score whether an agent took a sensible intermediate path, called tools correctly, and reached the outcome without relying only on final-answer checks."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/grade-agent-trajectories-and-tool-use-decisions-with-agentevals"},"updatedAt":"2026-05-18T19:10:44.874Z"}}