{"id":"ea250a41-576f-403d-b075-120d8a3242b2","shortId":"hnmhKK","kind":"skill","title":"Score model outputs with reusable evaluator prompts and metrics using autoevals","tagline":"Apply reusable evaluators to model outputs when you need lightweight scoring, rationale capture, or quick eval loops in code.","description":"# Score model outputs with reusable evaluator prompts and metrics using autoevals\n\nApply reusable evaluators to model outputs when you need lightweight scoring, rationale capture, or quick eval loops in code.\n\n## Prerequisites\n\nPython or Node.js, access to an OpenAI-compatible model endpoint or Braintrust proxy\n\n## Installation\n\nUse the upstream install or setup path that matches your environment:\n- npm install autoevals\n- pip install autoevals\n- npx braintrust run example.eval.js\n- To install the development dependencies, run make develop, and run source env.sh to activate the environment. Make a .env file from the .env.example file and set the environment variables. 
Run direnv allow to load the...\n\nRequirements and caveats from upstream:\n- Python 3.9 or higher\n- Compatible with both OpenAI Python SDK v0.x and v1.x\n- ### Python\n\nBasic usage or getting-started notes:\n- project but are implemented so you can flexibly run them on individual examples, tweak the prompts, and debug\n\n- Source: https://github.com/braintrustdata/autoevals\n- Extracted from upstream docs: https://raw.githubusercontent.com/braintrustdata/autoevals/HEAD/README.md\n\n## Documentation\n\n- https://github.com/braintrustdata/autoevals\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals/)","tags":["score","model","outputs","with","reusable","evaluator","prompts","and","metrics","using","autoevals","skills"],"capabilities":["skill","source-agentskillexchange","skill-score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,408 
chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:12:19.594Z","embedding":null,"createdAt":"2026-05-18T13:19:11.032Z","updatedAt":"2026-05-18T19:12:19.594Z","lastSeenAt":"2026-05-18T19:12:19.594Z","tsv":"'/braintrustdata/autoevals':180,191 '/braintrustdata/autoevals/head/readme.md':187 '/skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals/)':198 '3.9':139 'access':65 'activ':111 'agent':193 'agentskillexchange.com':197 'agentskillexchange.com/skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals/)':196 'allow':129 'appli':12,42 'autoev':11,41,90,93 'basic':152 'braintrust':74,95 'captur':24,54 'caveat':135 'code':30,60 'compat':70,142 'debug':176 'depend':102 'develop':101,105 'direnv':128 'doc':184 'document':188 'endpoint':72 'env':116 'env.example':120 'env.sh':109 'environ':87,113,125 'eval':27,57 'evalu':6,14,36,44 'exampl':171 'example.eval.js':97 'exchang':195 'extract':181 'file':117,121 'flexibl':166 'get':156 'getting-start':155 'github.com':179,190 'github.com/braintrustdata/autoevals':178,189 'higher':141 'implement':162 'individu':170 'instal':76,80,89,92,99 'lightweight':21,51 'load':131 'loop':28,58 'make':104,114 'match':85 'metric':9,39 'model':2,16,32,46,71 'need':20,50 'node.js':64 'note':158 'npm':88 'npx':94 'openai':69,145 'openai-compat':68 'output':3,17,33,47 'path':83 'pip':91 'prerequisit':61 'project':159 'prompt':7,37,174 'proxi':75 'python':62,138,146,151 'quick':26,56 'rational':23,53 'raw.githubusercontent.com':186 'raw.githubusercontent.com/braintrustdata/autoevals/head/readme.md':185 'requir':133 'reusabl':5,13,35,43 'run':96,103,107,127,167 'score':1,22,31,52 'sdk':147 'set':123 'setup':82 
'skill':194 'skill-score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals' 'sourc':108,177,192 'source-agentskillexchange' 'start':157 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'tweak':172 'upstream':79,137,183 'usag':153 'use':10,40,77 'v0.x':148 'v1.x':150 'variabl':126","prices":[{"id":"c0624fad-69ae-4461-ad48-9fdbeaee0552","listingId":"ea250a41-576f-403d-b075-120d8a3242b2","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:19:11.032Z"}],"sources":[{"listingId":"ea250a41-576f-403d-b075-120d8a3242b2","source":"github","sourceId":"agentskillexchange/skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals","isPrimary":false,"firstSeenAt":"2026-05-18T13:19:11.032Z","lastSeenAt":"2026-05-18T19:12:19.594Z"}],"details":{"listingId":"ea250a41-576f-403d-b075-120d8a3242b2","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The 
open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"6fca721741816f2ff7ec1886fe921934498f9115","skill_md_path":"skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Score model outputs with reusable evaluator prompts and metrics using autoevals","description":"Apply reusable evaluators to model outputs when you need lightweight scoring, rationale capture, or quick eval loops in code."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/score-model-outputs-with-reusable-evaluator-prompts-and-metrics-using-autoevals"},"updatedAt":"2026-05-18T19:12:19.594Z"}}