{"id":"7a72e1fb-b0ce-428d-a316-bc74ab396194","shortId":"USzVXB","kind":"skill","title":"Benchmark OpenClaw coding agents against repeatable real tasks before rollout with PinchBench","tagline":"Run a real-task benchmark suite against OpenClaw agents so model or harness changes can be compared before they hit production workflows.","description":"# Benchmark OpenClaw coding agents against repeatable real tasks before rollout with PinchBench\n\nRun a real-task benchmark suite against OpenClaw agents so model or harness changes can be compared before they hit production workflows.\n\n## Prerequisites\n\nRunning OpenClaw instance, Python 3.10+, uv, PinchBench repository checkout, model provider credentials as documented upstream\n\n## Installation\n\nUse the upstream install or setup path that matches your environment:\n- git clone https://github.com/pinchbench/skill.git\n\nRequirements and caveats from upstream:\n- **Note:** Model IDs must include their provider prefix (e.g. openrouter/, anthropic/). [OpenRouter](https://openrouter.ai) is the default provider used for routing.\n- Python 3.10+\n\nBasic usage or getting-started notes:\n- **Tool usage** — Can the model call the right tools with the right parameters?\n- bash\n- # Clone the skill\n\n- Source: https://github.com/pinchbench/skill\n- Extracted from upstream docs: https://raw.githubusercontent.com/pinchbench/skill/HEAD/README.md\n\n## Documentation\n\n- https://pinchbench.com\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench/)","tags":["benchmark","openclaw","coding","agents","against","repeatable","real","tasks","before","rollout","with","pinchbench"],"capabilities":["skill","source-agentskillexchange","skill-benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,250 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:09:37.074Z","embedding":null,"createdAt":"2026-05-18T13:15:24.435Z","updatedAt":"2026-05-18T19:09:37.074Z","lastSeenAt":"2026-05-18T19:09:37.074Z","tsv":"'/pinchbench/skill':158 '/pinchbench/skill.git':103 '/pinchbench/skill/head/readme.md':165 '/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench/)':174 '3.10':76,130 'agent':4,22,39,57,169 'agentskillexchange.com':173 'agentskillexchange.com/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench/)':172 'anthrop':119 'bash':151 'basic':131 'benchmark':1,18,36,53 'call':143 'caveat':106 'chang':27,62 'checkout':80 'clone':100,152 'code':3,38 'compar':30,65 'credenti':83 'default':124 'doc':162 'document':85,166 'e.g':117 'environ':98 'exchang':171 'extract':159 'get':135 'getting-start':134 'git':99 'github.com':102,157 'github.com/pinchbench/skill':156 'github.com/pinchbench/skill.git':101 'har':26,61 'hit':33,68 'id':111 'includ':113 'instal':87,91 'instanc':74 'match':96 'model':24,59,81,110,142 'must':112 'note':109,137 'openclaw':2,21,37,56,73 'openrout':118,120 'openrouter.ai':121 'paramet':150 'path':94 'pinchbench':12,47,78 'pinchbench.com':167 'prefix':116 'prerequisit':71 'product':34,69 'provid':82,115,125 'python':75,129 'raw.githubusercontent.com':164 'raw.githubusercontent.com/pinchbench/skill/head/readme.md':163 'real':7,16,42,51 'real-task':15,50 'repeat':6,41 'repositori':79 'requir':104 'right':145,149 'rollout':10,45 'rout':128 'run':13,48,72 'setup':93 'skill':154,170 'skill-benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench' 'sourc':155,168 'source-agentskillexchange' 'start':136 'suit':19,54 'task':8,17,43,52 'tool':138,146 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'upstream':86,90,108,161 'usag':132,139 'use':88,126 'uv':77 'workflow':35,70","prices":[{"id":"705df22a-441f-43d3-bce3-4d7af6cb2900","listingId":"7a72e1fb-b0ce-428d-a316-bc74ab396194","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:15:24.435Z"}],"sources":[{"listingId":"7a72e1fb-b0ce-428d-a316-bc74ab396194","source":"github","sourceId":"agentskillexchange/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench","isPrimary":false,"firstSeenAt":"2026-05-18T13:15:24.435Z","lastSeenAt":"2026-05-18T19:09:37.074Z"}],"details":{"listingId":"7a72e1fb-b0ce-428d-a316-bc74ab396194","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"825fd6b02a284b530350006a874c9f51d3cc7229","skill_md_path":"skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Benchmark OpenClaw coding agents against repeatable real tasks before rollout with PinchBench","description":"Run a real-task benchmark suite against OpenClaw agents so model or harness changes can be compared before they hit production workflows."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench"},"updatedAt":"2026-05-18T19:09:37.074Z"}}