Skill · quality 0.45

Benchmark OpenClaw coding agents against repeatable real tasks before rollout with PinchBench

Run a real-task benchmark suite against OpenClaw agents so model or harness changes can be compared before they hit production workflows.

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/agentskillexchange/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench

What it does

Run a real-task benchmark suite against OpenClaw agents so model or harness changes can be compared before they hit production workflows.

Prerequisites

A running OpenClaw instance, Python 3.10+, uv, a checkout of the PinchBench repository, and model provider credentials as documented upstream.
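
The locally checkable parts of that list can be verified before cloning anything. The following is a minimal sketch, not part of PinchBench: the function name `check_prerequisites` is hypothetical, and including `git` in the tool list is an assumption based on the clone step. It does not check the OpenClaw instance or provider credentials, which depend on your setup.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 10), tools=("uv", "git")):
    """Return a list of problems found; an empty list means every check passed.

    Hypothetical helper, not provided by PinchBench or OpenClaw.
    """
    problems = []

    # Compare only (major, minor) against the required minimum version.
    if sys.version_info[:2] < tuple(min_python):
        required = ".".join(str(part) for part in min_python)
        found = ".".join(str(part) for part in sys.version_info[:2])
        problems.append(f"Python {required}+ required, found {found}")

    # shutil.which returns None when an executable is not on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")

    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(f"missing prerequisite: {issue}")
    if not issues:
        print("local prerequisites look OK")
```

Returning a list rather than exiting on the first failure lets you see every missing prerequisite in one run.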

Installation

Use the upstream install or setup path that matches your environment:

git clone https://github.com/pinchbench/skill.git

Requirements and caveats from upstream:

Note: Model IDs must include their provider prefix (e.g. openrouter/, anthropic/). OpenRouter is the default provider used for routing.
Python 3.10+
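
The provider-prefix rule in the note above can be checked before a benchmark run starts. This is a minimal sketch under stated assumptions: `split_model_id` is a hypothetical helper, not a PinchBench function, and the model names are placeholders. It only checks that a non-empty prefix precedes the first `/`; it does not know which providers upstream actually accepts.

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split 'provider/model' into (provider, model).

    Hypothetical helper: raises ValueError when the provider prefix is missing.
    """
    # partition splits on the first "/" only, so the model part may contain
    # further slashes (e.g. a vendor/model path routed through a provider).
    provider, separator, model = model_id.partition("/")
    if not separator or not provider or not model:
        raise ValueError(
            f"model ID {model_id!r} must include a provider prefix, "
            "for example 'openrouter/...' or 'anthropic/...'"
        )
    return provider, model


if __name__ == "__main__":
    # Placeholder IDs for illustration; substitute real model IDs from your provider.
    for candidate in ("anthropic/example-model", "example-model"):
        try:
            print(candidate, "->", split_model_id(candidate))
        except ValueError as error:
            print("rejected:", error)
```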

Basic usage or getting-started notes:

Tool usage: can the model call the right tools with the right parameters?
Clone the skill with the git clone command given above.
Source: https://github.com/pinchbench/skill
Extracted from upstream docs: https://raw.githubusercontent.com/pinchbench/skill/HEAD/README.md

Documentation

https://pinchbench.com

Source

Agent Skill Exchange

Capabilities

skill
source-agentskillexchange
skill-benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench
topic-agent-skills
topic-ai-agents
topic-ai-tools
topic-awesome-list
topic-claude-code
topic-codex
topic-cursor
topic-llm
topic-mcp
topic-npx-skills
topic-openclaw
topic-skills-catalog

Install

Install: npx skills add agentskillexchange/skills

Source: https://github.com/agentskillexchange/skills/tree/main/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench

skills.sh: https://skills.sh/agentskillexchange/skills/benchmark-openclaw-coding-agents-against-repeatable-real-tasks-before-rollout-with-pinchbench

Transport: skills-sh

Protocol: skill

Quality

0.45 / 1.00

Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,250 chars)

Provenance

Indexed from: GitHub

Enriched: 2026-05-18 19:09:37Z · deterministic:skill-github:v1 · v1

First seen: 2026-05-18

Last seen: 2026-05-18

Agent access

JSON: https://clawmart.sh/api/listings/USzVXB
