{"id":"511a2044-9e72-44a8-bdf0-4d129a292dcf","shortId":"CzbXML","kind":"skill","title":"Benchmark browser agents on repeatable Playwright web tasks with Bananalyzer","tagline":"Run a repeatable evaluation suite for browser agents against static web task snapshots instead of judging them from demos or one-off tests.","description":"# Benchmark browser agents on repeatable Playwright web tasks with Bananalyzer\n\nRun a repeatable evaluation suite for browser agents against static web task snapshots instead of judging them from demos or one-off tests.\n\n## Prerequisites\n\nPython environment, Playwright browser runtime, pytest-based test execution, a custom AgentRunner implementation, example web task snapshots\n\n## Installation\n\nRequirements and caveats from upstream:\n- individual website. For an agent to best generalize, we require building a diverse dataset of websites across\n- In the future we will support more complex evaluation methods and examples that require multiple steps to complete. 
The\n\nBasic usage or getting-started notes:\n- Banana-lyzer is a CLI tool that runs a set of evaluations against a set of example websites.\n- The CLI tool will sequentially run examples against a user defined agent by dynamically constructing a pytest test suite\n- AgentRunner exposes the example, and a playwright browser context to use.\n\n- Source: https://github.com/reworkd/bananalyzer\n- Extracted from upstream docs: https://raw.githubusercontent.com/reworkd/bananalyzer/HEAD/README.md\n\n## Documentation\n\n- https://github.com/reworkd/bananalyzer\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer/)","tags":["benchmark","browser","agents","repeatable","playwright","web","tasks","with","bananalyzer","skills","agentskillexchange","agent-skills"],"capabilities":["skill","source-agentskillexchange","skill-benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,490 
chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:09:36.716Z","embedding":null,"createdAt":"2026-05-18T13:15:23.840Z","updatedAt":"2026-05-18T19:09:36.716Z","lastSeenAt":"2026-05-18T19:09:36.716Z","tsv":"'/reworkd/bananalyzer':189,200 '/reworkd/bananalyzer/head/readme.md':196 '/skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer/)':207 'across':110 'agent':3,18,37,52,98,167,202 'agentrunn':82,175 'agentskillexchange.com':206 'agentskillexchange.com/skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer/)':205 'banana':138 'banana-lyz':137 'bananalyz':10,44 'base':77 'basic':130 'benchmark':1,35 'best':100 'browser':2,17,36,51,73,182 'build':104 'caveat':91 'cli':142,157 'complet':128 'complex':118 'construct':170 'context':183 'custom':81 'dataset':107 'defin':166 'demo':29,63 'divers':106 'doc':193 'document':197 'dynam':169 'environ':71 'evalu':14,48,119,149 'exampl':84,122,154,162,178 'exchang':204 'execut':79 'expos':176 'extract':190 'futur':113 'general':101 'get':134 'getting-start':133 'github.com':188,199 'github.com/reworkd/bananalyzer':187,198 'implement':83 'individu':94 'instal':88 'instead':24,58 'judg':26,60 'lyzer':139 'method':120 'multipl':125 'note':136 'one':32,66 'one-off':31,65 'playwright':6,40,72,181 'prerequisit':69 'pytest':76,172 'pytest-bas':75 'python':70 'raw.githubusercontent.com':195 'raw.githubusercontent.com/reworkd/bananalyzer/head/readme.md':194 'repeat':5,13,39,47 'requir':89,103,124 'run':11,45,145,161 'runtim':74 'sequenti':160 'set':147,152 'skill':203 'skill-benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer' 'snapshot':23,57,87 'sourc':186,201 
'source-agentskillexchange' 'start':135 'static':20,54 'step':126 'suit':15,49,174 'support':116 'task':8,22,42,56,86 'test':34,68,78,173 'tool':143,158 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'upstream':93,192 'usag':131 'use':185 'user':165 'web':7,21,41,55,85 'websit':95,109,155","prices":[{"id":"64070534-a562-4987-99bc-39599c90bd95","listingId":"511a2044-9e72-44a8-bdf0-4d129a292dcf","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:15:23.840Z"}],"sources":[{"listingId":"511a2044-9e72-44a8-bdf0-4d129a292dcf","source":"github","sourceId":"agentskillexchange/skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer","isPrimary":false,"firstSeenAt":"2026-05-18T13:15:23.840Z","lastSeenAt":"2026-05-18T19:09:36.716Z"}],"details":{"listingId":"511a2044-9e72-44a8-bdf0-4d129a292dcf","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI 
agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"b957b1e2c15f22ef64bb8fb5a5f3eb7ec8df5cb7","skill_md_path":"skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Benchmark browser agents on repeatable Playwright web tasks with Bananalyzer","description":"Run a repeatable evaluation suite for browser agents against static web task snapshots instead of judging them from demos or one-off tests."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/benchmark-browser-agents-on-repeatable-playwright-web-tasks-with-bananalyzer"},"updatedAt":"2026-05-18T19:09:36.716Z"}}