{"id":"71bb8979-9787-43fa-aa3a-fbb6ff6b6756","shortId":"DYg4Yc","kind":"skill","title":"Benchmark browser agents on a fixed stealth and task suite with browser-use benchmark","tagline":"Compare browser-agent reliability on a repeatable task and anti-bot suite before choosing a stack or claiming progress.","description":"# Benchmark browser agents on a fixed stealth and task suite with browser-use benchmark\n\nCompare browser-agent reliability on a repeatable task and anti-bot suite before choosing a stack or claiming progress.\n\n## Prerequisites\n\nPython, uv, benchmark repository dependencies, required API keys for the judge model and selected browser provider, target browser agent configuration\n\n## Installation\n\nUse the upstream install or setup path that matches your environment:\n- pip install uv\n- uv sync\n- uv run python run_eval.py --browser <provider>\n\nRequirements and caveats from upstream:\n- python -c \"\n\nBasic usage or getting-started notes:\n- **2. Set up your .env** (see [.env.example](.env.example))\n- cp .env.example .env\n- **4. Run the evaluation**\n\n- Source: https://github.com/browser-use/benchmark\n- Extracted from upstream docs: https://raw.githubusercontent.com/browser-use/benchmark/HEAD/README.md\n\n## Documentation\n\n- https://github.com/browser-use/benchmark#readme\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark/)","tags":["benchmark","browser","agents","fixed","stealth","and","task","suite","with","use","skills","agentskillexchange"],"capabilities":["skill","source-agentskillexchange","skill-benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,135 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:09:36.633Z","embedding":null,"createdAt":"2026-05-18T13:15:23.680Z","updatedAt":"2026-05-18T19:09:36.633Z","lastSeenAt":"2026-05-18T19:09:36.633Z","tsv":"'/browser-use/benchmark':148 '/browser-use/benchmark#readme':159 '/browser-use/benchmark/head/readme.md':155 '/skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark/)':166 '2':130 '4':141 'agent':3,19,39,55,92,161 'agentskillexchange.com':165 'agentskillexchange.com/skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark/)':164 'anti':27,63 'anti-bot':26,62 'api':80 'basic':123 'benchmark':1,15,37,51,76 'bot':28,64 'browser':2,13,18,38,49,54,88,91,115 'browser-ag':17,53 'browser-us':12,48 'c':122 'caveat':118 'choos':31,67 'claim':35,71 'compar':16,52 'configur':93 'cp':138 'depend':78 'doc':152 'document':156 'env':134,140 'env.example':136,137,139 'environ':105 'evalu':144 'exchang':163 'extract':149 'fix':6,42 'get':127 'getting-start':126 'github.com':147,158 'github.com/browser-use/benchmark':146 'github.com/browser-use/benchmark#readme':157 'instal':94,98,107 'judg':84 'key':81 'match':103 'model':85 'note':129 'path':101 'pip':106 'prerequisit':73 'progress':36,72 'provid':89 'python':74,113,121 'raw.githubusercontent.com':154 'raw.githubusercontent.com/browser-use/benchmark/head/readme.md':153 'reliabl':20,56 'repeat':23,59 'repositori':77 'requir':79,116 'run':112,142 'run_eval.py':114 'see':135 'select':87 'set':131 'setup':100 'skill':162 'skill-benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark' 'sourc':145,160 'source-agentskillexchange' 'stack':33,69 'start':128 'stealth':7,43 'suit':10,29,46,65 'sync':110 'target':90 'task':9,24,45,60 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'upstream':97,120,151 'usag':124 'use':14,50,95 'uv':75,108,109,111","prices":[{"id":"3eca582c-4b24-4ff2-ad67-47ac6fc23cab","listingId":"71bb8979-9787-43fa-aa3a-fbb6ff6b6756","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:15:23.680Z"}],"sources":[{"listingId":"71bb8979-9787-43fa-aa3a-fbb6ff6b6756","source":"github","sourceId":"agentskillexchange/skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark","isPrimary":false,"firstSeenAt":"2026-05-18T13:15:23.680Z","lastSeenAt":"2026-05-18T19:09:36.633Z"}],"details":{"listingId":"71bb8979-9787-43fa-aa3a-fbb6ff6b6756","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"b94eb8edb07e1a107a4e925f780249f6675b0d96","skill_md_path":"skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Benchmark browser agents on a fixed stealth and task suite with browser-use benchmark","description":"Compare browser-agent reliability on a repeatable task and anti-bot suite before choosing a stack or claiming progress."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/benchmark-browser-agents-on-a-fixed-stealth-and-task-suite-with-browser-use-benchmark"},"updatedAt":"2026-05-18T19:09:36.633Z"}}