{"id":"4423eb92-0fbe-4ba1-bb80-770327c2472b","shortId":"3fX53W","kind":"skill","title":"Drive web and app UIs with vision-grounded steps when selectors are brittle or unavailable","tagline":"Use Midscene.js when an agent needs screenshot-grounded UI actions and assertions across web, mobile, or desktop surfaces where DOM selectors are fragile, unavailable, or not the right abstraction.","description":"# Drive web and app UIs with vision-grounded steps when selectors are brittle or unavailable\n\nUse Midscene.js when an agent needs screenshot-grounded UI actions and assertions across web, mobile, or desktop surfaces where DOM selectors are fragile, unavailable, or not the right abstraction.\n\n## Prerequisites\n\nMidscene.js, Node.js, a supported vision model, and a target automation surface such as Playwright, Puppeteer, Android adb, or iOS WebDriverAgent\n\n## Installation\n\nRequirements and caveats from upstream:\n- [midscene-pc-docker](https://github.com/Mofangbao/midscene-pc-docker) - Docker image with Midscene-PC server pre-installed\n- [Midscene-Python](https://github.com/Python51888/Midscene-Python) - Python SDK for Midscene automation\n\nBasic usage or getting-started notes:\n- Sample Projects: [https://github.com/web-infra-dev/midscene-example](https://github.com/web-infra-dev/midscene-example)\n\n- Source: https://github.com/web-infra-dev/midscene\n- Extracted from upstream docs: https://raw.githubusercontent.com/web-infra-dev/midscene/HEAD/README.md\n\n## Documentation\n\n- https://midscenejs.com\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable/)","tags":["drive","web","and","app","uis","with","vision","grounded","steps","when","selectors","are"],"capabilities":["skill","source-agentskillexchange","skill-drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,274 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:10:17.230Z","embedding":null,"createdAt":"2026-05-18T13:16:19.019Z","updatedAt":"2026-05-18T19:10:17.230Z","lastSeenAt":"2026-05-18T19:10:17.230Z","tsv":"'/mofangbao/midscene-pc-docker)':126 '/python51888/midscene-python)':142 '/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable/)':179 '/web-infra-dev/midscene':163 '/web-infra-dev/midscene-example](https://github.com/web-infra-dev/midscene-example)':159 '/web-infra-dev/midscene/head/readme.md':170 'abstract':46,92 'across':30,76 'action':27,73 'adb':110 'agent':21,67,174 'agentskillexchange.com':178 'agentskillexchange.com/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable/)':177 'android':109 'app':4,50 'assert':29,75 'autom':103,147 'basic':148 'brittl':14,60 'caveat':117 'desktop':34,80 'doc':167 'docker':123,127 'document':171 'dom':37,83 'drive':1,47 'exchang':176 'extract':164 'fragil':40,86 'get':152 'getting-start':151 'github.com':125,141,158,162 'github.com/mofangbao/midscene-pc-docker)':124 'github.com/python51888/midscene-python)':140 'github.com/web-infra-dev/midscene':161 'github.com/web-infra-dev/midscene-example](https://github.com/web-infra-dev/midscene-example)':157 'ground':9,25,55,71 'imag':128 'instal':114,136 'io':112 'midscen':121,131,138,146 'midscene-pc':130 'midscene-pc-dock':120 'midscene-python':137 'midscene.js':18,64,94 'midscenejs.com':172 'mobil':32,78 'model':99 'need':22,68 'node.js':95 'note':154 'pc':122,132 'playwright':107 'pre':135 'pre-instal':134 'prerequisit':93 'project':156 'puppet':108 'python':139,143 'raw.githubusercontent.com':169 'raw.githubusercontent.com/web-infra-dev/midscene/head/readme.md':168 'requir':115 'right':45,91 'sampl':155 'screenshot':24,70 'screenshot-ground':23,69 'sdk':144 'selector':12,38,58,84 'server':133 'skill':175 'skill-drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable' 'sourc':160,173 'source-agentskillexchange' 'start':153 'step':10,56 'support':97 'surfac':35,81,104 'target':102 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'ui':5,26,51,72 'unavail':16,41,62,87 'upstream':119,166 'usag':149 'use':17,63 'vision':8,54,98 'vision-ground':7,53 'web':2,31,48,77 'webdriverag':113","prices":[{"id":"69edb59c-47b0-47bb-b344-ca454e50cac6","listingId":"4423eb92-0fbe-4ba1-bb80-770327c2472b","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:16:19.019Z"}],"sources":[{"listingId":"4423eb92-0fbe-4ba1-bb80-770327c2472b","source":"github","sourceId":"agentskillexchange/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable","isPrimary":false,"firstSeenAt":"2026-05-18T13:16:19.019Z","lastSeenAt":"2026-05-18T19:10:17.230Z"}],"details":{"listingId":"4423eb92-0fbe-4ba1-bb80-770327c2472b","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"9e18d597b587ac10fcdc694e2d960b7da2522fcc","skill_md_path":"skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Drive web and app UIs with vision-grounded steps when selectors are brittle or unavailable","description":"Use Midscene.js when an agent needs screenshot-grounded UI actions and assertions across web, mobile, or desktop surfaces where DOM selectors are fragile, unavailable, or not the right abstraction."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/drive-web-and-app-uis-with-vision-grounded-steps-when-selectors-are-brittle-or-unavailable"},"updatedAt":"2026-05-18T19:10:17.230Z"}}