{"id":"a850b7c2-8efb-4b05-82cd-a3bf7991f0b1","shortId":"4Lh722","kind":"skill","title":"browser-automation","tagline":"Browser automation for AI agents. Two providers — agent-browser (local CLI with Playwright) and agentic-browser (cloud via inference.sh). Both use the same @e ref-based workflow for navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, and automa","description":"# Browser Automation\n\nBrowser automation for AI agents with two provider options. Both share the same core workflow: navigate, snapshot, interact using `@e` refs, re-snapshot after changes.\n\n| Provider | Runtime | Best For |\n|----------|---------|----------|\n| agent-browser | Local (Playwright CLI) | Local testing, iOS Simulator, file:// URLs |\n| agentic-browser | Cloud (inference.sh) | Video recording, cloud execution, parallel sessions |\n\n---\n\n## Core Workflow (Both Providers)\n\nEvery browser automation follows this pattern:\n\n1. **Navigate** — Open a URL\n2. **Snapshot** — Get `@e` refs for interactive elements\n3. **Interact** — Use refs to click, fill, select\n4. **Re-snapshot** — After navigation or DOM changes, get fresh refs\n\n**Important: Refs are invalidated after navigation.** Always re-snapshot after clicking links/buttons, form submissions, or dynamic content loading.\n\n---\n\n## Provider 1: agent-browser (Local CLI)\n\n### Quick Start\n\n```bash\nagent-browser open https://example.com/form\nagent-browser snapshot -i\n# Output: @e1 [input type=\"email\"], @e2 [input type=\"password\"], @e3 [button] \"Submit\"\n\nagent-browser fill @e1 \"user@example.com\"\nagent-browser fill @e2 \"password123\"\nagent-browser click @e3\nagent-browser wait --load networkidle\nagent-browser snapshot -i  # Check result\n```\n\n### Essential Commands\n\n```bash\n# Navigation\nagent-browser open <url>              # Navigate\nagent-browser close                   # Close browser\n\n# Snapshot\nagent-browser snapshot -i             # Interactive elements with refs\nagent-browser snapshot -i -C          # Include cursor-interactive elements\nagent-browser snapshot -s \"#selector\" # Scope to CSS selector\n\n# Interaction (use @refs from snapshot)\nagent-browser click @e1               # Click element\nagent-browser fill @e2 \"text\"         # Clear and type text\nagent-browser type @e2 \"text\"         # Type without clearing\nagent-browser select @e1 \"option\"     # Select dropdown option\nagent-browser check @e1               # Check checkbox\nagent-browser press Enter             # Press key\nagent-browser scroll down 500         # Scroll page\n\n# Get information\nagent-browser get text @e1            # Get element text\nagent-browser get url                 # Get current URL\nagent-browser get title               # Get page title\n\n# Wait\nagent-browser wait @e1                # Wait for element\nagent-browser wait --load networkidle # Wait for network idle\nagent-browser wait --url \"**/page\"    # Wait for URL pattern\nagent-browser wait 2000               # Wait milliseconds\n\n# Capture\nagent-browser screenshot              # Screenshot to temp dir\nagent-browser screenshot --full       # Full page screenshot\nagent-browser pdf output.pdf          # Save as PDF\n```\n\n### Authentication with State Persistence\n\n```bash\n# Login once and save state\nagent-browser open https://app.example.com/login\nagent-browser snapshot -i\nagent-browser fill @e1 \"$USERNAME\"\nagent-browser fill @e2 \"$PASSWORD\"\nagent-browser click @e3\nagent-browser wait --url \"**/dashboard\"\nagent-browser state save auth.json\n\n# Reuse in future sessions\nagent-browser state load auth.json\nagent-browser open https://app.example.com/dashboard\n```\n\n### Parallel Sessions\n\n```bash\nagent-browser --session site1 open https://site-a.com\nagent-browser --session site2 open https://site-b.com\nagent-browser session list\n```\n\n### Visual / Debugging\n\n```bash\nagent-browser --headed open https://example.com\nagent-browser highlight @e1\nagent-browser record start demo.webm\n```\n\n### Local Files\n\n```bash\nagent-browser --allow-file-access open file:///path/to/document.pdf\nagent-browser --allow-file-access open file:///path/to/page.html\nagent-browser screenshot output.png\n```\n\n### iOS Simulator (Mobile Safari)\n\n```bash\n# List available iOS simulators\nagent-browser device list\n\n# Launch Safari on a specific device\nagent-browser -p ios --device \"iPhone 16 Pro\" open https://example.com\n\n# Same workflow — snapshot, interact, re-snapshot\nagent-browser -p ios snapshot -i\nagent-browser -p ios tap @e1\nagent-browser -p ios fill @e2 \"text\"\nagent-browser -p ios swipe up\nagent-browser -p ios screenshot mobile.png\nagent-browser -p ios close\n```\n\n**Requirements:** macOS with Xcode, Appium (`npm install -g appium && appium driver install xcuitest`)\n\n### Semantic Locators (Alternative to Refs)\n\n```bash\nagent-browser find text \"Sign In\" click\nagent-browser find label \"Email\" fill \"user@test.com\"\nagent-browser find role button click --name \"Submit\"\nagent-browser find placeholder \"Search\" type \"query\"\nagent-browser find testid \"submit-btn\" click\n```\n\n---\n\n## Provider 2: agentic-browser (Cloud via inference.sh)\n\n### Quick Start\n\n```bash\n# Install CLI\ncurl -fsSL https://cli.inference.sh | sh && infsh login\n\n# Open a page\ninfsh app run agentic-browser --function open --input '{\"url\": \"https://example.com\"}' --session new\n```\n\n### Core Functions\n\n| Function | Description |\n|----------|-------------|\n| `open` | Navigate to URL, configure browser (viewport, proxy, video) |\n| `snapshot` | Re-fetch page state with `@e` refs after DOM changes |\n| `interact` | Perform actions using `@e` refs |\n| `screenshot` | Take page screenshot (viewport or full page) |\n| `execute` | Run JavaScript code on the page |\n| `close` | Close session, returns video if recording enabled |\n\n### Interact Actions\n\n| Action | Description | Required Fields |\n|--------|-------------|-----------------|\n| `click` | Click element | `ref` |\n| `dblclick` | Double-click | `ref` |\n| `fill` | Clear and type text | `ref`, `text` |\n| `type` | Type without clearing | `text` |\n| `press` | Press key (Enter, Tab) | `text` |\n| `select` | Select dropdown option | `ref`, `text` |\n| `hover` | Hover over element | `ref` |\n| `check` / `uncheck` | Toggle checkbox | `ref` |\n| `drag` | Drag and drop | `ref`, `target_ref` |\n| `upload` | Upload file(s) | `ref`, `file_paths` |\n| `scroll` | Scroll page | `direction`, `scroll_amount` |\n| `back` | Go back in history | - |\n| `wait` | Wait milliseconds | `wait_ms` |\n| `goto` | Navigate to URL | `url` |\n\n### Full Example\n\n```bash\n# Start session\nRESULT=$(infsh app run agentic-browser --function open --session new --input '{\n  \"url\": \"https://example.com/login\"\n}')\nSESSION_ID=$(echo $RESULT | jq -r '.session_id')\n\n# Fill and submit\ninfsh app run agentic-browser --function interact --session $SESSION_ID --input '{\n  \"action\": \"fill\", \"ref\": \"@e1\", \"text\": \"user@example.com\"\n}'\ninfsh app run agentic-browser --function interact --session $SESSION_ID --input '{\n  \"action\": \"fill\", \"ref\": \"@e2\", \"text\": \"password123\"\n}'\ninfsh app run agentic-browser --function interact --session $SESSION_ID --input '{\n  \"action\": \"click\", \"ref\": \"@e3\"\n}'\n\n# Re-snapshot after navigation\ninfsh app run agentic-browser --function snapshot --session $SESSION_ID --input '{}'\n\n# Close when done\ninfsh app run agentic-browser --function close --session $SESSION_ID --input '{}'\n```\n\n### Video Recording\n\n```bash\n# Start with recording enabled\nSESSION=$(infsh app run agentic-browser --function open --session new --input '{\n  \"url\": \"https://example.com\",\n  \"record_video\": true,\n  \"show_cursor\": true\n}' | jq -r '.session_id')\n\n# ... perform actions ...\n\n# Close to get the video file\ninfsh app run agentic-browser --function close --session $SESSION --input '{}'\n# Returns: {\"success\": true, \"video\": <File>}\n```\n\n### Proxy Support\n\n```bash\ninfsh app run agentic-browser --function open --session new --input '{\n  \"url\": \"https://example.com\",\n  \"proxy_url\": \"http://proxy.example.com:8080\",\n  \"proxy_username\": \"user\",\n  \"proxy_password\": \"pass\"\n}'\n```\n\n### File Upload\n\n```bash\ninfsh app run agentic-browser --function interact --session $SESSION --input '{\n  \"action\": \"upload\",\n  \"ref\": \"@e5\",\n  \"file_paths\": [\"/path/to/file.pdf\"]\n}'\n```\n\n### JavaScript Execution\n\n```bash\ninfsh app run agentic-browser --function execute --session $SESSION --input '{\n  \"code\": \"document.querySelectorAll(\\\"h2\\\").length\"\n}'\n# Returns: {\"result\": \"5\", \"screenshot\": <File>}\n```\n\n---\n\n## Common Patterns (Both Providers)\n\n### Form Submission\n1. Open the form URL\n2. Snapshot to get element refs\n3. Fill each field using refs\n4. Click submit button\n5. Wait for navigation/network idle\n6. Re-snapshot to verify result\n\n### Data Extraction\n1. Navigate to target page\n2. Snapshot interactive elements\n3. Get text from specific elements\n4. Optionally use JSON output for parsing\n\n### Authentication Flow\n1. Navigate to login page\n2. Fill credentials\n3. Handle 2FA if prompted\n4. Save session state for reuse\n5. Load saved state in future sessions\n\n---\n\n## Deep-Dive Documentation\n\n| Reference | Description |\n|-----------|-------------|\n| `references/commands.md` | Full command reference with all options |\n| `references/snapshot-refs.md` | Ref lifecycle, invalidation rules, troubleshooting |\n| `references/session-management.md` | Parallel sessions, state persistence |\n| `references/authentication.md` | Login flows, OAuth, 2FA handling |\n| `references/video-recording.md` | Recording workflows for debugging |\n| `references/proxy-support.md` | Proxy configuration, geo-testing |\n\n## Ready-to-Use Templates\n\n| Template | Description |\n|----------|-------------|\n| `templates/form-automation.sh` | Form filling with validation |\n| `templates/authenticated-session.sh` | Login once, reuse state |\n| `templates/capture-workflow.sh` | Content extraction with screenshots |","tags":["browser","automation","coco","rkz91","agent-skills","agents-md","ai-agents","claude-code","codex","cursor","developer-tools","llm-tools"],"capabilities":["skill","source-rkz91","skill-browser-automation","topic-agent-skills","topic-agents-md","topic-ai-agents","topic-claude-code","topic-codex","topic-cursor","topic-developer-tools","topic-llm-tools","topic-mcp","topic-pm-tools","topic-product-management","topic-productivity"],"categories":["coco"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/rkz91/coco/browser-automation","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add rkz91/coco","source_repo":"https://github.com/rkz91/coco","install_from":"skills.sh"}},"qualityScore":"0.453","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 7 github stars · SKILL.md body (9,373 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:14:05.983Z","embedding":null,"createdAt":"2026-05-18T13:21:37.444Z","updatedAt":"2026-05-18T19:14:05.983Z","lastSeenAt":"2026-05-18T19:14:05.983Z","tsv":"'/dashboard':467,490 '/form':179 '/login':439,893 '/page':386 '/path/to/document.pdf':544 '/path/to/file.pdf':1088 '/path/to/page.html':553 '1':111,164,1117,1152,1176 '16':586 '2':116,701,1122,1157,1181 '2000':395 '2fa':1186,1230 '3':124,1128,1161,1184 '4':132,1134,1167,1189 '5':1109,1138,1195 '500':332 '6':1143 'access':542,551 'action':762,790,791,917,935,953,1021,1082 'agent':8,12,20,53,80,91,166,174,181,198,204,210,215,221,232,237,244,253,264,279,286,296,305,314,321,328,338,347,355,364,372,382,392,400,408,416,434,441,446,452,458,463,469,479,485,495,502,509,517,523,528,537,546,555,569,580,598,605,612,620,627,634,659,667,675,684,692,703,726,883,909,927,945,966,981,1001,1032,1050,1075,1096 'agent-brows':11,79,165,173,180,197,203,209,214,220,231,236,243,252,263,278,285,295,304,313,320,327,337,346,354,363,371,381,391,399,407,415,433,440,445,451,457,462,468,478,484,494,501,508,516,522,527,536,545,554,568,579,597,604,611,619,626,633,658,666,674,683,691 'agentic-brows':19,90,702,725,882,908,926,944,965,980,1000,1031,1049,1074,1095 'ai':7,52 'allow':540,549 'allow-file-access':539,548 'altern':654 'alway':150 'amount':857 'app':723,880,906,924,942,963,978,998,1029,1047,1072,1093 'app.example.com':438,489 'app.example.com/dashboard':488 'app.example.com/login':437 'appium':643,647,648 'auth.json':473,483 'authent':423,1174 'autom':3,5,48,50,107 'automa':46 'avail':565 'back':858,860 'base':32 'bash':172,229,427,493,515,535,563,657,710,875,991,1045,1070,1091 'best':77 'browser':2,4,13,21,47,49,81,92,106,167,175,182,199,205,211,216,222,233,238,241,245,254,265,280,287,297,306,315,322,329,339,348,356,365,373,383,393,401,409,417,435,442,447,453,459,464,470,480,486,496,503,510,518,524,529,538,547,556,570,581,599,606,613,621,628,635,660,668,676,685,693,704,727,744,884,910,928,946,967,982,1002,1033,1051,1076,1097 'browser-autom':1 'btn':698 'button':40,195,679,1137 'c':257 'captur':398 'chang':74,140,759 'check':225,316,318,833 'checkbox':319,836 'clear':291,303,805,814 'cli':15,84,169,712 'cli.inference.sh':715 'click':39,129,155,212,281,283,460,665,680,699,795,796,802,954,1135 'close':239,240,638,781,782,974,984,1022,1035 'cloud':22,93,97,705 'code':777,1103 'command':228,1210 'common':1111 'configur':743,1239 'content':161,1261 'core':62,101,735 'credenti':1183 'css':271 'curl':713 'current':352 'cursor':260,1014 'cursor-interact':259 'data':44,1150 'dblclick':799 'debug':514,1236 'deep':1203 'deep-div':1202 'demo.webm':532 'descript':738,792,1207,1249 'devic':571,578,584 'dir':406 'direct':855 'dive':1204 'document':1205 'document.queryselectorall':1104 'dom':139,758 'done':976 'doubl':801 'double-click':800 'drag':838,839 'driver':649 'drop':841 'dropdown':311,824 'dynam':160 'e':29,68,119,755,764 'e1':186,201,282,308,317,342,367,449,526,610,920 'e2':190,207,289,299,455,617,938 'e3':194,213,461,956 'e5':1085 'echo':896 'element':123,249,262,284,344,370,797,831,1126,1160,1166 'email':189,671 'enabl':788,995 'enter':324,819 'essenti':227 'everi':105 'exampl':874 'example.com':178,521,589,732,892,1009,1058 'example.com/form':177 'example.com/login':891 'execut':98,774,1090,1099 'extract':43,1151,1262 'fetch':751 'field':794,1131 'file':534,541,550,847,850,1027,1068,1086 'fill':37,130,200,206,288,448,454,616,672,804,902,918,936,1129,1182,1252 'find':661,669,677,686,694 'flow':1175,1228 'follow':108 'form':38,157,1115,1120,1251 'fresh':142 'fssl':714 'full':411,412,772,873,1209 'function':728,736,737,885,911,929,947,968,983,1003,1034,1052,1077,1098 'futur':476,1200 'g':646 'geo':1241 'geo-test':1240 'get':118,141,335,340,343,349,351,357,359,1024,1125,1162 'go':859 'goto':868 'h2':1105 'handl':1185,1231 'head':519 'highlight':525 'histori':862 'hover':828,829 'id':895,901,915,933,951,972,987,1019 'idl':380,1142 'import':144 'includ':258 'inference.sh':24,94,707 'inform':336 'infsh':717,722,879,905,923,941,962,977,997,1028,1046,1071,1092 'input':187,191,730,889,916,934,952,973,988,1007,1038,1056,1081,1102 'instal':645,650,711 'interact':66,122,125,248,261,273,593,760,789,912,930,948,1078,1159 'invalid':147,1218 'io':87,559,566,583,601,608,615,623,630,637 'iphon':585 'javascript':776,1089 'jq':898,1016 'json':1170 'key':326,818 'label':670 'launch':573 'length':1106 'lifecycl':1217 'links/buttons':156 'list':512,564,572 'load':162,218,375,482,1196 'local':14,82,85,168,533 'locat':653 'login':428,718,1179,1227,1256 'maco':640 'millisecond':397,865 'mobil':561 'mobile.png':632 'ms':867 'name':681 'navig':35,64,112,137,149,230,235,740,869,961,1153,1177 'navigation/network':1141 'network':379 'networkidl':219,376 'new':734,888,1006,1055 'npm':644 'oauth':1229 'open':113,176,234,436,487,499,506,520,543,552,588,719,729,739,886,1004,1053,1118 'option':57,309,312,825,1168,1214 'output':185,1171 'output.pdf':419 'output.png':558 'p':582,600,607,614,622,629,636 'page':36,334,360,413,721,752,768,773,780,854,1156,1180 'parallel':99,491,1222 'pars':1173 'pass':1067 'password':193,456,1066 'password123':208,940 'path':851,1087 'pattern':110,390,1112 'pdf':418,422 'perform':761,1020 'persist':426,1225 'placehold':687 'playwright':17,83 'press':323,325,816,817 'pro':587 'prompt':1188 'provid':10,56,75,104,163,700,1114 'proxi':746,1043,1059,1062,1065,1238 'proxy.example.com:8080':1061 'queri':690 'quick':170,708 'r':899,1017 're':71,134,152,595,750,958,1145 're-fetch':749 're-snapshot':70,133,151,594,957,1144 'readi':1244 'ready-to-us':1243 'record':96,530,787,990,994,1010,1233 'ref':31,69,120,127,143,145,251,275,656,756,765,798,803,809,826,832,837,842,844,849,919,937,955,1084,1127,1133,1216 'ref-bas':30 'refer':1206,1211 'references/authentication.md':1226 'references/commands.md':1208 'references/proxy-support.md':1237 'references/session-management.md':1221 'references/snapshot-refs.md':1215 'references/video-recording.md':1232 'requir':639,793 'result':226,878,897,1108,1149 'return':784,1039,1107 'reus':474,1194,1258 'role':678 'rule':1219 'run':724,775,881,907,925,943,964,979,999,1030,1048,1073,1094 'runtim':76 'safari':562,574 'save':420,431,472,1190,1197 'scope':269 'screenshot':42,402,403,410,414,557,631,766,769,1110,1264 'scroll':330,333,852,853,856 'search':688 'select':131,307,310,822,823 'selector':268,272 'semant':652 'session':100,477,492,497,504,511,733,783,877,887,894,900,913,914,931,932,949,950,970,971,985,986,996,1005,1018,1036,1037,1054,1079,1080,1100,1101,1191,1201,1223 'sh':716 'share':59 'show':1013 'sign':663 'simul':88,560,567 'site-a.com':500 'site-b.com':507 'site1':498 'site2':505 'skill' 'skill-browser-automation' 'snapshot':65,72,117,135,153,183,223,242,246,255,266,277,443,592,596,602,748,959,969,1123,1146,1158 'source-rkz91' 'specif':577,1165 'start':171,531,709,876,992 'state':425,432,471,481,753,1192,1198,1224,1259 'submiss':158,1116 'submit':196,682,697,904,1136 'submit-btn':696 'success':1040 'support':1044 'swipe':624 'tab':820 'take':41,767 'tap':609 'target':843,1155 'temp':405 'templat':1247,1248 'templates/authenticated-session.sh':1255 'templates/capture-workflow.sh':1260 'templates/form-automation.sh':1250 'test':86,1242 'testid':695 'text':290,294,300,341,345,618,662,808,810,815,821,827,921,939,1163 'titl':358,361 'toggl':835 'topic-agent-skills' 'topic-agents-md' 'topic-ai-agents' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-developer-tools' 'topic-llm-tools' 'topic-mcp' 'topic-pm-tools' 'topic-product-management' 'topic-productivity' 'troubleshoot':1220 'true':1012,1015,1041 'two':9,55 'type':188,192,293,298,301,689,807,811,812 'uncheck':834 'upload':845,846,1069,1083 'url':89,115,350,353,385,389,466,731,742,871,872,890,1008,1057,1060,1121 'use':26,67,126,274,763,1132,1169,1246 'user':1064 'user@example.com':202,922 'user@test.com':673 'usernam':450,1063 'valid':1254 'verifi':1148 'via':23,706 'video':95,747,785,989,1011,1026,1042 'viewport':745,770 'visual':513 'wait':217,362,366,368,374,377,384,387,394,396,465,863,864,866,1139 'without':302,813 'workflow':33,63,102,591,1234 'xcode':642 'xcuitest':651","prices":[{"id":"303a7e51-5b44-4818-acad-125a8e4bd9b2","listingId":"a850b7c2-8efb-4b05-82cd-a3bf7991f0b1","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"rkz91","category":"coco","install_from":"skills.sh"},"createdAt":"2026-05-18T13:21:37.444Z"}],"sources":[{"listingId":"a850b7c2-8efb-4b05-82cd-a3bf7991f0b1","source":"github","sourceId":"rkz91/coco/browser-automation","sourceUrl":"https://github.com/rkz91/coco/tree/main/skills/browser-automation","isPrimary":false,"firstSeenAt":"2026-05-18T13:21:37.444Z","lastSeenAt":"2026-05-18T19:14:05.983Z"}],"details":{"listingId":"a850b7c2-8efb-4b05-82cd-a3bf7991f0b1","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"rkz91","slug":"browser-automation","github":{"repo":"rkz91/coco","stars":7,"topics":["agent-skills","agents-md","ai","ai-agents","claude-code","codex","cursor","developer-tools","llm-tools","mcp","pm-tools","product-management","productivity","prompt-engineering","workflow-automation"],"license":"mit","html_url":"https://github.com/rkz91/coco","pushed_at":"2026-04-26T01:51:27Z","description":"Open-source library of AI superpowers — 59 skills, 34 commands, 10 agents + 24 GSD subagents, 3 system bundles. An entire team, wherever your AI lives. Vendor-neutral across Claude Code, Cursor, Codex, and any AGENTS.md tool.","skill_md_sha":"b766d15569f62917140a5fabdeb8cbe5c2309e42","skill_md_path":"skills/browser-automation/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/rkz91/coco/tree/main/skills/browser-automation"},"layout":"multi","source":"github","category":"coco","frontmatter":{"name":"browser-automation","description":"Browser automation for AI agents. Two providers — agent-browser (local CLI with Playwright) and agentic-browser (cloud via inference.sh). Both use the same @e ref-based workflow for navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, and automating browser tasks."},"skills_sh_url":"https://skills.sh/rkz91/coco/browser-automation"},"updatedAt":"2026-05-18T19:14:05.983Z"}}