{"id":"de26b630-558f-4bdf-896c-b008c90c4fdd","shortId":"GuCcHu","kind":"skill","title":"pilot-fleet-health-monitor-setup","tagline":"Deploy a fleet health monitoring system with 3 agents.  Use this skill when: 1. User wants to set up fleet or server health monitoring 2. User is configuring an agent as part of a health monitoring setup 3. User asks about monitoring, alerting, or metrics collection across agents","description":"# Fleet Health Monitor Setup\n\nDeploy 3 agents that monitor server health and aggregate alerts.\n\n## Roles\n\n| Role | Hostname | Skills | Purpose |\n|------|----------|--------|---------|\n| web-monitor | `<prefix>-web-monitor` | pilot-health, pilot-alert, pilot-metrics | Monitors web servers, publishes health alerts |\n| db-monitor | `<prefix>-db-monitor` | pilot-health, pilot-alert, pilot-metrics | Monitors databases, publishes health alerts |\n| alert-hub | `<prefix>-alert-hub` | pilot-webhook-bridge, pilot-alert, pilot-event-filter, pilot-slack-bridge | Aggregates alerts, forwards to humans |\n\n## Setup Procedure\n\n**Step 1:** Ask the user which role this agent should play and what prefix to use.\n\n**Step 2:** Install the skills for the chosen role:\n```bash\n# For web-monitor or db-monitor:\nclawhub install pilot-health pilot-alert pilot-metrics\n\n# For alert-hub:\nclawhub install pilot-webhook-bridge pilot-alert pilot-event-filter pilot-slack-bridge\n```\n\n**Step 3:** Set the hostname:\n```bash\npilotctl --json set-hostname <prefix>-<role>\n```\n\n**Step 4:** Write the setup manifest:\n```bash\nmkdir -p ~/.pilot/setups\ncat > ~/.pilot/setups/fleet-health-monitor.json << 'MANIFEST'\n{\n  \"setup\": \"fleet-health-monitor\",\n  \"setup_name\": \"Fleet Health Monitor\",\n  \"role\": \"<ROLE_ID>\",\n  \"role_name\": \"<ROLE_NAME>\",\n  \"hostname\": \"<prefix>-<role>\",\n  \"description\": \"<ROLE_DESCRIPTION>\",\n  \"skills\": { \"<skill>\": \"<contextual description>\" },\n  \"peers\": [ { \"role\": \"...\", \"hostname\": \"...\", \"description\": \"...\" } ],\n  \"data_flows\": [ { \"direction\": \"send|receive\", \"peer\": \"...\", \"port\": 1002, \"topic\": \"...\", \"description\": \"...\" } ],\n  \"handshakes_needed\": [ \"<peer-hostname>\" ]\n}\nMANIFEST\n```\n\n**Step 5:** Tell the user to initiate handshakes with direct communication peers.\n\n## Manifest Templates Per Role\n\n### web-monitor\n```json\n{\n  \"setup\": \"fleet-health-monitor\",\n  \"setup_name\": \"Fleet Health Monitor\",\n  \"role\": \"web-monitor\",\n  \"role_name\": \"Web Server Monitor\",\n  \"hostname\": \"<prefix>-web-monitor\",\n  \"description\": \"Watches nginx/app health, CPU, memory, and response times. Emits alert events when thresholds are breached.\",\n  \"skills\": {\n    \"pilot-health\": \"Check nginx, app endpoints, SSL certs. Run on schedule or on-demand.\",\n    \"pilot-alert\": \"When health checks fail, publish alert to <prefix>-alert-hub on topic health-alert.\",\n    \"pilot-metrics\": \"Collect CPU, memory, disk, and response time. Format as JSON event payloads.\"\n  },\n  \"peers\": [\n    { \"role\": \"db-monitor\", \"hostname\": \"<prefix>-db-monitor\", \"description\": \"Fellow monitor — does not communicate directly\" },\n    { \"role\": \"alert-hub\", \"hostname\": \"<prefix>-alert-hub\", \"description\": \"Central alert aggregator — receives health-alert events\" }\n  ],\n  \"data_flows\": [\n    { \"direction\": \"send\", \"peer\": \"<prefix>-alert-hub\", \"port\": 1002, \"topic\": \"health-alert\", \"description\": \"Health check failures and metric anomalies\" }\n  ],\n  \"handshakes_needed\": [\"<prefix>-alert-hub\"]\n}\n```\n\n### db-monitor\n```json\n{\n  \"setup\": \"fleet-health-monitor\",\n  \"setup_name\": \"Fleet Health Monitor\",\n  \"role\": \"db-monitor\",\n  \"role_name\": \"Database Monitor\",\n  \"hostname\": \"<prefix>-db-monitor\",\n  \"description\": \"Monitors database connections, query latency, replication lag, and disk usage. Emits alerts on anomalies.\",\n  \"skills\": {\n    \"pilot-health\": \"Check PostgreSQL/MySQL connections, replication lag, disk usage.\",\n    \"pilot-alert\": \"When DB health fails, publish alert to <prefix>-alert-hub on topic health-alert.\",\n    \"pilot-metrics\": \"Collect query latency, connection pool stats, table sizes.\"\n  },\n  \"peers\": [\n    { \"role\": \"web-monitor\", \"hostname\": \"<prefix>-web-monitor\", \"description\": \"Fellow monitor — does not communicate directly\" },\n    { \"role\": \"alert-hub\", \"hostname\": \"<prefix>-alert-hub\", \"description\": \"Central alert aggregator — receives health-alert events\" }\n  ],\n  \"data_flows\": [\n    { \"direction\": \"send\", \"peer\": \"<prefix>-alert-hub\", \"port\": 1002, \"topic\": \"health-alert\", \"description\": \"Database alerts and replication warnings\" }\n  ],\n  \"handshakes_needed\": [\"<prefix>-alert-hub\"]\n}\n```\n\n### alert-hub\n```json\n{\n  \"setup\": \"fleet-health-monitor\",\n  \"setup_name\": \"Fleet Health Monitor\",\n  \"role\": \"alert-hub\",\n  \"role_name\": \"Alert Aggregator\",\n  \"hostname\": \"<prefix>-alert-hub\",\n  \"description\": \"Receives alerts from all monitors, filters duplicates and noise, then forwards critical alerts to Slack and PagerDuty via webhooks.\",\n  \"skills\": {\n    \"pilot-webhook-bridge\": \"Forward critical alerts to Slack and PagerDuty via webhook URLs.\",\n    \"pilot-alert\": \"Subscribe to health-alert from all monitors. Aggregate and deduplicate.\",\n    \"pilot-event-filter\": \"Filter noise and low-severity alerts before forwarding.\",\n    \"pilot-slack-bridge\": \"Post formatted alert summaries to Slack channels.\"\n  },\n  \"peers\": [\n    { \"role\": \"web-monitor\", \"hostname\": \"<prefix>-web-monitor\", \"description\": \"Sends health alerts from web servers\" },\n    { \"role\": \"db-monitor\", \"hostname\": \"<prefix>-db-monitor\", \"description\": \"Sends health alerts from databases\" }\n  ],\n  \"data_flows\": [\n    { \"direction\": \"receive\", \"peer\": \"<prefix>-web-monitor\", \"port\": 1002, \"topic\": \"health-alert\", \"description\": \"Health check failures and metric anomalies\" },\n    { \"direction\": \"receive\", \"peer\": \"<prefix>-db-monitor\", \"port\": 1002, \"topic\": \"health-alert\", \"description\": \"Database alerts and replication warnings\" },\n    { \"direction\": \"send\", \"peer\": \"external\", \"port\": 443, \"topic\": \"slack-forward\", \"description\": \"Filtered alerts to Slack and PagerDuty\" }\n  ],\n  \"handshakes_needed\": [\"<prefix>-web-monitor\", \"<prefix>-db-monitor\"]\n}\n```\n\n## Data Flows\n\n- `web-monitor → alert-hub` : health-alert events (port 1002)\n- `db-monitor → alert-hub` : health-alert events (port 1002)\n- `alert-hub → humans` : forwarded alerts via webhook/announce\n\n## Handshakes\n\n```bash\n# web-monitor and db-monitor handshake with alert-hub:\npilotctl --json handshake <prefix>-alert-hub \"setup: fleet-health-monitor\"\n\n# alert-hub handshakes with both monitors:\npilotctl --json handshake <prefix>-web-monitor \"setup: fleet-health-monitor\"\npilotctl --json handshake <prefix>-db-monitor \"setup: fleet-health-monitor\"\n```\n\n## Workflow Example\n\n```bash\n# On alert-hub — subscribe to health events:\npilotctl --json subscribe <prefix>-web-monitor health-alert\npilotctl --json subscribe <prefix>-db-monitor health-alert\n\n# On web-monitor — publish a health alert:\npilotctl --json publish <prefix>-alert-hub health-alert '{\"host\":\"web-01\",\"status\":\"critical\",\"cpu\":95,\"mem\":88}'\n\n# On db-monitor — publish a database alert:\npilotctl --json publish <prefix>-alert-hub health-alert '{\"host\":\"db-01\",\"status\":\"warning\",\"disk_pct\":88,\"repl_lag_ms\":450}'\n```\n\n## Dependencies\n\nRequires `pilot-protocol` skill, `pilotctl` binary, `clawhub` binary, and a running daemon.","tags":["pilot","fleet","health","monitor","setup","skills","teoslayer","agent-skills","ai-agents","clawhub","networking","openclaw"],"capabilities":["skill","source-teoslayer","skill-pilot-fleet-health-monitor-setup","topic-agent-skills","topic-ai-agents","topic-clawhub","topic-networking","topic-openclaw","topic-overlay-network","topic-p2p","topic-pilot-protocol"],"categories":["pilot-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/TeoSlayer/pilot-skills/pilot-fleet-health-monitor-setup","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add TeoSlayer/pilot-skills","source_repo":"https://github.com/TeoSlayer/pilot-skills","install_from":"skills.sh"}},"qualityScore":"0.453","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 6 github stars · SKILL.md body (6,841 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:14:55.870Z","embedding":null,"createdAt":"2026-05-18T13:22:40.598Z","updatedAt":"2026-05-18T19:14:55.870Z","lastSeenAt":"2026-05-18T19:14:55.870Z","tsv":"'-01':902,928 '/.pilot/setups':229 '/.pilot/setups/fleet-health-monitor.json':231 '1':20,144 '1002':260,417,557,711,730,779,791 '2':31,160 '3':14,44,60,210 '4':221 '443':746 '450':937 '5':267 '88':908,933 '95':906 'across':53 'agent':15,36,54,61,151 'aggreg':67,136,402,542,594,645 'alert':49,68,85,94,106,114,116,119,127,137,184,190,200,319,344,350,353,359,393,397,401,406,414,421,432,472,488,494,497,503,533,537,541,546,554,561,564,571,574,589,593,597,601,612,626,636,641,658,667,684,699,715,734,737,753,772,776,784,788,793,797,812,818,826,859,873,882,890,895,899,916,921,925 'alert-hub':115,118,189,352,392,396,413,431,496,532,536,553,570,573,588,596,771,783,792,811,817,825,858,894,920 'anomali':428,474,722 'app':331 'ask':46,145 'bash':168,214,226,801,856 'binari':945,947 'breach':324 'bridg':124,135,197,208,623,664 'cat':230 'central':400,540 'cert':334 'channel':671 'check':329,347,424,479,718 'chosen':166 'clawhub':177,192,946 'collect':52,363,507 'communic':276,389,529 'configur':34 'connect':463,481,510 'cpu':313,364,905 'critic':611,625,904 'daemon':951 'data':253,408,548,702,766 'databas':111,454,462,563,701,736,915 'db':96,99,175,378,382,435,450,458,490,690,694,727,764,781,807,847,878,911,927 'db-monitor':95,98,174,377,381,434,449,457,689,693,726,763,780,806,846,877,910 'dedupl':647 'demand':341 'depend':938 'deploy':7,59 'descript':247,252,262,309,384,399,422,460,524,539,562,599,681,696,716,735,751 'direct':255,275,390,410,530,550,704,723,741 'disk':366,469,484,931 'duplic':606 'emit':318,471 'endpoint':332 'event':130,203,320,373,407,547,650,777,789,864 'exampl':855 'extern':744 'fail':348,492 'failur':425,719 'fellow':385,525 'filter':131,204,605,651,652,752 'fleet':3,9,26,55,235,240,288,293,440,445,579,584,822,840,851 'fleet-health-monitor':234,287,439,578,821,839,850 'flow':254,409,549,703,767 'format':370,666 'forward':138,610,624,660,750,796 'handshak':263,273,429,568,758,800,809,816,828,834,845 'health':4,10,29,41,56,65,82,93,103,113,181,236,241,289,294,312,328,346,358,405,420,423,441,446,478,491,502,545,560,580,585,640,683,698,714,717,733,775,787,823,841,852,863,872,881,889,898,924 'health-alert':357,404,419,501,544,559,639,713,732,774,786,871,880,897,923 'host':900,926 'hostnam':71,213,219,246,251,305,380,395,456,520,535,595,677,692 'hub':117,120,191,354,394,398,415,433,498,534,538,555,572,575,590,598,773,785,794,813,819,827,860,896,922 'human':140,795 'initi':272 'instal':161,178,193 'json':216,285,372,437,576,815,833,844,866,875,892,918 'lag':467,483,935 'latenc':465,509 'low':656 'low-sever':655 'manifest':225,232,265,278 'mem':907 'memori':314,365 'metric':51,88,109,187,362,427,506,721 'mkdir':227 'monitor':5,11,30,42,48,57,63,76,79,89,97,100,110,172,176,237,242,284,290,295,299,304,308,379,383,386,436,442,447,451,455,459,461,519,523,526,581,586,604,644,676,680,691,695,709,728,762,765,770,782,804,808,824,831,837,842,848,853,870,879,886,912 'ms':936 'name':239,245,292,301,444,453,583,592 'need':264,430,569,759 'nginx':330 'nginx/app':311 'nois':608,653 'on-demand':339 'p':228 'pagerduti':616,630,757 'part':38 'payload':374 'pct':932 'peer':249,258,277,375,412,515,552,672,706,725,743 'per':280 'pilot':2,81,84,87,102,105,108,122,126,129,133,180,183,186,195,199,202,206,327,343,361,477,487,505,621,635,649,662,941 'pilot-alert':83,104,125,182,198,342,486,634 'pilot-event-filt':128,201,648 'pilot-fleet-health-monitor-setup':1 'pilot-health':80,101,179,326,476 'pilot-metr':86,107,185,360,504 'pilot-protocol':940 'pilot-slack-bridg':132,205,661 'pilot-webhook-bridg':121,194,620 'pilotctl':215,814,832,843,865,874,891,917,944 'play':153 'pool':511 'port':259,416,556,710,729,745,778,790 'post':665 'postgresql/mysql':480 'prefix':156 'procedur':142 'protocol':942 'publish':92,112,349,493,887,893,913,919 'purpos':73 'queri':464,508 'receiv':257,403,543,600,705,724 'repl':934 'replic':466,482,566,739 'requir':939 'respons':316,368 'role':69,70,149,167,243,244,250,281,296,300,376,391,448,452,516,531,587,591,673,688 'run':335,950 'schedul':337 'send':256,411,551,682,697,742 'server':28,64,91,303,687 'set':24,211,218 'set-hostnam':217 'setup':6,43,58,141,224,233,238,286,291,438,443,577,582,820,838,849 'sever':657 'size':514 'skill':18,72,163,248,325,475,619,943 'skill-pilot-fleet-health-monitor-setup' 'slack':134,207,614,628,663,670,749,755 'slack-forward':748 'source-teoslayer' 'ssl':333 'stat':512 'status':903,929 'step':143,159,209,220,266 'subscrib':637,861,867,876 'summari':668 'system':12 'tabl':513 'tell':268 'templat':279 'threshold':322 'time':317,369 'topic':261,356,418,500,558,712,731,747 'topic-agent-skills' 'topic-ai-agents' 'topic-clawhub' 'topic-networking' 'topic-openclaw' 'topic-overlay-network' 'topic-p2p' 'topic-pilot-protocol' 'url':633 'usag':470,485 'use':16,158 'user':21,32,45,147,270 'via':617,631,798 'want':22 'warn':567,740,930 'watch':310 'web':75,78,90,171,283,298,302,307,518,522,675,679,686,708,761,769,803,836,869,885,901 'web-monitor':74,77,170,282,297,306,517,521,674,678,707,760,768,802,835,868,884 'webhook':123,196,618,622,632 'webhook/announce':799 'workflow':854 'write':222","prices":[{"id":"11df61d5-58f3-43a7-872f-5cd2bee3f102","listingId":"de26b630-558f-4bdf-896c-b008c90c4fdd","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"TeoSlayer","category":"pilot-skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:22:40.598Z"}],"sources":[{"listingId":"de26b630-558f-4bdf-896c-b008c90c4fdd","source":"github","sourceId":"TeoSlayer/pilot-skills/pilot-fleet-health-monitor-setup","sourceUrl":"https://github.com/TeoSlayer/pilot-skills/tree/main/skills/pilot-fleet-health-monitor-setup","isPrimary":false,"firstSeenAt":"2026-05-18T13:22:40.598Z","lastSeenAt":"2026-05-18T19:14:55.870Z"}],"details":{"listingId":"de26b630-558f-4bdf-896c-b008c90c4fdd","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"TeoSlayer","slug":"pilot-fleet-health-monitor-setup","github":{"repo":"TeoSlayer/pilot-skills","stars":6,"topics":["agent-skills","ai-agents","clawhub","networking","openclaw","overlay-network","p2p","pilot-protocol"],"license":"agpl-3.0","html_url":"https://github.com/TeoSlayer/pilot-skills","pushed_at":"2026-05-13T06:08:49Z","description":"80+ agent skills for Pilot Protocol — communication, file transfer, trust, task routing, swarm coordination, and more","skill_md_sha":"bfdd52f1050d4273f2703ad60b86a23d41cd820d","skill_md_path":"skills/pilot-fleet-health-monitor-setup/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/TeoSlayer/pilot-skills/tree/main/skills/pilot-fleet-health-monitor-setup"},"layout":"multi","source":"github","category":"pilot-skills","frontmatter":{"name":"pilot-fleet-health-monitor-setup","license":"AGPL-3.0","description":"Deploy a fleet health monitoring system with 3 agents.  Use this skill when: 1. User wants to set up fleet or server health monitoring 2. User is configuring an agent as part of a health monitoring setup 3. User asks about monitoring, alerting, or metrics collection across agents  Do NOT use this skill when: - User wants a single health check (use pilot-health instead) - User wants to send a one-off alert (use pilot-alert instead)"},"skills_sh_url":"https://skills.sh/TeoSlayer/pilot-skills/pilot-fleet-health-monitor-setup"},"updatedAt":"2026-05-18T19:14:55.870Z"}}