{"id":"499fe22c-0f0f-48d6-aec5-1e71c89941dd","shortId":"be5wjQ","kind":"skill","title":"Azure Resource Health Diagnose","tagline":"Awesome Copilot skill by Github","description":"# Azure Resource Health & Issue Diagnosis\n\nThis workflow analyzes a specific Azure resource to assess its health status, diagnose potential issues using logs and telemetry data, and develop a comprehensive remediation plan for any problems discovered.\n\n## Prerequisites\n- Azure MCP server configured and authenticated\n- Target Azure resource identified (name and optionally resource group/subscription)\n- Resource must be deployed and running to generate logs/telemetry\n- Prefer Azure MCP tools (`azmcp-*`) over direct Azure CLI when available\n\n## Workflow Steps\n\n### Step 1: Get Azure Best Practices\n**Action**: Retrieve diagnostic and troubleshooting best practices\n**Tools**: Azure MCP best practices tool\n**Process**:\n1. **Load Best Practices**:\n   - Execute Azure best practices tool to get diagnostic guidelines\n   - Focus on health monitoring, log analysis, and issue resolution patterns\n   - Use these practices to inform diagnostic approach and remediation recommendations\n\n### Step 2: Resource Discovery & Identification\n**Action**: Locate and identify the target Azure resource\n**Tools**: Azure MCP tools + Azure CLI fallback\n**Process**:\n1. **Resource Lookup**:\n   - If only resource name provided: Search across subscriptions using `azmcp-subscription-list`\n   - Use `az resource list --name <resource-name>` to find matching resources\n   - If multiple matches found, prompt user to specify subscription/resource group\n   - Gather detailed resource information:\n     - Resource type and current status\n     - Location, tags, and configuration\n     - Associated services and dependencies\n\n2. **Resource Type Detection**:\n   - Identify resource type to determine appropriate diagnostic approach:\n     - **Web Apps/Function Apps**: Application logs, performance metrics, dependency tracking\n     - **Virtual Machines**: System logs, performance counters, boot diagnostics\n     - **Cosmos DB**: Request metrics, throttling, partition statistics\n     - **Storage Accounts**: Access logs, performance metrics, availability\n     - **SQL Database**: Query performance, connection logs, resource utilization\n     - **Application Insights**: Application telemetry, exceptions, dependencies\n     - **Key Vault**: Access logs, certificate status, secret usage\n     - **Service Bus**: Message metrics, dead letter queues, throughput\n\n### Step 3: Health Status Assessment\n**Action**: Evaluate current resource health and availability\n**Tools**: Azure MCP monitoring tools + Azure CLI\n**Process**:\n1. **Basic Health Check**:\n   - Check resource provisioning state and operational status\n   - Verify service availability and responsiveness\n   - Review recent deployment or configuration changes\n   - Assess current resource utilization (CPU, memory, storage, etc.)\n\n2. **Service-Specific Health Indicators**:\n   - **Web Apps**: HTTP response codes, response times, uptime\n   - **Databases**: Connection success rate, query performance, deadlocks\n   - **Storage**: Availability percentage, request success rate, latency\n   - **VMs**: Boot diagnostics, guest OS metrics, network connectivity\n   - **Functions**: Execution success rate, duration, error frequency\n\n### Step 4: Log & Telemetry Analysis\n**Action**: Analyze logs and telemetry to identify issues and patterns\n**Tools**: Azure MCP monitoring tools for Log Analytics queries\n**Process**:\n1. **Find Monitoring Sources**:\n   - Use `azmcp-monitor-workspace-list` to identify Log Analytics workspaces\n   - Locate Application Insights instances associated with the resource\n   - Identify relevant log tables using `azmcp-monitor-table-list`\n\n2. **Execute Diagnostic Queries**:\n   Use `azmcp-monitor-log-query` with targeted KQL queries based on resource type:\n\n   **General Error Analysis**:\n   ```kql\n   // Recent errors and exceptions\n   union isfuzzy=true \n       AzureDiagnostics,\n       AppServiceHTTPLogs,\n       AppServiceAppLogs,\n       AzureActivity\n   | where TimeGenerated > ago(24h)\n   | where Level == \"Error\" or ResultType != \"Success\"\n   | summarize ErrorCount=count() by Resource, ResultType, bin(TimeGenerated, 1h)\n   | order by TimeGenerated desc\n   ```\n\n   **Performance Analysis**:\n   ```kql\n   // Performance degradation patterns\n   Perf\n   | where TimeGenerated > ago(7d)\n   | where ObjectName == \"Processor\" and CounterName == \"% Processor Time\"\n   | summarize avg(CounterValue) by Computer, bin(TimeGenerated, 1h)\n   | where avg_CounterValue > 80\n   ```\n\n   **Application-Specific Queries**:\n   ```kql\n   // Application Insights - Failed requests\n   requests\n   | where timestamp > ago(24h)\n   | where success == false\n   | summarize FailureCount=count() by resultCode, bin(timestamp, 1h)\n   | order by timestamp desc\n   \n   // Database - Connection failures\n   AzureDiagnostics\n   | where ResourceProvider == \"MICROSOFT.SQL\"\n   | where Category == \"SQLSecurityAuditEvents\"\n   | where action_name_s == \"CONNECTION_FAILED\"\n   | summarize ConnectionFailures=count() by bin(TimeGenerated, 1h)\n   ```\n\n3. **Pattern Recognition**:\n   - Identify recurring error patterns or anomalies\n   - Correlate errors with deployment times or configuration changes\n   - Analyze performance trends and degradation patterns\n   - Look for dependency failures or external service issues\n\n### Step 5: Issue Classification & Root Cause Analysis\n**Action**: Categorize identified issues and determine root causes\n**Process**:\n1. **Issue Classification**:\n   - **Critical**: Service unavailable, data loss, security breaches\n   - **High**: Performance degradation, intermittent failures, high error rates\n   - **Medium**: Warnings, suboptimal configuration, minor performance issues\n   - **Low**: Informational alerts, optimization opportunities\n\n2. **Root Cause Analysis**:\n   - **Configuration Issues**: Incorrect settings, missing dependencies\n   - **Resource Constraints**: CPU/memory/disk limitations, throttling\n   - **Network Issues**: Connectivity problems, DNS resolution, firewall rules\n   - **Application Issues**: Code bugs, memory leaks, inefficient queries\n   - **External Dependencies**: Third-party service failures, API limits\n   - **Security Issues**: Authentication failures, certificate expiration\n\n3. **Impact Assessment**:\n   - Determine business impact and affected users/systems\n   - Evaluate data integrity and security implications\n   - Assess recovery time objectives and priorities\n\n### Step 6: Generate Remediation Plan\n**Action**: Create a comprehensive plan to address identified issues\n**Process**:\n1. **Immediate Actions** (Critical issues):\n   - Emergency fixes to restore service availability\n   - Temporary workarounds to mitigate impact\n   - Escalation procedures for complex issues\n\n2. **Short-term Fixes** (High/Medium issues):\n   - Configuration adjustments and resource scaling\n   - Application updates and patches\n   - Monitoring and alerting improvements\n\n3. **Long-term Improvements** (All issues):\n   - Architectural changes for better resilience\n   - Preventive measures and monitoring enhancements\n   - Documentation and process improvements\n\n4. **Implementation Steps**:\n   - Prioritized action items with specific Azure CLI commands\n   - Testing and validation procedures\n   - Rollback plans for each change\n   - Monitoring to verify issue resolution\n\n### Step 7: User Confirmation & Report Generation\n**Action**: Present findings and get approval for remediation actions\n**Process**:\n1. **Display Health Assessment Summary**:\n   ```\n   🏥 Azure Resource Health Assessment\n   \n   📊 Resource Overview:\n   • Resource: [Name] ([Type])\n   • Status: [Healthy/Warning/Critical]\n   • Location: [Region]\n   • Last Analyzed: [Timestamp]\n   \n   🚨 Issues Identified:\n   • Critical: X issues requiring immediate attention\n   • High: Y issues affecting performance/reliability  \n   • Medium: Z issues for optimization\n   • Low: N informational items\n   \n   🔍 Top Issues:\n   1. [Issue Type]: [Description] - Impact: [High/Medium/Low]\n   2. [Issue Type]: [Description] - Impact: [High/Medium/Low]\n   3. [Issue Type]: [Description] - Impact: [High/Medium/Low]\n   \n   🛠️ Remediation Plan:\n   • Immediate Actions: X items\n   • Short-term Fixes: Y items  \n   • Long-term Improvements: Z items\n   • Estimated Resolution Time: [Timeline]\n   \n   ❓ Proceed with detailed remediation plan? (y/n)\n   ```\n\n2. **Generate Detailed Report**:\n   ```markdown\n   # Azure Resource Health Report: [Resource Name]\n   \n   **Generated**: [Timestamp]  \n   **Resource**: [Full Resource ID]  \n   **Overall Health**: [Status with color indicator]\n   \n   ## 🔍 Executive Summary\n   [Brief overview of health status and key findings]\n   \n   ## 📊 Health Metrics\n   - **Availability**: X% over last 24h\n   - **Performance**: [Average response time/throughput]\n   - **Error Rate**: X% over last 24h\n   - **Resource Utilization**: [CPU/Memory/Storage percentages]\n   \n   ## 🚨 Issues Identified\n   \n   ### Critical Issues\n   - **[Issue 1]**: [Description]\n     - **Root Cause**: [Analysis]\n     - **Impact**: [Business impact]\n     - **Immediate Action**: [Required steps]\n   \n   ### High Priority Issues  \n   - **[Issue 2]**: [Description]\n     - **Root Cause**: [Analysis]\n     - **Impact**: [Performance/reliability impact]\n     - **Recommended Fix**: [Solution steps]\n   \n   ## 🛠️ Remediation Plan\n   \n   ### Phase 1: Immediate Actions (0-2 hours)\n   ```bash\n   # Critical fixes to restore service\n   [Azure CLI commands with explanations]\n   ```\n   \n   ### Phase 2: Short-term Fixes (2-24 hours)\n   ```bash\n   # Performance and reliability improvements\n   [Azure CLI commands with explanations]\n   ```\n   \n   ### Phase 3: Long-term Improvements (1-4 weeks)\n   ```bash\n   # Architectural and preventive measures\n   [Azure CLI commands and configuration changes]\n   ```\n   \n   ## 📈 Monitoring Recommendations\n   - **Alerts to Configure**: [List of recommended alerts]\n   - **Dashboards to Create**: [Monitoring dashboard suggestions]\n   - **Regular Health Checks**: [Recommended frequency and scope]\n   \n   ## ✅ Validation Steps\n   - [ ] Verify issue resolution through logs\n   - [ ] Confirm performance improvements\n   - [ ] Test application functionality\n   - [ ] Update monitoring and alerting\n   - [ ] Document lessons learned\n   \n   ## 📝 Prevention Measures\n   - [Recommendations to prevent similar issues]\n   - [Process improvements]\n   - [Monitoring enhancements]\n   ```\n\n## Error Handling\n- **Resource Not Found**: Provide guidance on resource name/location specification\n- **Authentication Issues**: Guide user through Azure authentication setup\n- **Insufficient Permissions**: List required RBAC roles for resource access\n- **No Logs Available**: Suggest enabling diagnostic settings and waiting for data\n- **Query Timeouts**: Break down analysis into smaller time windows\n- **Service-Specific Issues**: Provide generic health assessment with limitations noted\n\n## Success Criteria\n- ✅ Resource health status accurately assessed\n- ✅ All significant issues identified and categorized\n- ✅ Root cause analysis completed for major problems\n- ✅ Actionable remediation plan with specific steps provided\n- ✅ Monitoring and prevention recommendations included\n- ✅ Clear prioritization of issues by business impact\n- ✅ Implementation steps include validation and rollback procedures","tags":["azure","resource","health","diagnose","awesome","copilot","github"],"capabilities":["skill","source-github","category-awesome-copilot"],"categories":["awesome-copilot"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/github/awesome-copilot/azure-resource-health-diagnose","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"install_from":"skills.sh"}},"qualityScore":"0.300","qualityRationale":"deterministic score 0.30 from registry signals: · indexed on skills.sh · published under github/awesome-copilot","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill:v1","enrichmentVersion":1,"enrichedAt":"2026-04-22T08:40:12.570Z","embedding":null,"createdAt":"2026-04-18T20:26:14.465Z","updatedAt":"2026-04-22T08:40:12.570Z","lastSeenAt":"2026-04-22T08:40:12.570Z","tsv":"'-2':1018 '-24':1038 '-4':1057 '0':1017 '1':84,103,157,302,400,618,730,833,878,983,1014,1056 '1h':484,514,543,570 '2':137,209,332,433,648,751,884,924,999,1032,1037 '24h':469,532,963,973 '3':283,571,694,771,890,1051 '4':376,792 '5':603 '6':716 '7':818 '7d':499 '80':518 'access':247,268,1150 'account':246 'accur':1187 'across':166 'action':89,141,287,380,559,609,720,732,796,823,831,899,992,1016,1202 'address':726 'adjust':759 'affect':701,865 'ago':468,498,531 'alert':645,769,1072,1078,1108 'analysi':121,379,453,490,608,651,987,1003,1166,1197 'analyt':397,413 'analyz':17,381,588,852 'anomali':579 'api':686 'app':223,339 'applic':224,260,262,416,520,524,671,763,1103 'application-specif':519 'approach':132,220 'appropri':218 'approv':828 'apps/function':222 'appserviceapplog':464 'appservicehttplog':463 'architectur':778,1060 'assess':23,286,324,696,709,836,841,1178,1188 'associ':205,419 'attent':861 'authent':51,690,1134,1140 'avail':80,251,293,315,354,740,959,1153 'averag':965 'avg':508,516 'awesom':5 'az':174 'azmcp':74,170,406,429,439 'azmcp-monitor-log-queri':438 'azmcp-monitor-table-list':428 'azmcp-monitor-workspace-list':405 'azmcp-subscription-list':169 'azur':1,10,20,46,53,71,77,86,97,108,147,150,153,295,299,391,800,838,929,1026,1045,1064,1139 'azureact':465 'azurediagnost':462,551 'base':447 'bash':1020,1040,1059 'basic':303 'best':87,94,99,105,109 'better':781 'bin':482,512,541,568 'boot':236,361 'breach':627 'break':1164 'brief':949 'bug':674 'bus':275 'busi':698,989,1219 'categor':610,1194 'categori':556 'category-awesome-copilot' 'caus':607,616,650,986,1002,1196 'certif':270,692 'chang':323,587,779,811,1069 'check':305,306,1087 'classif':605,620 'clear':1214 'cli':78,154,300,801,1027,1046,1065 'code':342,673 'color':945 'command':802,1028,1047,1066 'complet':1198 'complex':749 'comprehens':38,723 'comput':511 'configur':49,204,322,586,639,652,758,1068,1074 'confirm':820,1099 'connect':256,347,367,549,562,665 'connectionfailur':565 'constraint':659 'copilot':6 'correl':580 'cosmos':238 'count':478,538,566 'counter':235 'counternam':504 'countervalu':509,517 'cpu':328 'cpu/memory/disk':660 'cpu/memory/storage':976 'creat':721,1081 'criteria':1183 'critic':621,733,856,980,1021 'current':199,289,325 'dashboard':1079,1083 'data':34,624,704,1161 'databas':253,346,548 'db':239 'dead':278 'deadlock':352 'degrad':493,592,630 'depend':208,228,265,596,657,680 'deploy':64,320,583 'desc':488,547 'descript':881,887,893,984,1000 'detail':193,920,926 'detect':212 'determin':217,614,697 'develop':36 'diagnos':4,27 'diagnosi':14 'diagnost':91,114,131,219,237,362,435,1156 'direct':76 'discov':44 'discoveri':139 'display':834 'dns':667 'document':788,1109 'durat':372 'emerg':735 'enabl':1155 'enhanc':787,1122 'error':373,452,456,472,576,581,634,968,1123 'errorcount':477 'escal':746 'estim':914 'etc':331 'evalu':288,703 'except':264,458 'execut':107,369,434,947 'expir':693 'explan':1030,1049 'extern':599,679 'fail':526,563 'failur':550,597,632,685,691 'failurecount':537 'fallback':155 'fals':535 'find':179,401,825,956 'firewal':669 'fix':736,755,905,1008,1022,1036 'focus':116 'found':185,1127 'frequenc':374,1089 'full':938 'function':368,1104 'gather':192 'general':451 'generat':68,717,822,925,935 'generic':1176 'get':85,113,827 'github':9 'group':191 'group/subscription':60 'guest':363 'guid':1136 'guidanc':1129 'guidelin':115 'handl':1124 'health':3,12,25,118,284,291,304,336,835,840,931,942,952,957,1086,1177,1185 'healthy/warning/critical':848 'high':628,633,862,995 'high/medium':756 'high/medium/low':883,889,895 'hour':1019,1039 'http':340 'id':940 'identif':140 'identifi':55,144,213,386,411,423,574,611,727,855,979,1192 'immedi':731,860,898,991,1015 'impact':695,699,745,882,888,894,988,990,1004,1006,1220 'implement':793,1221 'implic':708 'improv':770,775,791,911,1044,1055,1101,1120 'includ':1213,1223 'incorrect':654 'indic':337,946 'ineffici':677 'inform':130,195,644,874 'insight':261,417,525 'instanc':418 'insuffici':1142 'integr':705 'intermitt':631 'isfuzzi':460 'issu':13,29,123,387,601,604,612,619,642,653,664,672,689,728,734,750,757,777,815,854,858,864,869,877,879,885,891,978,981,982,997,998,1095,1118,1135,1174,1191,1217 'item':797,875,901,907,913 'key':266,955 'kql':445,454,491,523 'last':851,962,972 'latenc':359 'leak':676 'learn':1111 'lesson':1110 'letter':279 'level':471 'limit':661,687,1180 'list':172,176,409,432,1075,1144 'load':104 'locat':142,201,415,849 'log':31,120,225,233,248,257,269,377,382,396,412,425,441,1098,1152 'logs/telemetry':69 'long':773,909,1053 'long-term':772,908,1052 'look':594 'lookup':159 'loss':625 'low':643,872 'machin':231 'major':1200 'markdown':928 'match':180,184 'mcp':47,72,98,151,296,392 'measur':784,1063,1113 'medium':636,867 'memori':329,675 'messag':276 'metric':227,241,250,277,365,958 'microsoft.sql':554 'minor':640 'miss':656 'mitig':744 'monitor':119,297,393,402,407,430,440,767,786,812,1070,1082,1106,1121,1209 'multipl':183 'must':62 'n':873 'name':56,163,177,560,845,934 'name/location':1132 'network':366,663 'note':1181 'object':712 'objectnam':501 'oper':311 'opportun':647 'optim':646,871 'option':58 'order':485,544 'os':364 'overal':941 'overview':843,950 'parti':683 'partit':243 'patch':766 'pattern':125,389,494,572,577,593 'percentag':355,977 'perf':495 'perform':226,234,249,255,351,489,492,589,629,641,964,1041,1100 'performance/reliability':866,1005 'permiss':1143 'phase':1013,1031,1050 'plan':40,719,724,808,897,922,1012,1204 'potenti':28 'practic':88,95,100,106,110,128 'prefer':70 'prerequisit':45 'present':824 'prevent':783,1062,1112,1116,1211 'priorit':795,1215 'prioriti':714,996 'problem':43,666,1201 'procedur':747,806,1227 'proceed':918 'process':102,156,301,399,617,729,790,832,1119 'processor':502,505 'prompt':186 'provid':164,1128,1175,1208 'provis':308 'queri':254,350,398,436,442,446,522,678,1162 'queue':280 'rate':349,358,371,635,969 'rbac':1146 'recent':319,455 'recognit':573 'recommend':135,1007,1071,1077,1088,1114,1212 'recoveri':710 'recur':575 'region':850 'regular':1085 'relev':424 'reliabl':1043 'remedi':39,134,718,830,896,921,1011,1203 'report':821,927,932 'request':240,356,527,528 'requir':859,993,1145 'resili':782 'resolut':124,668,816,915,1096 'resourc':2,11,21,54,59,61,138,148,158,162,175,181,194,196,210,214,258,290,307,326,422,449,480,658,761,839,842,844,930,933,937,939,974,1125,1131,1149,1184 'resourceprovid':553 'respons':317,341,343,966 'restor':738,1024 'resultcod':540 'resulttyp':474,481 'retriev':90 'review':318 'role':1147 'rollback':807,1226 'root':606,615,649,985,1001,1195 'rule':670 'run':66 'scale':762 'scope':1091 'search':165 'secret':272 'secur':626,688,707 'server':48 'servic':206,274,314,334,600,622,684,739,1025,1172 'service-specif':333,1171 'set':655,1157 'setup':1141 'short':753,903,1034 'short-term':752,902,1033 'signific':1190 'similar':1117 'skill':7 'smaller':1168 'solut':1009 'sourc':403 'source-github' 'specif':19,335,521,799,1133,1173,1206 'specifi':189 'sql':252 'sqlsecurityauditev':557 'state':309 'statist':244 'status':26,200,271,285,312,847,943,953,1186 'step':82,83,136,282,375,602,715,794,817,994,1010,1093,1207,1222 'storag':245,330,353 'suboptim':638 'subscript':167,171 'subscription/resource':190 'success':348,357,370,475,534,1182 'suggest':1084,1154 'summar':476,507,536,564 'summari':837,948 'system':232 'tabl':426,431 'tag':202 'target':52,146,444 'telemetri':33,263,378,384 'temporari':741 'term':754,774,904,910,1035,1054 'test':803,1102 'third':682 'third-parti':681 'throttl':242,662 'throughput':281 'time':344,506,584,711,916,1169 'time/throughput':967 'timegener':467,483,487,497,513,569 'timelin':917 'timeout':1163 'timestamp':530,542,546,853,936 'tool':73,96,101,111,149,152,294,298,390,394 'top':876 'track':229 'trend':590 'troubleshoot':93 'true':461 'type':197,211,215,450,846,880,886,892 'unavail':623 'union':459 'updat':764,1105 'uptim':345 'usag':273 'use':30,126,168,173,404,427,437 'user':187,819,1137 'users/systems':702 'util':259,327,975 'valid':805,1092,1224 'vault':267 'verifi':313,814,1094 'virtual':230 'vms':360 'wait':1159 'warn':637 'web':221,338 'week':1058 'window':1170 'workaround':742 'workflow':16,81 'workspac':408,414 'x':857,900,960,970 'y':863,906 'y/n':923 'z':868,912","prices":[{"id":"7e3f437a-f805-4eb6-a068-f5880d5afa78","listingId":"499fe22c-0f0f-48d6-aec5-1e71c89941dd","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"github","category":"awesome-copilot","install_from":"skills.sh"},"createdAt":"2026-04-18T20:26:14.465Z"}],"sources":[{"listingId":"499fe22c-0f0f-48d6-aec5-1e71c89941dd","source":"github","sourceId":"github/awesome-copilot/azure-resource-health-diagnose","sourceUrl":"https://github.com/github/awesome-copilot/tree/main/skills/azure-resource-health-diagnose","isPrimary":false,"firstSeenAt":"2026-04-18T21:48:25.399Z","lastSeenAt":"2026-04-22T06:52:15.475Z"},{"listingId":"499fe22c-0f0f-48d6-aec5-1e71c89941dd","source":"skills_sh","sourceId":"github/awesome-copilot/azure-resource-health-diagnose","sourceUrl":"https://skills.sh/github/awesome-copilot/azure-resource-health-diagnose","isPrimary":true,"firstSeenAt":"2026-04-18T20:26:14.465Z","lastSeenAt":"2026-04-22T08:40:12.570Z"}],"details":{"listingId":"499fe22c-0f0f-48d6-aec5-1e71c89941dd","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"github","slug":"azure-resource-health-diagnose","source":"skills_sh","category":"awesome-copilot","skills_sh_url":"https://skills.sh/github/awesome-copilot/azure-resource-health-diagnose"},"updatedAt":"2026-04-22T08:40:12.570Z"}}