{"id":"60d9ea8e-3dc2-4bfd-999e-3e6cc3dd6b52","shortId":"La6erq","kind":"skill","title":"spark-history-cli","tagline":"Query a running Apache Spark History Server from Copilot CLI. Use this whenever the user wants to inspect SHS applications, jobs, stages, executors, SQL executions, environment details, or event logs, especially when they mention Spark History Server, SHS, event log history, benchmark runs, or application IDs.","description":"# spark-history-cli\n\nUse this skill when the task is about exploring or debugging data exposed by a running Apache Spark History Server.\n\n## Installation\n\n```bash\npip install spark-history-cli\n```\n\nIf `spark-history-cli` is not on `PATH` after install:\n\n```bash\npython -m spark_history_cli --json apps\n```\n\n## Why use this skill\n\n- It gives you a purpose-built CLI instead of scraping the Spark History Server web UI.\n- It wraps the REST API cleanly and already handles attempt-ID resolution for multi-attempt apps.\n- It supports `--json`, which makes downstream reasoning and comparisons much easier.\n\n## Workflow\n\n1. Prefer the CLI over raw REST calls.\n2. Prefer `--json` unless the user explicitly wants a human-formatted table.\n3. Use `--server <url>` or `SPARK_HISTORY_SERVER` to point at the right SHS. If the user does not specify one, assume `http://localhost:18080`.\n4. Start broad, then drill down:\n   - list applications\n   - choose the relevant app\n   - inspect jobs, stages, executors, SQL executions, environment, or logs\n5. If the user says \"latest app\", \"recent run\", or similar, list apps first and choose the most relevant recent application before continuing.\n6. 
If the CLI is unavailable, install it with `python -m pip install spark-history-cli` if tool permissions allow it.\n\n## Command patterns\n\n```bash\nspark-history-cli --json --server http://localhost:18080 apps\nspark-history-cli --json --server http://localhost:18080 app <app-id>\nspark-history-cli --json --server http://localhost:18080 --app-id <app-id> jobs\nspark-history-cli --json --server http://localhost:18080 --app-id <app-id> stages\nspark-history-cli --json --server http://localhost:18080 --app-id <app-id> executors --all\nspark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql\nspark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql-plan <exec-id> --view final\nspark-history-cli --server http://localhost:18080 --app-id <app-id> sql-plan <exec-id> --dot -o plan.dot\nspark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql-jobs <exec-id>\nspark-history-cli --json --server http://localhost:18080 --app-id <app-id> summary\nspark-history-cli --json --server http://localhost:18080 --app-id <app-id> env\nspark-history-cli --server http://localhost:18080 --app-id <app-id> logs output.zip\n```\n\nIf `spark-history-cli` is not on `PATH`, use:\n\n```bash\npython -m spark_history_cli --json apps\n```\n\n## What to reach for\n\n- `apps` for recent runs, durations, status, and picking candidates\n- `app <id>` for high-level details about one run\n- `attempts` for multi-attempt apps (list or show specific attempt details)\n- `jobs`, `job <id>` for job-level failures or progress\n- `job-stages <id>` for stages belonging to a job\n- `stages`, `stage <id>` for task/stage bottlenecks\n- `stage-summary <id>` for task metric quantiles (p5/p25/p50/p75/p95) — duration, GC, memory, shuffle, I/O\n- `stage-tasks <id>` for individual task details — sorted by runtime to find stragglers\n- `executors --all` for executor churn or skew investigations\n- `sql` for SQL execution history and plan graph data\n- `sql-plan 
<id>` for SQL plan extraction:\n  - `--view full` (default): full plan text\n  - `--view initial`: only the Initial Plan (pre-AQE)\n  - `--view final`: only the Final Plan (post-AQE)\n  - `--dot`: Graphviz DOT output for visualizing the plan DAG\n  - `--json` + `--view`: structured JSON with `isAdaptive`, `sectionCount`, `plan`, and `sections`\n  - `-o <file>`: write output to file instead of stdout\n- `sql-jobs <id>` for jobs associated with a SQL execution (fetches all linked jobs by ID)\n- `summary` for a concise application overview: app info, resource config (driver/executor/shuffle), and workload stats (jobs/stages/tasks/SQL)\n- `env` for Spark config/runtime context\n- `logs` only when the user explicitly wants the event log archive saved locally\n\n## Practical guidance\n\n- Preserve the user's server URL if they gave one explicitly.\n- Summarize findings after retrieving JSON; do not dump raw JSON unless the user asked for it.\n- Treat event logs and benchmark history as potentially sensitive. Download them only when necessary and keep them local.\n- This CLI needs a running Spark History Server. 
It does not replace SHS and it does not parse raw event logs directly.\n\n## Troubleshooting\n\n| Issue | Solution |\n|-------|----------|\n| `Connection refused` | SHS not running — start with `$SPARK_HOME/sbin/start-history-server.sh` |\n| `404 Not Found` on app | App ID may include attempt suffix — use `apps` to list valid IDs |\n| No apps listed | Check `spark.history.fs.logDirectory` points to the right event log path |\n| `ModuleNotFoundError` | CLI not installed — run `pip install spark-history-cli` |\n| Wrong server | Set `SPARK_HISTORY_SERVER` env var or use `--server <url>` |\n| Timeout on large apps | SHS may be parsing event logs — wait and retry, or check SHS logs |","tags":["spark","history","cli","yaooqinn","agent-skills","benchmark","diagnostics","gluten","performance","spark-history-server","tpc-ds","velox"],"capabilities":["skill","source-yaooqinn","skill-spark-history-cli","topic-agent-skills","topic-benchmark","topic-cli","topic-diagnostics","topic-gluten","topic-performance","topic-spark","topic-spark-history-server","topic-tpc-ds","topic-velox"],"categories":["spark-history-cli"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/yaooqinn/spark-history-cli/spark-history-cli","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add yaooqinn/spark-history-cli","source_repo":"https://github.com/yaooqinn/spark-history-cli","install_from":"skills.sh"}},"qualityScore":"0.461","qualityRationale":"deterministic score 0.46 from registry signals: · indexed on github topic:agent-skills · 22 github stars · SKILL.md body (4,905 
chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-04-23T13:02:30.930Z","embedding":null,"createdAt":"2026-04-18T23:05:37.642Z","updatedAt":"2026-04-23T13:02:30.930Z","lastSeenAt":"2026-04-23T13:02:30.930Z","tsv":"'1':145 '18080':188,265,274,283,295,307,320,332,347,364,378,390,401 '2':153 '3':166 '4':189 '404':713 '5':210 '6':233 'allow':253 'alreadi':122 'apach':8,67 'api':119 'app':93,132,200,216,222,266,275,285,297,309,322,334,349,366,380,392,403,424,429,438,452,605,717,718,725,731,767 'app-id':284,296,308,321,333,348,365,379,391,402 'applic':24,196,230,603 'aqe':546,555 'archiv':629 'ask':658 'associ':588 'assum':186 'attempt':125,131,447,451,457,722 'attempt-id':124 'bash':72,86,257,417 'belong':473 'benc':46 'benchmark':665 'bottleneck':481 'broad':191 'built':104 'call':152 'candid':437 'check':733,778 'choos':197,225 'churn':512 'clean':120 'cli':4,14,50,78,91,105,148,236,249,261,270,279,291,303,316,328,344,360,374,386,398,411,422,680,743,752 'command':255 'comparison':141 'concis':602 'config':608 'config/runtime':617 'connect':704 'context':618 'continu':232 'copilot':13 'dag':564 'data':62,524 'debug':61 'default':534 'detail':31,443,458,501 'direct':700 'dot':354,556,558 'download':670 'downstream':138 'drill':193 'driver/executor/shuffle':609 'dump':652 'durat':433,490 'easier':143 'env':394,614,759 'environ':30,207 'especi':35 'event':33,43,627,662,698,739,772 'execut':29,206,519,592 'executor':27,204,311,508,511 'explicit':159,624,644 'explor':59 'expos':63 'extract':531 'failur':465 'fetch':593 'file':579 'final':340,548,551 'find':506,646 'first':223 'format':164 'found':715 'full':533,535 'gave':642 'gc':491 'give':99 'graph':523 'graphviz':557 'guidanc':633 'handl':123 'high':441 
'high-level':440 'histori':3,10,40,45,49,69,77,90,111,171,248,260,269,278,290,302,315,327,343,359,373,385,397,410,421,520,666,685,751,757 'home/sbin/start-history-server.sh':712 'human':163 'human-format':162 'i/o':494 'id':126,286,298,310,323,335,350,367,381,393,404,598,719,729 'includ':721 'individu':499 'info':606 'initi':539,542 'inspect':22,201 'instal':71,74,85,239,245,745,748 'instead':106,580 'investig':515 'isadapt':570 'issu':702 'job':25,202,287,370,459,460,463,469,476,585,587,596 'job-level':462 'job-stag':468 'jobs/stages/tasks/sql':613 'json':92,135,155,262,271,280,292,304,317,329,361,375,387,423,565,568,649,654 'keep':676 'larg':766 'latest':215 'level':442,464 'link':595 'list':195,221,453,727,732 'local':631,678 'localhost':187,264,273,282,294,306,319,331,346,363,377,389,400 'log':34,44,209,405,619,628,663,699,740,773,780 'm':88,243,419 'make':137 'may':720,769 'memori':492 'mention':38 'metric':487 'modulenotfounderror':742 'much':142 'multi':130,450 'multi-attempt':129,449 'necessari':674 'need':681 'o':355,575 'one':185,445,643 'output':559,577 'output.zip':406 'overview':604 'p5/p25/p50/p75/p95':489 'pars':696,771 'path':83,415,741 'pattern':256 'permiss':252 'pick':436 'pip':73,244,747 'plan':338,353,522,527,530,536,543,552,563,572 'plan.dot':356 'point':174,735 'post':554 'post-aq':553 'potenti':668 'practic':632 'pre':545 'pre-aq':544 'prefer':146,154 'preserv':634 'progress':467 'purpos':103 'purpose-built':102 'python':87,242,418 'quantil':488 'queri':5 'raw':150,653,697 'reach':427 'reason':139 'recent':217,229,431 'refus':705 'relev':199,228 'replac':690 'resolut':127 'resourc':607 'rest':118,151 'retri':776 'retriev':648 'right':177,738 'run':7,66,218,432,446,683,708,746 'runtim':504 'save':630 'say':214 'scrape':108 'section':574 'sectioncount':571 'sensit':669 'server':11,41,70,112,168,172,263,272,281,293,305,318,330,345,362,376,388,399,638,686,754,758,763 'set':755 'show':455 'shs':23,42,178,691,706,768,779 'shuffl':493 'similar':220 
'skew':514 'skill':53,97 'skill-spark-history-cli' 'solut':703 'sort':502 'source-yaooqinn' 'spark':2,9,39,48,68,76,89,110,170,247,259,268,277,289,301,314,326,342,358,372,384,396,409,420,616,684,711,750,756 'spark-history-c':1,47,75,246,258,267,276,288,300,313,325,341,357,371,383,395,408,749 'spark.history.fs.logdirectory':734 'specif':456 'specifi':184 'sql':28,205,324,337,352,369,516,518,526,529,584,591 'sql-job':368,583 'sql-plan':336,351,525 'stage':26,203,299,470,472,477,478,483,496 'stage-summari':482 'stage-task':495 'start':190,709 'stat':612 'status':434 'stdout':582 'straggler':507 'structur':567 'suffix':723 'summar':645 'summari':382,484,599 'support':134 'tabl':165 'task':56,486,497,500 'task/stage':480 'text':537 'timeout':764 'tool':251 'topic-agent-skills' 'topic-benchmark' 'topic-cli' 'topic-diagnostics' 'topic-gluten' 'topic-performance' 'topic-spark' 'topic-spark-history-server' 'topic-tpc-ds' 'topic-velox' 'treat':661 'troubleshoot':701 'ui':114 'unavail':238 'unless':156,655 'url':639 'use':15,51,95,167,416,724,762 'user':19,158,181,213,623,636,657 'valid':728 'var':760 'view':339,532,538,547,566 'visual':561 'wait':774 'want':20,160,625 'web':113 'whenev':17 'workflow':144 'workload':611 'wrap':116 'write':576 
'wrong':753","prices":[{"id":"a15465c9-fefd-4191-b348-b03a4e0edc26","listingId":"60d9ea8e-3dc2-4bfd-999e-3e6cc3dd6b52","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"yaooqinn","category":"spark-history-cli","install_from":"skills.sh"},"createdAt":"2026-04-18T23:05:37.642Z"}],"sources":[{"listingId":"60d9ea8e-3dc2-4bfd-999e-3e6cc3dd6b52","source":"github","sourceId":"yaooqinn/spark-history-cli/spark-history-cli","sourceUrl":"https://github.com/yaooqinn/spark-history-cli/tree/main/skills/spark-history-cli","isPrimary":false,"firstSeenAt":"2026-04-18T23:05:37.642Z","lastSeenAt":"2026-04-23T13:02:30.930Z"}],"details":{"listingId":"60d9ea8e-3dc2-4bfd-999e-3e6cc3dd6b52","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"yaooqinn","slug":"spark-history-cli","github":{"repo":"yaooqinn/spark-history-cli","stars":22,"topics":["agent-skills","benchmark","cli","diagnostics","gluten","performance","spark","spark-history-server","tpc-ds","velox"],"license":"apache-2.0","html_url":"https://github.com/yaooqinn/spark-history-cli","pushed_at":"2026-03-22T17:59:03Z","description":"CLI tool for querying Apache Spark History Server REST API","skill_md_sha":"3ca07046e303dee744c21345754e611ff8067bbd","skill_md_path":"skills/spark-history-cli/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/yaooqinn/spark-history-cli/tree/main/skills/spark-history-cli"},"layout":"multi","source":"github","category":"spark-history-cli","frontmatter":{"name":"spark-history-cli","description":"Query a running Apache Spark History Server from Copilot CLI. 
Use this whenever the user wants to inspect SHS applications, jobs, stages, executors, SQL executions, environment details, or event logs, especially when they mention Spark History Server, SHS, event log history, benchmark runs, or application IDs.","compatibility":"Requires Python 3.10+, the spark-history-cli package, and network access to a running Spark History Server."},"skills_sh_url":"https://skills.sh/yaooqinn/spark-history-cli/spark-history-cli"},"updatedAt":"2026-04-23T13:02:30.930Z"}}