spark-history-cli
Query a running Apache Spark History Server from Copilot CLI. Use this whenever the user wants to inspect SHS applications, jobs, stages, executors, SQL executions, environment details, or event logs, especially when they mention Spark History Server, SHS, event log history, or benchmark history.
Price
free
Protocol
skill
Verified
no
What it does
Use this skill when the task is about exploring or debugging data exposed by a running Apache Spark History Server.
Installation
pip install spark-history-cli
Or if not on PATH after install:
python -m spark_history_cli --json apps
Why use this skill
- It gives you a purpose-built CLI instead of scraping the Spark History Server web UI.
- It wraps the REST API cleanly and already handles attempt-ID resolution for multi-attempt apps.
- It supports `--json`, which makes downstream reasoning and comparisons much easier.
Workflow
- Prefer the CLI over raw REST calls.
- Prefer `--json` unless the user explicitly wants a human-formatted table.
- Use `--server <url>` or `SPARK_HISTORY_SERVER` to point at the right SHS. If the user does not specify one, assume `http://localhost:18080`.
- Start broad, then drill down:
- list applications
- choose the relevant app
- inspect jobs, stages, executors, SQL executions, environment, or logs
- If the user says "latest app", "recent run", or similar, list apps first and choose the most relevant recent application before continuing.
- If the CLI is unavailable, install it with `python -m pip install spark-history-cli` if tool permissions allow it.
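The broad-to-narrow workflow above can be sketched in Python. This is a minimal sketch, not part of the CLI itself: `list_apps` shells out to the documented `apps` command, and `pick_latest` assumes the JSON mirrors the Spark History Server REST shape (`attempts`, `endTimeEpoch`) — the CLI's actual JSON fields may differ.

```python
import json
import subprocess

def list_apps(server="http://localhost:18080"):
    """Run `spark-history-cli --json apps` and parse its output.

    Requires the CLI on PATH and a running SHS at `server`.
    """
    out = subprocess.run(
        ["spark-history-cli", "--json", "--server", server, "apps"],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)

def pick_latest(apps):
    """Pick the most recently finished application from an `apps` list.

    Field names (`attempts`, `endTimeEpoch`) are assumptions based on the
    SHS REST API; adjust to whatever the CLI actually emits.
    """
    def end_epoch(app):
        return max(a.get("endTimeEpoch", 0) for a in (app.get("attempts") or [{}]))
    return max(apps, key=end_epoch)
```

Typical use for a "latest app" request: `latest = pick_latest(list_apps())`, then drill into `latest["id"]` with `jobs`, `stages`, and so on.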
Command patterns
spark-history-cli --json --server http://localhost:18080 apps
spark-history-cli --json --server http://localhost:18080 app <app-id>
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> jobs
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> stages
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> executors --all
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql-plan <exec-id> --view final
spark-history-cli --server http://localhost:18080 --app-id <app-id> sql-plan <exec-id> --dot -o plan.dot
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> sql-jobs <exec-id>
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> summary
spark-history-cli --json --server http://localhost:18080 --app-id <app-id> env
spark-history-cli --server http://localhost:18080 --app-id <app-id> logs output.zip
If spark-history-cli is not on PATH, use:
python -m spark_history_cli --json apps
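The PATH fallback above can be made mechanical with a small helper. A sketch (the `which` parameter exists only to make the helper easy to exercise; it is not a CLI feature):

```python
import shutil
import sys

def cli_argv(which=shutil.which):
    """Return the argv prefix for invoking the CLI.

    Prefers the `spark-history-cli` entry point; falls back to
    `python -m spark_history_cli` when it is not on PATH.
    """
    if which("spark-history-cli"):
        return ["spark-history-cli"]
    return [sys.executable, "-m", "spark_history_cli"]
```

Example: `cli_argv() + ["--json", "apps"]` yields a complete command line either way.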
What to reach for
- `apps` for recent runs, durations, status, and picking candidates
- `app <id>` for high-level details about one run
- `attempts` for multi-attempt apps (list or show specific attempt details)
- `jobs`, `job <id>` for job-level failures or progress
- `job-stages <id>` for stages belonging to a job
- `stages`, `stage <id>` for task/stage bottlenecks
- `stage-summary <id>` for task metric quantiles (p5/p25/p50/p75/p95): duration, GC, memory, shuffle, I/O
- `stage-tasks <id>` for individual task details, sorted by runtime to find stragglers
- `executors --all` for executor churn or skew investigations
- `sql` for SQL execution history and plan graph data
- `sql-plan <id>` for SQL plan extraction:
  - `--view full` (default): full plan text
  - `--view initial`: only the Initial Plan (pre-AQE)
  - `--view final`: only the Final Plan (post-AQE)
  - `--dot`: Graphviz DOT output for visualizing the plan DAG
  - `--json` + `--view`: structured JSON with `isAdaptive`, `sectionCount`, `plan`, and `sections`
  - `-o <file>`: write output to file instead of stdout
- `sql-jobs <id>` for jobs associated with a SQL execution (fetches all linked jobs by ID)
- `summary` for a concise application overview: app info, resource config (driver/executor/shuffle), and workload stats (jobs/stages/tasks/SQL)
- `env` for Spark config/runtime context
- `logs` only when the user explicitly wants the event log archive saved locally
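A common use of `stage-summary` quantiles is straggler detection: if p95 task duration dwarfs the median, the stage is likely skewed. A rough heuristic, assuming a `{"p50": ..., "p95": ...}` mapping extracted from the quantile output (a hypothetical shape, not the CLI's documented schema):

```python
def looks_skewed(quantiles, ratio=3.0):
    """Flag a stage as skewed when p95 task duration >= ratio * median.

    `quantiles` is assumed to map quantile labels to durations in a
    single unit (e.g. ms); missing values are treated as zero.
    """
    p50 = quantiles.get("p50") or 0
    p95 = quantiles.get("p95") or 0
    if p50 <= 0:
        return False  # no meaningful median to compare against
    return p95 / p50 >= ratio
```

The 3x default is an arbitrary starting point; tune it to the workload before trusting the flag.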
Practical guidance
- Preserve the user's server URL if they gave one explicitly.
- Summarize findings after retrieving JSON; do not dump raw JSON unless the user asked for it.
- Treat event logs and benchmark history as potentially sensitive. Download them only when necessary and keep them local.
- This CLI needs a running Spark History Server. It does not replace SHS and it does not parse raw event logs directly.
Troubleshooting
| Issue | Solution |
|---|---|
| Connection refused | SHS not running; start with `$SPARK_HOME/sbin/start-history-server.sh` |
| 404 Not Found on app | App ID may include attempt suffix; use `apps` to list valid IDs |
| No apps listed | Check `spark.history.fs.logDirectory` points to the right event log path |
| ModuleNotFoundError | CLI not installed; run `pip install spark-history-cli` |
| Wrong server | Set `SPARK_HISTORY_SERVER` env var or use `--server <url>` |
| Timeout on large apps | SHS may be parsing event logs; wait and retry, or check SHS logs |
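For an agent driving the CLI, the table above can be encoded as an error-to-hint mapper. A convenience sketch; the exception types are the usual Python ones for each failure mode, not anything raised by `spark-history-cli` itself:

```python
def troubleshoot(exc):
    """Map a failure to a next step, mirroring the troubleshooting table."""
    if isinstance(exc, ConnectionRefusedError):
        return "SHS not running: start it with $SPARK_HOME/sbin/start-history-server.sh"
    if isinstance(exc, ModuleNotFoundError):
        return "CLI not installed: run pip install spark-history-cli"
    if isinstance(exc, TimeoutError):
        return "SHS may still be parsing event logs: wait and retry, or check SHS logs"
    return "Check SPARK_HISTORY_SERVER / --server <url>, and list valid IDs with `apps`"
```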
Capabilities
source: yaooqinn · skill: spark-history-cli · topics: agent-skills, benchmark, cli, diagnostics, gluten, performance, spark, spark-history-server, tpc-ds, velox
Install
npx skills add yaooqinn/spark-history-cli
Transport
skills-sh
Protocol
skill
Quality
0.46 / 1.00
deterministic score 0.46 from registry signals: indexed on github topic:agent-skills · 22 github stars · SKILL.md body (4,905 chars)
Provenance
Indexed from github
Enriched 2026-04-23 07:00:58Z · deterministic:skill-github:v1 · v1
First seen 2026-04-18
Last seen 2026-04-23