
usability-testing

Plan, run, and synthesize usability tests: test plan, tasks, script, findings, recommendations.

Price: free
Protocol: skill
Verified: no

What it does

Usability Testing

Scope

Covers

  • Designing task-based usability studies tied to a specific product decision
  • Testing live flows, prototypes, and “faked” implementations (fake door, Wizard of Oz)
  • Running moderated sessions (remote or in-person) and capturing high-quality evidence
  • Turning findings into a prioritized fix list (including high-ROI microcopy/CTA improvements)

When to use

  • “Create a usability test plan and script for <flow>.”
  • “We need to test a prototype with 5–8 users next week.”
  • “Validate a value proposition before building (fake door / Wizard of Oz).”
  • “Help me synthesize usability findings into a prioritized backlog.”

When NOT to use

  • You need statistically reliable estimates or causal impact (use analytics/experimentation)
  • You need open-ended discovery (“what problems do users have?”) without a specific flow to evaluate (use conducting-user-interviews)
  • You need a design critique or heuristic review without live user sessions (use running-design-reviews)
  • You need to write specs or design docs for a feature, not test an existing flow (use writing-specs-designs)
  • You need to apply behavioral/persuasion design patterns to a flow (use behavioral-product-design); this skill evaluates usability rather than designing behavioral nudges
  • You’re working with high-risk populations or sensitive topics (medical, legal, minors) without appropriate approvals/training
  • You don’t have a concrete scenario/flow to evaluate (clarify the decision first)

Inputs

Minimum required

  • Product + target user segment (who, context of use)
  • The decision this test should inform (what will change) + timeline
  • What you’re testing (flow/feature) + prototype/build link (or “recommend stimulus”)
  • Platform + environment (web/mobile/desktop; remote/in-person)
  • Constraints: session type, number of participants, incentives, recording policy, privacy constraints

Missing-info strategy

  • Ask up to 5 questions from references/INTAKE.md.
  • If still unknown, proceed with explicit assumptions and list Open questions that would change the plan.

Outputs (deliverables)

Produce a Usability Test Pack in Markdown (in-chat; or as files if requested):

  1. Context snapshot (decision, users, what’s being tested, constraints)
  2. Test plan (method, prototype strategy, hypotheses/risks, success criteria)
  3. Participant plan (criteria, recruiting channels, schedule + backups)
  4. Moderator guide + task script (neutral tasks, probes, wrap-up)
  5. Note-taking template + issue log (severity/impact, evidence; see the sketch below)
  6. Synthesis readout (findings, prioritized issues, recommendations, quick wins)
  7. Risks / Open questions / Next steps (always included)

Templates: references/TEMPLATES.md
Expanded heuristics: references/WORKFLOW.md
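
The note-taking template and issue log (item 5 above) ship as Markdown templates in references/TEMPLATES.md. As a rough illustration of the fields each logged issue should carry (severity, frequency, and at least one piece of concrete evidence), here is a minimal Python sketch; the field names and severity scale are assumptions for illustration, not the template's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum

class Severity(Enum):
    BLOCKER = 4   # participant cannot complete the task
    MAJOR = 3     # completed only with heavy struggle or a workaround
    MINOR = 2     # noticeable friction, task still completed
    COSMETIC = 1  # polish issue, no measurable impact on the task

@dataclass
class Issue:
    """One row in the issue log: what happened, how bad it was, and the evidence."""
    summary: str                      # short description of the problem
    task: str                         # which task it occurred in
    severity: Severity
    participants_affected: list[str]  # e.g. ["P2", "P5"]; length gives frequency
    evidence: list[str] = field(default_factory=list)  # verbatims, timestamps, screenshot refs

    @property
    def frequency(self) -> int:
        return len(self.participants_affected)

# Example entry captured during a session (made-up data):
example = Issue(
    summary="'Continue' button sits below the fold and goes unnoticed",
    task="Task 3: finish account setup",
    severity=Severity.MAJOR,
    participants_affected=["P1", "P3", "P4"],
    evidence=['P3 at 04:12: "Is that it? I thought I was done."'],
)
```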

Workflow (8 steps)

1) Frame the decision and the “why now”

  • Inputs: User context; references/INTAKE.md.
  • Actions: Define the decision, primary unknowns, and the minimum you need to learn to make the call.
  • Outputs: Context snapshot + research questions/hypotheses.
  • Checks: You can answer: “What will we do differently after this test?”

2) Choose the right stimulus (real vs prototype vs faked)

  • Inputs: What’s being tested; constraints.
  • Actions: Select the cheapest valid setup: live product, clickable prototype, fake door, Wizard of Oz, or concierge flow.
  • Outputs: Prototype strategy + what will be real vs simulated.
  • Checks: The setup tests the core value/behavior (not pixel perfection).

3) Define tasks and success criteria (keep it neutral)

  • Inputs: User goals + scenarios.
  • Actions: Write 5–8 realistic tasks (each with a starting state), success criteria, and key observables (hesitation, errors, workarounds); one such task is sketched below.
  • Outputs: Task list (draft) + observation plan.
  • Checks: Tasks don’t reveal UI labels (“Click the X button”); they reflect real intent.
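
A minimal sketch of what one neutral task definition could capture, using Python purely for illustration; the structure and field names are assumptions, not part of the skill's templates.

```python
from dataclasses import dataclass

@dataclass
class UsabilityTask:
    """One scripted task: intent-based prompt, starting state, and what to watch for."""
    prompt: str             # read aloud; describes the user's intent, never UI labels
    starting_state: str     # where the participant begins (URL, screen, prototype frame)
    success_criteria: str   # observable end state that counts as completion
    observables: list[str]  # behaviors for the note-taker to watch

task = UsabilityTask(
    prompt="You no longer want email updates from this app. Go ahead and make that happen.",
    starting_state="Logged in, on the home screen",
    success_criteria="Email notifications switched off and confirmation visible",
    observables=["hesitation before first click", "wrong-path detours", "workarounds", "giving up"],
)

# Anti-pattern (task-label leakage): "Click the Settings gear icon, then Notifications."
```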

4) Pick participants + recruiting plan (include buffers)

  • Inputs: Target segment, access to users.
  • Actions: Set inclusion/exclusion criteria; choose channels; build a schedule with backups and slack for no-shows and busy participants (see the buffer sketch below).
  • Outputs: Participant plan + recruiting copy/screener (as needed).
  • Checks: Participants match the scenario (behavior/context), not just demographics.
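
As a back-of-the-envelope illustration of the buffers mentioned above, over-recruiting for no-shows is simple arithmetic; the 20% no-show rate below is an assumed figure for the example, not a recommendation from this skill.

```python
import math

def sessions_to_schedule(target_completed: int, no_show_rate: float, backups: int = 1) -> int:
    """Book enough sessions that the target still holds after expected no-shows."""
    booked = math.ceil(target_completed / (1 - no_show_rate))
    return booked + backups  # keep at least one standby slot

# Aiming for 6 completed sessions with ~20% no-shows:
print(sessions_to_schedule(6, 0.2))  # -> 9 (8 booked + 1 backup)
```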

5) Build the moderator guide + instrumentation

  • Inputs: Task list + prototype.
  • Actions: Create the script (intro/consent, warm-up, tasks, probes, wrap-up). Assign note-taker roles; decide what to record.
  • Outputs: Moderator guide + notes template + issue log.
  • Checks: The guide avoids leading questions and includes “what would you do next?” probes.

6) Run sessions and capture evidence (optional “reality checks”)

  • Inputs: Guide, logistics, participants.
  • Actions: Run sessions; capture verbatims, errors, rough time-on-task, and moments of confusion. Optionally observe comparable flows “in the wild.”
  • Outputs: Completed notes per session + populated issue log.
  • Checks: Every issue has at least one concrete example (quote/screenshot/time/step) attached.

7) Synthesize into prioritized fixes (micro wins count)

  • Inputs: Notes + issue log.
  • Actions: Cluster issues; label severity and frequency; connect to funnel/business impact; propose fixes (including microcopy/CTA tweaks). See the prioritization sketch below.
  • Outputs: Synthesis readout + prioritized recommendations/backlog.
  • Checks: Each recommendation ties to evidence and an expected impact (directional).
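
One straightforward way to order the issue list in this step is to sort by severity first and frequency second, so a flow-blocking error seen by two participants outranks a cosmetic issue seen by all of them. A minimal sketch, with made-up example entries:

```python
# (summary, severity 1-4, participants affected)
issue_log = [
    ("'Continue' button below the fold goes unnoticed", 3, 3),
    ("Success toast wording unclear ('Saved' vs 'Submitted')", 2, 4),
    ("Password rules appear only after a failed attempt", 4, 2),
    ("Inconsistent capitalization on settings labels", 1, 5),
]

def prioritize(issues):
    """Severity first, then frequency, highest priority first."""
    return sorted(issues, key=lambda i: (i[1], i[2]), reverse=True)

for rank, (summary, severity, freq) in enumerate(prioritize(issue_log), start=1):
    print(f"{rank}. sev {severity} x {freq} participants: {summary}")
```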

8) Share, decide, and run the quality gate

  • Inputs: Draft pack.
  • Actions: Produce a shareable readout and propose next steps (design iteration, follow-up test, experiment). Run the checks in references/CHECKLISTS.md and score the pack against references/RUBRIC.md.
  • Outputs: Final Usability Test Pack + Risks/Open questions/Next steps.
  • Checks: A stakeholder can make a “ship / fix / retest” decision asynchronously.

Quality gate (required)

Before sharing, run the checks in references/CHECKLISTS.md and score the pack against references/RUBRIC.md (see step 8).

Anti-patterns (common failure modes)

  1. Task-label leakage — Writing tasks like “Click the Settings gear icon” instead of “Change your notification preferences.” Tasks should reflect user intent, not reveal UI labels or locations.
  2. Happy-path-only testing — Only testing the golden path and missing error states, edge cases, and recovery flows. Include at least one task that tests what happens when things go wrong.
  3. Moderator bias / leading — Helping participants when they struggle (“Try clicking there”) instead of letting them work through confusion. The struggle IS the data; document it, don’t fix it.
  4. Over-indexing on opinions — Asking “Did you like it?” after each task instead of observing behavior. Post-task ratings are supplementary; observed friction, errors, and workarounds are the primary signal.
  5. Severity-blind issue list — Listing all issues as equal without severity/frequency classification. A cosmetic label issue and a flow-blocking error require different urgency; classify every finding.

Examples

Example 1 (Prototype test): “Create a usability test plan + moderator guide to evaluate our new onboarding flow (web) with 6 first-time users next week.” Expected: full Usability Test Pack with neutral tasks, recruiting criteria, session logistics, and a synthesis structure.

Example 2 (Wizard of Oz): “We want to test an ‘AI auto-triage’ feature before building it. Design a Wizard of Oz usability test plan and script for 5 sessions.” Expected: stimulus plan defining what’s simulated, tasks focused on value, and an issue log + readout.

Boundary example (redirect to conducting-user-interviews): “We don’t have a prototype yet, but we want to understand what problems users face during onboarding.” Response: redirect to conducting-user-interviews for open-ended discovery; return here once you have a concrete flow or prototype to evaluate.

Boundary example (redirect to running-design-reviews): “Review our new checkout designs for usability issues without running user sessions.” Response: redirect to running-design-reviews for expert heuristic evaluation; this skill requires live user sessions with task-based observation.

Boundary example (causality): “Run a usability test to prove the redesign will increase retention by 10%.” Response: explain limits of small-n usability; recommend pairing with instrumentation/experimentation for causality and use usability to diagnose friction.

Capabilities

skill · source-liqiongyu · skill-usability-testing · topic-agent-skills · topic-ai-agents · topic-automation · topic-claude · topic-codex · topic-prompt-engineering · topic-refoundai · topic-skillpack


Quality

0.47 / 1.00

Deterministic score 0.47 from registry signals: indexed on GitHub topic:agent-skills · 49 GitHub stars · SKILL.md body (8,840 chars)

Provenance

Indexed from: github
Enriched: 2026-04-22 00:56:26Z · deterministic:skill-github:v1 · v1
First seen: 2026-04-18
Last seen: 2026-04-22
