Skillquality 0.46

testing-discipline

Use when writing or reviewing tests, designing test data, naming a test, choosing what to assert, or writing test helpers/mocks/fixtures

Price
free
Protocol
skill
Verified
no

What it does

Testing Discipline

Overview

A good test pins the contract, uses concrete examples, names the scenario in domain language, and uses test data safe to show a customer. Run the checklist in order when writing or reviewing a test.

When to invoke

Invoke when you're about to:

  • Write a new test (unit, integration, end-to-end, characterization)
  • Name or rename a test method, test class, or test file
  • Design a fixture, mock, stub, or shared test helper
  • Choose test data — names, IDs, sample strings, sample numbers
  • Decide what to assert in a test (exact value? structural property? both?)
  • Schedule a long-running suite (soak, perf, cross-platform matrix)
  • Hand off requirements between programmers and testers, or work on acceptance tests
  • Review existing tests for quality, coverage gaps, or test smells
  • Evaluate whether a test suite adequately covers a contract

Non-triggers — do NOT invoke for

  • Running the existing test suite (npm test, pytest)
  • Fixing a one-line broken assertion where the intent is unchanged
  • Adding a docstring or comment to an existing test
  • Reordering imports in a test file
  • Updating a snapshot file when the underlying intentional change is already understood

If you're not sure whether a change counts as "writing a test," invoke anyway — the checklist is short and skipping it produces tests that pass forever while protecting nothing.

Test quality checklist

Run every step in order when writing a new test. When reviewing, use the same checks to evaluate quality.

  1. Tests are part of the change, not optional polish. Software's only practical pre-deployment validation is execution under realistic conditions. A change without tests is an incomplete change. (Ford, 97/83.)
  2. Assert the required behavior, not an incidental of the current implementation. Ask: what does the contract actually promise? For a 3-way comparator the contract is "negative if less, positive if greater, zero if equal" — not "exactly -1 or +1." Asserting ±1 nails an incidental and will go red the day someone returns -2. Tests that mirror code structure end up asserting that the code does what the code does. (Henney, 97/80.)
  3. Be precise and accurate. Use concrete examples. "Result is sorted and same length" passes for [3,3,3,3,3,3] against an input of [3,1,4,1,5,9]. The full postcondition is sorted and a permutation of the input — but expressing that as a generic checker is often more code than the function under test. Prefer a concrete example pair: input [3,1,4,1,5,9], expected [1,1,3,4,5,9]. "Adding to an empty collection" is not "now non-empty" — it's "now contains exactly one item, and that item is X." (Henney, 97/81.)
  4. Write the test for the next person who has to read it. A good test is documentation. Make three parts visible in this order: the context/preconditions, the call into the system, the expected result. Hide trivia behind named helpers (Extract Method) so the reader sees the scenario, not the scaffolding. Give the test a name describing the scenario and the entry point — Stack_pop_on_empty_throws reads better than test17. Then test the test: introduce a deliberate bug into the code under test on a private branch and verify the failure message tells you what went wrong. (Meszaros, 97/95.)
  5. Choose test data that is safe in front of a customer. Placeholder names, fake company names, sample log strings, mock error dialogs, and seeded database rows have a habit of surfacing in screenshots, demos, leaked source, and production logs. Do not use real people's names you mean to mock, do not use band names or song titles as a private joke, do not write "don't click that again, you moron" as a placeholder dialog, do not use four-letter words as fake stock tickers. Use boring, obviously-fake, professional data. (Begbie, 97/25.)
  6. Accept acceptance tests as requirements input. When acceptance tests exist for a behavior (from QA, from the customer, from a spec), use them as input to the work — they surface ambiguities cheaper than defect tickets. (Hufnagel, 97/60; Gregory, 97/92.)

Red Flags

These thoughts mean STOP — restart the checklist:

ThoughtReality
"I'll just assert the function returns exactly -1."The contract says "negative" — ±1 is an implementation incidental. The test will go red on a valid refactor and tell you nothing about the requirement. (97/80)
"Same length, all elements in range — that's enough for a sort test."[3,3,3,3,3,3] satisfies that and is wrong. Use a concrete input/output pair so the only correct answer is the one in the assertion. (97/81)
"I'll seed the database with band members and song titles — funnier than User1."Cute test data ends up in customer screenshots, demos, and leaked source. Use boring, obviously-fake, professional data. (97/25)
"No time to write the test — we'll add it later.""Later" rarely arrives, and shipping without verification is professionally irresponsible. The test is part of the change. (97/83)
"I know what this test means — I just wrote it."You will not, in six months. Tests are read more than they are written. Name the scenario, structure as context/act/assert, hide scaffolding. (97/95)
"test17 is a fine name — the body explains it."Test names are scanned to verify coverage and to read failure reports. Encode the scenario and the entry point in the name. (97/95)
"The test setup is fifty lines, then a one-line assert — but it works."Test pain is design pressure. If the setup dwarfs the body, reshape the production code. Do not mock harder. (GOOS/ListenToTestPain)
"I'll assert that repo.save was called exactly three times."Over-specifying mock interactions makes the test red on innocent refactors. Assert on the observable contract, not the call shape. (xUnit/FragileTest)
"The test reads from /tmp/fixtures/users.json set up in conftest.py."Mystery Guest. The test cannot be read in isolation. Build the fixture in the test or in a function named for what it returns. (xUnit/MysteryGuest)
"I'll branch on the return value and assert different things in each branch."One test, two scenarios fighting for one name. Split into two tests, or use named parameterized cases. (xUnit/ConditionalTestLogic)

What "done" looks like

A single well-written test is done when all of the following are true:

  • The assertion targets the contract, not an incidental of the current implementation.
  • The expected result is concrete enough that a reader can check it by eye in under thirty seconds.
  • The test name describes the scenario and the entry point in domain language.
  • Context, action, and expected result are visible as three readable sections; scaffolding is behind named helpers.
  • Test data (names, IDs, strings, log lines, error messages) is safe to appear in a customer screenshot.
  • You verified the test fails for the right reason — by introducing a deliberate bug on a private branch and reading the failure message.

If any box is unchecked, the test is not done. Either finish, or delete it and start over.

Principles in this skill

#PrincipleAuthor
97/25Don't Be Cute with Your Test DataRod Begbie
97/60News of the Weird: Testers Are Your FriendsBurk Hufnagel
97/80Test for Required Behavior, Not Incidental BehaviorKevlin Henney
97/81Test Precisely and ConcretelyKevlin Henney
97/82Test While You Sleep (and over Weekends)Rajith Attapattu
97/83Testing Is the Engineering Rigor of Software DevelopmentNeal Ford
97/92When Programmers and Testers CollaborateJanet Gregory
97/95Write Tests for PeopleGerard Meszaros
GOOS/ListenToTestPainListen to Test PainSteve Freeman & Nat Pryce
xUnit/ObscureTestObscure TestGerard Meszaros
xUnit/FragileTestFragile TestGerard Meszaros
xUnit/MysteryGuestMystery GuestGerard Meszaros
xUnit/ConditionalTestLogicConditional Test LogicGerard Meszaros

See principles.md for the long-form distillations, citations, and source links.

Capabilities

skillsource-oribarilanskill-testing-disciplinetopic-agent-skillstopic-ai-agentstopic-best-practicestopic-claude-codetopic-claude-code-plugintopic-claude-code-skillstopic-coding-agentstopic-copilot-clitopic-copilot-cli-plugintopic-opencodetopic-opencode-plugintopic-programming-principles

Install

Installnpx skills add oribarilan/97
Transportskills-sh
Protocolskill

Quality

0.46/ 1.00

deterministic score 0.46 from registry signals: · indexed on github topic:agent-skills · 21 github stars · SKILL.md body (8,230 chars)

Provenance

Indexed fromgithub
Enriched2026-05-18 19:05:33Z · deterministic:skill-github:v1 · v1
First seen2026-05-08
Last seen2026-05-18

Agent access