{"id":"0aa97903-dbe3-45ce-8756-b3e418bc4ad4","shortId":"FAbvKM","kind":"skill","title":"test-driven-development","tagline":"Red-green-refactor development methodology requiring verified test coverage. Use for feature implementation, bugfixes, refactoring, or any behavior changes where tests must prove correctness.","description":"# Test-Driven Development\n\nWrite test first. Watch it fail. Write minimal code to pass. Refactor.\n\n**Core principle:** If you didn't watch the test fail, you don't know if it tests the right thing.\n\n## The Iron Law\n\n```\nNO BEHAVIOR-CHANGING PRODUCTION CODE WITHOUT A FAILING TEST FIRST\n```\n\nWrote code before test? Delete it completely. Implement fresh from tests.\n\n**Refactoring is exempt:** The refactor step changes structure, not behavior. Tests stay green throughout. No new failing test required.\n\n## Red-Green-Refactor Cycle\n\n```\nRED ──► Verify Fail ──► GREEN ──► Verify Pass ──► REFACTOR ──► Verify Pass ──► Next RED\n         │                         │                            │\n         ▼                         ▼                            ▼\n      Wrong failure?           Still failing?              Broke tests?\n      Fix test, retry          Fix code, retry             Fix, retry\n```\n\n### RED - Write Failing Test\n\nWrite one minimal test for one behavior.\n\n**Good example:**\n```typescript\ntest('retries failed operations 3 times', async () => {\n  let attempts = 0;\n  const operation = async () => {\n    attempts++;\n    if (attempts < 3) throw new Error('fail');\n    return 'success';\n  };\n\n  const result = await retryOperation(operation);\n\n  expect(result).toBe('success');\n  expect(attempts).toBe(3);\n});\n```\n*Clear name, tests real behavior, asserts observable outcome*\n\n**Bad example:**\n```typescript\ntest('retry works', async () => {\n  const mock = jest.fn()\n    .mockRejectedValueOnce(new Error())\n    .mockRejectedValueOnce(new Error())\n    .mockResolvedValueOnce('success');\n  await retryOperation(mock);\n  expect(mock).toHaveBeenCalledTimes(3);\n});\n```\n*Vague name, asserts only call count without verifying outcome, tests mock mechanics not behavior*\n\n**Requirements:** One behavior. Clear name. Real code (mocks only if unavoidable).\n\n### Verify RED - Watch It Fail\n\n**MANDATORY. Never skip.**\n\n```bash\nnpm test path/to/test.test.ts\n```\n\nTest must go red for the right reason. Acceptable RED states:\n- Assertion failure (expected behavior missing)\n- Compile/type error (function doesn't exist yet)\n\nNot acceptable: Runtime setup errors, import failures, environment issues.\n\nTest passes immediately? You're testing existing behavior—fix test.\nTest errors for wrong reason? Fix error, re-run until it fails correctly.\n\n### GREEN - Minimal Code\n\nWrite simplest code to pass the test.\n\n**Good example:**\n```typescript\nasync function retryOperation<T>(fn: () => Promise<T>): Promise<T> {\n  for (let i = 0; i < 3; i++) {\n    try {\n      return await fn();\n    } catch (e) {\n      if (i === 2) throw e;\n    }\n  }\n  throw new Error('unreachable');\n}\n```\n*Just enough to pass*\n\n**Bad example:**\n```typescript\nasync function retryOperation<T>(\n  fn: () => Promise<T>,\n  options?: { maxRetries?: number; backoff?: 'linear' | 'exponential'; }\n): Promise<T> { /* YAGNI */ }\n```\n*Over-engineered beyond test requirements*\n\nWrite only what the test demands. No extra features, no \"improvements.\"\n\n### Verify GREEN - Watch It Pass\n\n**MANDATORY.**\n\n```bash\nnpm test path/to/test.test.ts\n```\n\nConfirm: Test passes. All other tests still pass. Output pristine (no errors, warnings).\n\nTest fails? Fix code, not test.\nOther tests fail? Fix now before continuing.\n\n### REFACTOR - Clean Up\n\nAfter green only: Remove duplication. Improve names. Extract helpers.\n\nKeep tests green throughout. Add no new behavior.\n\n### Repeat\n\nNext failing test for next behavior.\n\n## Good Tests\n\n**Minimal:** One thing per test. \"and\" in name? Split it. ❌ `test('validates email and domain and whitespace')`\n\n**Clear:** Name describes behavior. ❌ `test('test1')`\n\n**Shows intent:** Demonstrates desired API usage, not implementation details.\n\n## Example: Bug Fix\n\n**Bug:** Empty email accepted\n\n**RED:**\n```typescript\ntest('rejects empty email', async () => {\n  const result = await submitForm({ email: '' });\n  expect(result.error).toBe('Email required');\n});\n```\n\n**Verify RED:**\n```bash\n$ npm test\nFAIL: expected 'Email required', got undefined\n```\n\n**GREEN:**\n```typescript\nfunction submitForm(data: FormData) {\n  if (!data.email?.trim()) {\n    return { error: 'Email required' };\n  }\n  // ...\n}\n```\n\n**Verify GREEN:**\n```bash\n$ npm test\nPASS\n```\n\n**REFACTOR:** Extract validation helper if pattern repeats.\n\n## Red Flags - STOP and Start Over\n\nAny of these means delete code and restart with TDD:\n\n- Code written before test\n- Test passes immediately (testing existing behavior)\n- Can't explain why test failed\n- Rationalizing \"just this once\" or \"this is different\"\n- Keeping code \"as reference\" while writing tests\n- Claiming \"tests after achieve the same purpose\"\n\n## When Stuck\n\n| Problem | Solution |\n|---------|----------|\n| Don't know how to test | Write the API you wish existed. Write assertion first. |\n| Test too complicated | Design too complicated. Simplify the interface. |\n| Must mock everything | Code too coupled. Introduce dependency injection. |\n| Test setup huge | Extract helpers. Still complex? Simplify design. |\n\n## Legacy Code (No Existing Tests)\n\nThe Iron Law (\"delete and restart\") applies to **new code you wrote without tests**. For inherited code with no tests, use characterization tests:\n\n1. Write tests that capture current behavior (even if \"wrong\")\n2. Run tests, observe actual outputs\n3. Update assertions to match reality (these are \"golden masters\")\n4. Now you have a safety net for refactoring\n5. Apply TDD for new behavior changes\n\nCharacterization tests lock down existing behavior so you can refactor safely. They're the on-ramp, not a permanent state.\n\n## Flakiness Rules\n\nTests must be deterministic. Ban these in unit tests:\n\n- **Real sleeps / delays** → Use fake timers (`vi.useFakeTimers()`, `jest.useFakeTimers()`)\n- **Wall clock time** → Inject clock, assert against injected time\n- **Math.random()** → Seed or inject RNG\n- **Network calls** → Mock at boundary or use MSW\n- **Filesystem race conditions** → Use temp dirs with unique names\n\nFlaky test? Fix or delete. Flaky tests erode trust in the entire suite.\n\n## Debugging Integration\n\nBug found? Write failing test reproducing it first. Then follow TDD cycle. Test proves fix and prevents regression.\n\n## Planning: Test List\n\nBefore diving into the cycle, spend 2 minutes listing the next 3-10 tests you expect to write. This prevents local-optimum design where early tests paint you into a corner.\n\nExample test list for a retry function:\n- retries N times on failure\n- returns result on success\n- throws after max retries exhausted  \n- calls onRetry callback between attempts\n- respects backoff delay\n\nWork through the list in order. Add/remove tests as you learn.\n\n## Testing Anti-Patterns\n\nWhen writing tests involving mocks, dependencies, or test utilities: See [references/testing-anti-patterns.md](references/testing-anti-patterns.md) for common pitfalls including testing mock behavior and adding test-only methods to production classes.\n\n## Philosophy and Rationalizations\n\nFor detailed rebuttals to common objections (\"I'll test after\", \"deleting work is wasteful\", \"TDD is dogmatic\"): See [references/tdd-philosophy.md](references/tdd-philosophy.md)\n\n## Final Rule\n\n```\nProduction code exists → test existed first and failed first\nOtherwise → not TDD\n```","tags":["test","driven","development","agent","skills","library","codingcossack","agent-framework","agent-skills","agent-system","agent-workflow","agentic-workflow"],"capabilities":["skill","source-codingcossack","skill-test-driven-development","topic-agent-framework","topic-agent-skills","topic-agent-system","topic-agent-workflow","topic-agentic-workflow","topic-ai-agents","topic-anthropic","topic-claude","topic-claude-code","topic-claude-skills","topic-claude-skills-hub","topic-claude-skills-libary"],"categories":["agent-skills-library"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/CodingCossack/agent-skills-library/test-driven-development","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add CodingCossack/agent-skills-library","source_repo":"https://github.com/CodingCossack/agent-skills-library","install_from":"skills.sh"}},"qualityScore":"0.458","qualityRationale":"deterministic score 0.46 from registry signals: · indexed on github topic:agent-skills · 17 github stars · SKILL.md body (7,263 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-04-23T07:01:19.748Z","embedding":null,"createdAt":"2026-04-18T23:06:40.557Z","updatedAt":"2026-04-23T07:01:19.748Z","lastSeenAt":"2026-04-23T07:01:19.748Z","tsv":"'-10':841 '0':163,338 '1':680 '2':350,690,835 '3':158,170,189,222,340,696,840 '4':706 '5':715 'accept':268,284,497 'achiev':602 'actual':694 'ad':925 'add':446 'add/remove':896 'anti':903 'anti-pattern':902 'api':486,618 'appli':663,716 'assert':195,225,271,623,698,767 'async':160,166,204,329,364,504 'attempt':162,167,169,187,886 'await':179,216,344,507 'backoff':372,888 'bad':198,361 'ban':749 'bash':256,400,517,541 'behavior':23,71,100,150,194,236,239,274,299,449,456,479,577,686,720,727,923 'behavior-chang':70 'beyond':380 'boundari':780 'broke':130 'bug':492,494,808 'bugfix':19 'call':227,777,882 'callback':884 'captur':684 'catch':346 'chang':24,72,97,721 'character':678,722 'claim':599 'class':932 'clean':431 'clear':190,240,476 'clock':763,766 'code':42,74,81,136,243,318,321,420,563,568,593,637,653,666,673,959 'common':918,940 'compile/type':276 'complet':86 'complex':649 'complic':627,630 'condit':786 'confirm':404 'const':164,177,205,505 'continu':429 'core':46 'corner':860 'correct':29,315 'count':228 'coupl':639 'coverag':14 'current':685 'cycl':114,819,833 'data':530 'data.email':533 'debug':806 'delay':756,889 'delet':84,562,660,797,946 'demand':388 'demonstr':484 'depend':641,910 'describ':478 'design':628,651,852 'desir':485 'detail':490,937 'determinist':748 'develop':4,9,33 'didn':50 'differ':591 'dir':789 'dive':830 'doesn':279 'dogmat':952 'domain':473 'driven':3,32 'duplic':437 'e':347,352 'earli':854 'email':471,496,503,509,513,522,537 'empti':495,502 'engin':379 'enough':358 'entir':804 'environ':290 'erod':800 'error':173,210,213,277,287,303,308,355,415,536 'even':687 'everyth':636 'exampl':152,199,327,362,491,861 'exempt':93 'exhaust':881 'exist':281,298,576,621,655,726,960,962 'expect':182,186,219,273,510,521,844 'explain':580 'exponenti':374 'extra':390 'extract':440,546,646 'fail':39,55,77,107,117,129,142,156,174,252,314,418,425,452,520,583,811,965 'failur':127,272,289,872 'fake':758 'featur':17,391 'filesystem':784 'final':956 'first':36,79,624,815,963,966 'fix':132,135,138,300,307,419,426,493,795,822 'flag':553 'flaki':743,793,798 'fn':332,345,367 'follow':817 'formdata':531 'found':809 'fresh':88 'function':278,330,365,528,867 'go':262 'golden':704 'good':151,326,457 'got':524 'green':7,103,112,118,316,395,434,444,526,540 'helper':441,548,647 'huge':645 'immedi':294,574 'implement':18,87,489 'import':288 'improv':393,438 'includ':920 'inherit':672 'inject':642,765,769,774 'integr':807 'intent':483 'interfac':633 'introduc':640 'involv':908 'iron':67,658 'issu':291 'jest.fn':207 'jest.usefaketimers':761 'keep':442,592 'know':59,612 'law':68,659 'learn':900 'legaci':652 'let':161,336 'linear':373 'list':828,837,863,893 'll':943 'local':850 'local-optimum':849 'lock':724 'mandatori':253,399 'master':705 'match':700 'math.random':771 'max':879 'maxretri':370 'mean':561 'mechan':234 'method':929 'methodolog':10 'minim':41,146,317,459 'minut':836 'miss':275 'mock':206,218,220,233,244,635,778,909,922 'mockrejectedvalueonc':208,211 'mockresolvedvalueonc':214 'msw':783 'must':27,261,634,746 'n':869 'name':191,224,241,439,466,477,792 'net':712 'network':776 'never':254 'new':106,172,209,212,354,448,665,719 'next':124,451,455,839 'npm':257,401,518,542 'number':371 'object':941 'observ':196,693 'on-ramp':736 'one':145,149,238,460 'onretri':883 'oper':157,165,181 'optimum':851 'option':369 'order':895 'otherwis':967 'outcom':197,231 'output':412,695 'over-engin':377 'paint':856 'pass':44,120,123,293,323,360,398,406,411,544,573 'path/to/test.test.ts':259,403 'pattern':550,904 'per':462 'perman':741 'philosophi':933 'pitfal':919 'plan':826 'prevent':824,848 'principl':47 'pristin':413 'problem':608 'product':73,931,958 'promis':333,334,368,375 'prove':28,821 'purpos':605 'race':785 'ramp':738 'ration':584,935 're':296,310,734 're-run':309 'real':193,242,754 'realiti':701 'reason':267,306 'rebutt':938 'red':6,111,115,125,140,249,263,269,498,516,552 'red-green-refactor':5,110 'refactor':8,20,45,91,95,113,121,430,545,714,731 'refer':595 'references/tdd-philosophy.md':954,955 'references/testing-anti-patterns.md':915,916 'regress':825 'reject':501 'remov':436 'repeat':450,551 'reproduc':813 'requir':11,109,237,382,514,523,538 'respect':887 'restart':565,662 'result':178,183,506,874 'result.error':511 'retri':134,137,139,155,202,866,868,880 'retryoper':180,217,331,366 'return':175,343,535,873 'right':64,266 'rng':775 'rule':744,957 'run':311,691 'runtim':285 'safe':732 'safeti':711 'see':914,953 'seed':772 'setup':286,644 'show':482 'simplest':320 'simplifi':631,650 'skill' 'skill-test-driven-development' 'skip':255 'sleep':755 'solut':609 'source-codingcossack' 'spend':834 'split':467 'start':556 'state':270,742 'stay':102 'step':96 'still':128,410,648 'stop':554 'structur':98 'stuck':607 'submitform':508,529 'success':176,185,215,876 'suit':805 'tdd':567,717,818,950,969 'temp':788 'test':2,13,26,31,35,54,62,78,83,90,101,108,131,133,143,147,154,192,201,232,258,260,292,297,301,302,325,381,387,402,405,409,417,422,424,443,453,458,463,469,480,500,519,543,571,572,575,582,598,600,615,625,643,656,670,676,679,682,692,723,745,753,794,799,812,820,827,842,855,862,897,901,907,912,921,927,944,961 'test-driven':30 'test-driven-develop':1 'test-on':926 'test1':481 'thing':65,461 'throughout':104,445 'throw':171,351,353,877 'time':159,764,770,870 'timer':759 'tobe':184,188,512 'tohavebeencalledtim':221 'topic-agent-framework' 'topic-agent-skills' 'topic-agent-system' 'topic-agent-workflow' 'topic-agentic-workflow' 'topic-ai-agents' 'topic-anthropic' 'topic-claude' 'topic-claude-code' 'topic-claude-skills' 'topic-claude-skills-hub' 'topic-claude-skills-libary' 'tri':342 'trim':534 'trust':801 'typescript':153,200,328,363,499,527 'unavoid':247 'undefin':525 'uniqu':791 'unit':752 'unreach':356 'updat':697 'usag':487 'use':15,677,757,782,787 'util':913 'vagu':223 'valid':470,547 'verifi':12,116,119,122,230,248,394,515,539 'vi.usefaketimers':760 'wall':762 'warn':416 'wast':949 'watch':37,52,250,396 'whitespac':475 'wish':620 'without':75,229,669 'work':203,890,947 'write':34,40,141,144,319,383,597,616,622,681,810,846,906 'written':569 'wrong':126,305,689 'wrote':80,668 'yagni':376 'yet':282","prices":[{"id":"859f763b-2bd6-454a-ae53-3ede9600a192","listingId":"0aa97903-dbe3-45ce-8756-b3e418bc4ad4","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"CodingCossack","category":"agent-skills-library","install_from":"skills.sh"},"createdAt":"2026-04-18T23:06:40.557Z"}],"sources":[{"listingId":"0aa97903-dbe3-45ce-8756-b3e418bc4ad4","source":"github","sourceId":"CodingCossack/agent-skills-library/test-driven-development","sourceUrl":"https://github.com/CodingCossack/agent-skills-library/tree/main/skills/test-driven-development","isPrimary":false,"firstSeenAt":"2026-04-18T23:06:40.557Z","lastSeenAt":"2026-04-23T07:01:19.748Z"}],"details":{"listingId":"0aa97903-dbe3-45ce-8756-b3e418bc4ad4","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"CodingCossack","slug":"test-driven-development","github":{"repo":"CodingCossack/agent-skills-library","stars":17,"topics":["agent-framework","agent-skills","agent-system","agent-workflow","agentic-workflow","ai-agents","anthropic","claude","claude-code","claude-skills","claude-skills-hub","claude-skills-libary","code-review","codex","context-engineering","debugging","developer-workflow"],"license":null,"html_url":"https://github.com/CodingCossack/agent-skills-library","pushed_at":"2026-01-03T20:02:38Z","description":"Coding agent skills library for programming workflows | Claude Skills, Codex Skills | Forked from obra/superpower","skill_md_sha":"6fb9cc3c193aeafca8214372b97bf8d5062633a8","skill_md_path":"skills/test-driven-development/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/CodingCossack/agent-skills-library/tree/main/skills/test-driven-development"},"layout":"multi","source":"github","category":"agent-skills-library","frontmatter":{"name":"test-driven-development","description":"Red-green-refactor development methodology requiring verified test coverage. Use for feature implementation, bugfixes, refactoring, or any behavior changes where tests must prove correctness."},"skills_sh_url":"https://skills.sh/CodingCossack/agent-skills-library/test-driven-development"},"updatedAt":"2026-04-23T07:01:19.748Z"}}