{"id":"43476191-e1e3-4a08-a788-4c785370aa17","shortId":"BPVJst","kind":"skill","title":"Flaky Test Analyzer","tagline":"Diagnoses why tests pass inconsistently and suggests fixes for timing, ordering, and state isolation issues.","description":"# Flaky Test Analyzer\n\n## What this skill does\n\nThis skill directs the agent to diagnose flaky tests — tests that sometimes pass and sometimes fail without any code changes. It examines the test code, the code under test, and the failure patterns to identify the root cause category (timing, shared state, ordering, network, randomness, etc.) and then suggests targeted fixes that make the test deterministic.\n\nUse this when a test is unreliable in CI, when a test passes locally but fails on the CI server, or when a test fails intermittently with no obvious pattern.\n\n## How to use\n\n### Claude Code / Cline\n\nCopy this file to `.agents/skills/flaky-test-analyzer/SKILL.md` in your project root.\n\nThen ask:\n- *\"Use the Flaky Test Analyzer skill on `tests/checkout.test.ts` — it fails about 1 in 5 runs in CI.\"*\n- *\"This test passes locally but fails in CI. Use the Flaky Test Analyzer skill to diagnose it.\"*\n\nProvide:\n1. The test file (or the specific test that's flaky)\n2. The failure message when it does fail\n3. How often it fails (every time? 1 in 10? only in CI?)\n4. Any observations about when it fails (after a specific other test? at a specific time of day?)\n\n### Cursor\n\nAdd the instructions below to your `.cursorrules` or paste them into the Cursor AI pane. Provide the test code and failure output.\n\n### Codex\n\nPaste the test file, the failure message, and any relevant context. Ask Codex to follow the instructions below.\n\n## The Prompt / Instructions for the Agent\n\nWhen asked to diagnose a flaky test, follow this process:\n\n### Step 1 — Gather information\n\nBefore analyzing, ensure you have:\n- The full test code, including `beforeEach`, `afterEach`, `beforeAll`, and `afterAll` hooks\n- The code under test (the function or module being tested)\n- The error message when the test fails (not just \"test failed\")\n- The failure frequency pattern (always, sometimes, CI only, etc.)\n- The test framework and any relevant configuration (jest.config.js, vitest.config.ts, etc.)\n\nIf any of these are missing, ask for them.\n\n### Step 2 — Identify the flakiness category\n\nCheck for each of these flakiness patterns in order:\n\n**Timing and async issues**\n- `setTimeout` or `setInterval` with hardcoded delays that may not be long enough\n- Missing `await` on async operations\n- Polling for a condition with a timeout that's too short\n- Fake timers (`jest.useFakeTimers`) mixed with real timers\n- Tests that depend on the exact time of day or system clock\n\n**Shared state and isolation**\n- Global variables modified in one test but not reset before the next\n- Database records created in one test that affect another\n- File system changes not cleaned up\n- In-memory caches not cleared between tests\n- Static class properties mutated during tests\n- Singleton services not reset\n\n**Test ordering dependencies**\n- Test A passes only when test B ran first (shared setup)\n- Test A fails when run in isolation but passes in the suite\n- Tests that rely on a specific execution order\n\n**External dependencies**\n- HTTP calls to real external APIs (network flakiness, rate limits)\n- File system reads of files that may not exist in CI\n- Environment variables that differ between local and CI\n- Random number generation with no seed\n\n**Race conditions in the code under test**\n- Concurrent operations with no locking\n- Event listeners that fire at unpredictable times\n- `Promise.all` with side effects that interfere\n\n**Test framework issues**\n- Snapshot files out of date\n- Test timeouts set too low for the environment\n\n### Step 3 — Diagnose the specific test\n\nAfter identifying which category applies, pinpoint the exact line(s) causing the flakiness. Explain:\n- What assumption the test is making\n- Why that assumption is sometimes wrong\n- What condition makes it fail vs pass\n\n### Step 4 — Recommend a fix\n\nProvide a concrete, specific fix. Common fixes include:\n- Replace `setTimeout(() => ..., 100)` with `waitFor(() => ...)` or `vi.runAllTimers()`\n- Add `afterEach(() => { jest.clearAllMocks(); db.cleanup(); })`\n- Mock the external API instead of calling it\n- Use `jest.useFakeTimers()` consistently and advance time explicitly\n- Use a test database that's wiped between runs\n- Set a random seed for deterministic random values\n- Add proper `await` to async operations\n\n### Step 5 — Format the output\n\n```markdown\n## Flaky Test Diagnosis\n\n### Test\n`[test name]` in `[file path]`\n\n### Failure Pattern\n[How often it fails, under what conditions]\n\n### Root Cause Category\n[Timing / Shared state / Test ordering / External dependency / Race condition / Other]\n\n### Root Cause\n[2–3 sentences explaining exactly why the test is flaky — what assumption it makes and why that assumption sometimes fails]\n\n### The Problematic Code\n[Quote the specific lines that cause the flakiness]\n\n### Fix\n[The specific change(s) to make the test deterministic, with code]\n\n### Why This Fix Works\n[1–2 sentences explaining why the fix eliminates the non-determinism]\n\n### Prevention\n[1 sentence on how to avoid this class of flakiness in future tests]\n```\n\n## Example\n\n**Input to Agent:**\n> \"Use the Flaky Test Analyzer skill. This test fails about 1 in 4 runs in CI:\n>\n> ```ts\n> it('sends a welcome email after registration', async () => {\n>   await registerUser({ email: 'test@example.com', password: 'abc123' });\n>   setTimeout(() => {\n>     expect(emailService.send).toHaveBeenCalledWith({\n>       to: 'test@example.com',\n>       subject: 'Welcome!'\n>     });\n>   }, 200);\n> });\n> ```\"\n\n**Output from Agent:**\n\n> ## Flaky Test Diagnosis\n>\n> ### Test\n> `sends a welcome email after registration` — `user.test.ts`\n>\n> ### Failure Pattern\n> Fails ~25% of runs, more often in CI than locally.\n>\n> ### Root Cause Category\n> Timing / async issue\n>\n> ### Root Cause\n> The `expect` inside `setTimeout` is not awaited, so Jest considers the test done as soon as `registerUser` resolves — the assertion runs 200ms later, after Jest has already moved on. When the event loop is busy (common in CI), the 200ms delay may not be enough, and even when it fires, Jest may no longer be tracking it.\n>\n> ### The Problematic Code\n> ```ts\n> setTimeout(() => {\n>   expect(emailService.send).toHaveBeenCalledWith(...);\n> }, 200);\n> ```\n>\n> ### Fix\n> ```ts\n> it('sends a welcome email after registration', async () => {\n>   await registerUser({ email: 'test@example.com', password: 'abc123' });\n>   await waitFor(() => {\n>     expect(emailService.send).toHaveBeenCalledWith({\n>       to: 'test@example.com',\n>       subject: 'Welcome!'\n>     });\n>   });\n> });\n> ```\n> If `waitFor` isn't available (non-React project), use `jest.runAllTimers()` with `jest.useFakeTimers()` or poll manually with a small helper.\n>\n> ### Why This Fix Works\n> `waitFor` repeatedly retries the assertion until it passes or a timeout is reached, making the test resilient to variable execution timing.\n>\n> ### Prevention\n> Never put `expect()` calls inside a raw `setTimeout` — always await the operation that triggers the side effect, or use `waitFor`.\n\n## Notes\n\n- Always provide the failure message, not just \"it fails.\" The exact error often reveals the category immediately.\n- \"Passes locally, fails in CI\" is usually a timing issue (slower machines) or a missing environment variable.\n- \"Fails only when run with other tests\" is almost always a shared state issue — try running the failing test in isolation first to confirm.","tags":["flaky","test","analyzer","openagentskills","notysoty","agent-skills","claude","claude-code","claude-skills","cline","cursor","llm"],"capabilities":["skill","source-notysoty","skill-flaky-test-analyzer","topic-agent-skills","topic-claude","topic-claude-code","topic-claude-skills","topic-cline","topic-cursor","topic-llm","topic-llm-skills","topic-skills"],"categories":["openagentskills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/Notysoty/openagentskills/flaky-test-analyzer","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add Notysoty/openagentskills","source_repo":"https://github.com/Notysoty/openagentskills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (7,366 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:13:21.749Z","embedding":null,"createdAt":"2026-05-18T13:20:42.657Z","updatedAt":"2026-05-18T19:13:21.749Z","lastSeenAt":"2026-05-18T19:13:21.749Z","tsv":"'1':140,164,190,273,755,768,795 '10':192 '100':622 '2':175,342,708,756 '200':824,924 '200ms':880,898 '25':842 '3':183,569,709 '4':196,608,797 '5':142,670 'abc123':815,940 'add':215,627,663 'advanc':643 'affect':430 'afteral':290 'aftereach':287,628 'agent':30,261,784,827 'agents/skills/flaky-test-analyzer/skill.md':122 'ai':228 'almost':1059 'alreadi':885 'alway':317,1004,1017,1060 'analyz':3,21,133,158,277,789 'anoth':431 'api':497,634 'appli':578 'ask':128,249,263,338 'assert':878,978 'assumpt':589,596,719,725 'async':358,375,667,809,855,934 'avail':954 'avoid':773 'await':373,665,810,865,935,941,1005 'b':465 'beforeal':288 'beforeeach':286 'busi':893 'cach':441 'call':493,637,999 'categori':64,346,577,695,853,1032 'caus':63,584,694,707,736,852,858 'chang':45,434,742 'check':347 'ci':90,100,145,153,195,319,512,520,800,848,896,1038 'class':447,775 'claud':115 'clean':436 'clear':443 'cline':117 'clock':406 'code':44,50,52,116,233,284,293,531,730,750,918 'codex':237,250 'common':617,894 'concret':614 'concurr':534 'condit':380,528,601,692,704 'configur':328 'confirm':1074 'consid':868 'consist':641 'context':248 'copi':118 'creat':425 'cursor':214,227 'cursorrul':221 'databas':423,649 'date':559 'day':213,403 'db.cleanup':630 'delay':365,899 'depend':397,458,491,702 'determin':766 'determinist':81,660,748 'diagnos':4,32,161,265,570 'diagnosi':677,830 'differ':516 'direct':28 'done':871 'effect':549,1012 'elimin':762 'email':806,812,835,931,937 'emailservice.send':818,922,944 'enough':371,903 'ensur':278 'environ':513,567,1049 'error':303,1028 'etc':71,321,331 'even':905 'event':539,890 'everi':188 'exact':400,581,712,1027 'examin':47 'exampl':781 'execut':488,993 'exist':510 'expect':817,860,921,943,998 'explain':587,711,758 'explicit':645 'extern':490,496,633,701 'fail':41,97,106,138,151,182,187,202,308,312,472,604,689,727,793,841,1025,1036,1051,1068 'failur':57,177,235,243,314,684,839,1020 'fake':388 'file':120,167,241,432,502,506,556,682 'fire':542,908 'first':467,1072 'fix':11,76,611,616,618,739,753,761,925,972 'flaki':1,19,33,131,156,174,267,345,352,499,586,675,717,738,777,787,828 'follow':252,269 'format':671 'framework':324,553 'frequenc':315 'full':282 'function':297 'futur':779 'gather':274 'generat':523 'global':411 'hardcod':364 'helper':969 'hook':291 'http':492 'identifi':60,343,575 'immedi':1033 'in-memori':438 'includ':285,619 'inconsist':8 'inform':275 'input':782 'insid':861,1000 'instead':635 'instruct':217,254,258 'interfer':551 'intermitt':107 'isn':952 'isol':17,410,476,1071 'issu':18,359,554,856,1043,1064 'jest':867,883,909 'jest.clearallmocks':629 'jest.config.js':329 'jest.runalltimers':960 'jest.usefaketimers':390,640,962 'later':881 'limit':501 'line':582,734 'listen':540 'local':95,149,518,850,1035 'lock':538 'long':370 'longer':912 'loop':891 'low':564 'machin':1045 'make':78,593,602,721,745,987 'manual':965 'markdown':674 'may':367,508,900,910 'memori':440 'messag':178,244,304,1021 'miss':337,372,1048 'mix':391 'mock':631 'modifi':413 'modul':299 'move':886 'mutat':449 'name':680 'network':69,498 'never':996 'next':422 'non':765,956 'non-determin':764 'non-react':955 'note':1016 'number':522 'observ':198 'obvious':110 'often':185,687,846,1029 'one':415,427 'oper':376,535,668,1007 'order':14,68,355,457,489,700 'output':236,673,825 'pane':229 'pass':7,38,94,148,461,478,606,981,1034 'password':814,939 'past':223,238 'path':683 'pattern':58,111,316,353,685,840 'pinpoint':579 'poll':377,964 'prevent':767,995 'problemat':729,917 'process':271 'project':125,958 'promise.all':546 'prompt':257 'proper':664 'properti':448 'provid':163,230,612,1018 'put':997 'quot':731 'race':527,703 'ran':466 'random':70,521,657,661 'rate':500 'raw':1002 'reach':986 'react':957 'read':504 'real':393,495 'recommend':609 'record':424 'registerus':811,875,936 'registr':808,837,933 'relev':247,327 'reli':484 'repeat':975 'replac':620 'reset':419,455 'resili':990 'resolv':876 'retri':976 'reveal':1030 'root':62,126,693,706,851,857 'run':143,474,654,798,844,879,1054,1066 'seed':526,658 'send':803,832,928 'sentenc':710,757,769 'server':101 'servic':453 'set':562,655 'setinterv':362 'settimeout':360,621,816,862,920,1003 'setup':469 'share':66,407,468,697,1062 'short':387 'side':548,1011 'singleton':452 'skill':24,27,134,159,790 'skill-flaky-test-analyzer' 'slower':1044 'small':968 'snapshot':555 'sometim':37,40,318,598,726 'soon':873 'source-notysoty' 'specif':170,205,210,487,572,615,733,741 'state':16,67,408,698,1063 'static':446 'step':272,341,568,607,669 'subject':822,948 'suggest':10,74 'suit':481 'system':405,433,503 'target':75 'test':2,6,20,34,35,49,54,80,86,93,105,132,147,157,166,171,207,232,240,268,283,295,301,307,311,323,395,416,428,445,451,456,459,464,470,482,533,552,560,573,591,648,676,678,679,699,715,747,780,788,792,829,831,870,989,1057,1069 'test@example.com':813,821,938,947 'tests/checkout.test.ts':136 'time':13,65,189,211,356,401,545,644,696,854,994,1042 'timeout':383,561,984 'timer':389,394 'tohavebeencalledwith':819,923,945 'topic-agent-skills' 'topic-claude' 'topic-claude-code' 'topic-claude-skills' 'topic-cline' 'topic-cursor' 'topic-llm' 'topic-llm-skills' 'topic-skills' 'track':914 'tri':1065 'trigger':1009 'ts':801,919,926 'unpredict':544 'unreli':88 'use':82,114,129,154,639,646,785,959,1014 'user.test.ts':838 'usual':1040 'valu':662 'variabl':412,514,992,1050 'vi.runalltimers':626 'vitest.config.ts':330 'vs':605 'waitfor':624,942,951,974,1015 'welcom':805,823,834,930,949 'wipe':652 'without':42 'work':754,973 'wrong':599","prices":[{"id":"6c5ae86c-f671-40cd-96a2-5367b8a4efc7","listingId":"43476191-e1e3-4a08-a788-4c785370aa17","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"Notysoty","category":"openagentskills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:20:42.657Z"}],"sources":[{"listingId":"43476191-e1e3-4a08-a788-4c785370aa17","source":"github","sourceId":"Notysoty/openagentskills/flaky-test-analyzer","sourceUrl":"https://github.com/Notysoty/openagentskills/tree/main/skills/flaky-test-analyzer","isPrimary":false,"firstSeenAt":"2026-05-18T13:20:42.657Z","lastSeenAt":"2026-05-18T19:13:21.749Z"}],"details":{"listingId":"43476191-e1e3-4a08-a788-4c785370aa17","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"Notysoty","slug":"flaky-test-analyzer","github":{"repo":"Notysoty/openagentskills","stars":8,"topics":["agent-skills","claude","claude-code","claude-skills","cline","cursor","llm","llm-skills","skills"],"license":"mit","html_url":"https://github.com/Notysoty/openagentskills","pushed_at":"2026-03-28T06:50:19Z","description":"A  community-driven library of reusable AI agent skills for Claude Code, Cursor, Codex, Cline, and more.","skill_md_sha":"48251e53274d0ca5124a2f9a16b503bd7cfa4ef3","skill_md_path":"skills/flaky-test-analyzer/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/Notysoty/openagentskills/tree/main/skills/flaky-test-analyzer"},"layout":"multi","source":"github","category":"openagentskills","frontmatter":{"name":"Flaky Test Analyzer","description":"Diagnoses why tests pass inconsistently and suggests fixes for timing, ordering, and state isolation issues."},"skills_sh_url":"https://skills.sh/Notysoty/openagentskills/flaky-test-analyzer"},"updatedAt":"2026-05-18T19:13:21.749Z"}}