{"id":"a73bc8b4-540d-43e1-9d62-345f36be24f1","shortId":"n9Y3pK","kind":"skill","title":"causal-inference-mixtape","tagline":"This skill should be used when the user asks to \"implement a DiD regression\", \"write a causal inference pipeline\", \"set up an event study\", \"implement instrumental variables\", \"run a regression discontinuity design\", \"build a synthetic control model\", \"implement propensity score ","description":"# Causal Inference: The Mixtape — Code Skill\n\nPractitioner-oriented causal inference skill built from Scott Cunningham's *Causal Inference: The Mixtape* repository. Covers 10 identification strategies with ready-to-run code templates in Python, R, and Stata.\n\n---\n\n## Methods Covered\n\n| Method | Python | R | Stata | Reference |\n|--------|--------|---|-------|-----------|\n| OLS / Regression | statsmodels | estimatr | reg/reghdfe | `references/method-patterns.md` §1 |\n| Difference-in-Differences | statsmodels + C() | lfe/fixest | xtreg/reghdfe | `references/method-patterns.md` §2 |\n| Event Study (Dynamic DiD) | manual lead/lag | estimatr | reghdfe | `references/method-patterns.md` §3 |\n| Staggered DiD / TWFE | statsmodels | bacondecomp | bacondecomp | `references/method-patterns.md` §4 |\n| Regression Discontinuity | statsmodels polynomial | rdrobust | rdplot/rdrobust | `references/method-patterns.md` §5 |\n| Instrumental Variables | linearmodels IV2SLS | AER/ivreg | ivregress 2sls | `references/method-patterns.md` §6 |\n| Synthetic Control | rpy2 → R Synth | Synth + SCtools | synth | `references/method-patterns.md` §7 |\n| Matching / PSM / IPW | manual logit + weights | MatchIt + Zelig | teffects/cem | `references/method-patterns.md` §8 |\n| DAGs / Collider Bias | dagitty (conceptual) | dagitty/ggdag | — | `references/method-patterns.md` §9 |\n| Randomization Inference | permutation loop | ri2 | ritest | `references/method-patterns.md` §10 |\n\n---\n\n## Core Workflow\n\n### Implement a Causal Method\n\n1. 
Identify the method from the table above\n2. Load the appropriate template from `references/method-patterns.md`\n3. Adapt variable names, fixed effects, and clustering to the user's data\n4. Add robustness checks (parallel trends for DiD, McCrary for RDD, first-stage F for IV)\n\n### Choose the Right Language\n\n| Scenario | Recommendation |\n|----------|---------------|\n| ML pipeline integration | Python (statsmodels + linearmodels) |\n| Synthetic Control | R (Synth package) or Stata (synth) — Python lacks mature implementation |\n| Bacon decomposition | R (bacondecomp) or Stata — no Python equivalent |\n| Publication-ready tables | Stata (outreg2/esttab) or R (stargazer/modelsummary) |\n| Coarsened Exact Matching | Stata (cem) or R (MatchIt) — no Python equivalent |\n| Quick prototyping | Python with statsmodels |\n\n### Cross-Language Equivalents\n\n| Task | Python | R | Stata |\n|------|--------|---|-------|\n| OLS with robust SE | `smf.ols().fit(cov_type='HC1')` | `lm_robust()` | `reg y x, robust` |\n| Cluster SE | `fit(cov_type='cluster', cov_kwds={'groups': g})` | `felm(y ~ x | 0 | 0 | cluster)` | `reg y x, cluster(id)` |\n| Two-way FE | `C(id) + C(time)` in formula | `felm(y ~ x | id + time)` | `reghdfe y x, absorb(id time)` |\n| IV / 2SLS | `IV2SLS.from_formula('y ~ 1 + exog + [endog ~ inst]')` | `ivreg(y ~ exog | inst)` | `ivregress 2sls y exog (endog = inst)` |\n| DiD | `C(treat)*C(post)` | `treat:post` in formula | `did_multiplegt` or interaction |\n\n---\n\n## Key Python Patterns\n\n### DiD with Cluster-Robust SE\n\n```python\nimport statsmodels.formula.api as smf\n\nmodel = smf.ols('y ~ C(treated)*C(post) + controls', data=df)\nresults = model.fit(cov_type='cluster', cov_kwds={'groups': df['firm_id']})\n```\n\n### Event Study (Lead/Lag)\n\n```python\n# Create relative time dummies and record their column names\nrel_cols = []\nfor k in range(-4, 5):\n    col = f'rel_{k}' if k >= 0 else f'rel_m{abs(k)}'\n    df[col] = (df['relative_time'] == k).astype(int)\n    rel_cols.append(col)\n\n# Drop t=-1 as reference\nformula = 
'y ~ ' + ' + '.join([c for c in rel_cols if c != 'rel_m1']) + ' + C(id) + C(year)'\n```\n\n### IV / 2SLS\n\n```python\nfrom linearmodels.iv import IV2SLS\n\nmodel = IV2SLS.from_formula('y ~ 1 + exog + [endog ~ instrument]', data=df)\nresults = model.fit(cov_type='clustered', clusters=df['cluster_var'])\n```\n\n---\n\n## Robustness Check Patterns\n\n| Method | Required Checks |\n|--------|----------------|\n| DiD | Parallel trends (event study plot), placebo treatment dates |\n| RDD | McCrary density test, bandwidth robustness (half/double IK optimal), polynomial robustness |\n| IV | First-stage F > 10, exclusion restriction argument, over-identification test |\n| Synthetic Control | Pre-treatment RMSPE, placebo distribution, leave-one-out |\n| Matching | Covariate balance table, caliper sensitivity |\n\n---\n\n## Common Pitfalls\n\n1. **TWFE with staggered treatment** — standard two-way FE can be biased when treatment timing varies and effects differ across cohorts or over time. Use the Bacon decomposition to diagnose the problem, and the Sun & Abraham or Callaway & Sant'Anna estimators to address it.\n2. **Synthetic Control with many treated units** — the Synth package handles one treated unit. For multiple treated units, use augmented synthetic control or a stacked approach.\n3. **RDD without McCrary test** — always test for manipulation of the running variable at the cutoff before estimating.\n4. **IV weak instruments** — report the first-stage F-statistic. A value below 10 suggests weak-instrument bias.\n5. **Python Synth gap** — no mature Python Synth package exists. 
Use `rpy2` to call R's `Synth` from Python.\n\n---\n\n## Additional Resources\n\n### Reference Files\n\n- **`references/method-patterns.md`** — Detailed code templates for all 10 methods with full examples\n- **`references/r-stata-comparison.md`** — Cross-language package comparison and method coverage gaps\n\n### Prompt Files\n\n- **`prompts/01-implement-method.md`** — Copy-paste prompt for implementing any causal method\n- **`prompts/02-robustness-checks.md`** — Copy-paste prompt for generating robustness check code","tags":["jill0099","causal","inference","mixtape","awesome","agent","skills","for","empirical","research","brycewang-stanford","academic-research"],"capabilities":["skill","source-brycewang-stanford","skill-10-jill0099-causal-inference-mixtape","topic-academic-research","topic-agent-skills","topic-ai-agent","topic-awesome-list","topic-communication","topic-copaper","topic-economics","topic-education","topic-empirical-research","topic-international-relations","topic-political-science","topic-psychology"],"categories":["Awesome-Agent-Skills-for-Empirical-Research"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research/10-Jill0099-causal-inference-mixtape","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research","source_repo":"https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research","install_from":"skills.sh"}},"qualityScore":"0.700","qualityRationale":"deterministic score 0.70 from registry signals: · indexed on github topic:agent-skills · 598 github stars · SKILL.md body (5,391 
chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-02T12:52:55.996Z","embedding":null,"createdAt":"2026-04-18T22:12:42.277Z","updatedAt":"2026-05-02T12:52:55.996Z","lastSeenAt":"2026-05-02T12:52:55.996Z","tsv":"'-1':457 '-4':432 '0':324,325,440 '1':96,185,358,488,562 '10':68,178,534,637,671 '2':106,193,588 '2sls':139,354,367,478 '3':116,200,611 '4':124,213,625 '5':132,433,642 '6':141 '7':151 '8':162 '9':170 'ab':445 'abraham':583 'absorb':350 'adapt':201 'add':214 'addit':661 'aer/ivreg':137 'alway':616 'anna':586 'approach':610 'appropri':196 'argument':537 'ask':13 'astyp':453 'augment':605 'bacon':254,579 'bacondecomp':121,122,257 'balanc':556 'bandwidth':522 'bias':165,573,641 'build':37 'built':57 'c':102,336,338,373,375,402,404,463,465,470,473,475 'calip':558 'call':655 'callaway':584 'causal':2,21,45,54,62,183,696 'causal-inference-mixtap':1 'cem':276 'check':216,504,508,706 'choos':230 'cluster':207,311,316,326,330,391,413,498,499,501 'cluster-robust':390 'coarsen':272 'code':49,76,667,707 'col':434,448,468 'collid':164 'common':560 'comparison':681 'conceptu':167 'control':40,143,243,406,543,590,607 'copi':690,700 'copy-past':689,699 'core':179 'cov':302,314,317,411,414,496 'covari':555 'cover':67,84 'coverag':684 'creat':424 'cross':289,678 'cross-languag':288,677 'cunningham':60 'cutoff':622 'dag':163 'dagitti':166 'dagitty/ggdag':168 'data':212,407,492 'date':517 'decomposit':255,580 'densiti':520 'design':36 'detail':666 'df':408,417,447,449,493,500 'differ':98,100 'difference-in-differ':97 'discontinu':35,126 'distribut':549 'drop':455 'dummi':427 'dynam':109 'effect':205 'els':441 'endog':360,370,490 'equival':262,282,291 'estim':587,624 'estimatr':93,113 'event':27,107,420,512 
'exact':273 'exampl':675 'exclus':535 'exist':651 'exog':359,364,369,489 'f':227,435,442,533,634 'f-statist':633 'fe':335,571 'felm':321,342 'file':664,687 'firm':418 'first':225,531,631 'first-stag':224,530,630 'fit':301,313 'fix':204 'formula':341,356,380,460,486 'full':674 'g':320 'gap':645,685 'generat':704 'group':319,416 'half/double':524 'handl':598 'hc1':304 'id':331,337,345,351,419,474 'identif':69,540 'identifi':186 'ik':525 'implement':15,29,42,181,253,694 'import':395,482 'indic':638 'infer':3,22,46,55,63,172 'inst':361,365,371 'instrument':30,133,491,628,640 'int':454 'integr':238 'interact':384 'ipw':154 'iv':229,353,477,529,626 'iv2sls':136,483 'iv2sls.from':355,485 'ivreg':362 'ivregress':138,366 'join':462 'k':429,437,439,446,452 'key':385 'kwds':318,415 'lack':251 'languag':233,290,679 'lead/lag':112,422 'leav':551 'leave-one-out':550 'lfe/fixest':103 'linearmodel':135,241 'linearmodels.iv':481 'lm':305 'load':194 'logit':156 'loop':174 'm':444 'm1':472 'mani':592 'manipul':619 'manual':111,155 'match':152,274,554 'matchit':158,279 'matur':252,647 'mccrari':221,519,614 'method':83,85,184,188,506,672,683,697 'mixtap':4,48,65 'ml':236 'model':41,399,484 'model.fit':410,495 'multipl':603 'multiplegt':382 'name':203 'ol':90,296 'one':552,599 'optim':526 'orient':53 'outreg2/esttab':268 'over-identif':538 'packag':246,597,650,680 'parallel':217,510 'past':691,701 'pattern':387,505 'permut':173 'pipelin':23,237 'pitfal':561 'placebo':515,548 'plot':514 'polynomi':128,527 'post':376,378,405 'practition':52 'practitioner-ori':51 'pre':545 'pre-treat':544 'prompt':686,692,702 'prompts/01-implement-method.md':688 'prompts/02-robustness-checks.md':698 'propens':43 'prototyp':284 'psm':153 'public':264 'publication-readi':263 'python':79,86,239,250,261,281,285,293,386,394,423,479,643,648,660 'quick':283 'r':80,87,145,244,256,270,278,294,656 'random':171 'rang':431 'rdd':223,518,612 'rdplot/rdrobust':130 'rdrobust':129 'readi':73,265 'ready-to-run':72 
'recommend':235 'refer':89,459,663 'references/method-patterns.md':95,105,115,123,131,140,150,161,169,177,199,665 'references/r-stata-comparison.md':676 'reg':307,327 'reg/reghdfe':94 'reghdf':114,347 'regress':18,34,91,125 'rel':436,443,467,471 'relat':425,450 'report':629 'repositori':66 'requir':507 'resourc':662 'restrict':536 'result':409,494 'ri2':175 'right':232 'ritest':176 'rmspe':547 'robust':215,298,306,310,392,503,523,528,705 'rpy2':144,653 'run':32,75 'sant':585 'scenario':234 'score':44 'scott':59 'sctool':148 'se':299,312,393 'sensit':559 'set':24 'skill':6,50,56 'skill-10-jill0099-causal-inference-mixtape' 'smf':398 'smf.ols':300,400 'source-brycewang-stanford' 'stack':609 'stage':226,532,632 'stagger':117,565 'standard':567 'stargazer/modelsummary':271 'stata':82,88,248,259,267,275,295 'statist':635 'statsmodel':92,101,120,127,240,287 'statsmodels.formula.api':396 'strategi':70 'studi':28,108,421,513 'sun':582 'synth':146,147,149,245,249,596,644,649,658 'synthet':39,142,242,542,589,606 'tabl':191,266,557 'task':292 'teffects/cem':160 'templat':77,197,668 'test':521,541,615,617 'time':339,346,352,426,451,576 'topic-academic-research' 'topic-agent-skills' 'topic-ai-agent' 'topic-awesome-list' 'topic-communication' 'topic-copaper' 'topic-economics' 'topic-education' 'topic-empirical-research' 'topic-international-relations' 'topic-political-science' 'topic-psychology' 'treat':374,377,403,593,600 'treatment':516,546,566,575 'trend':218,511 'twfe':119,563 'two':333,569 'two-way':332,568 'type':303,315,412,497 'unit':594,601 'use':9,578,604,652 'user':12,210 'var':502 'vari':577 'variabl':31,134,202 'way':334,570 'weak':627,639 'weight':157 'without':613 'workflow':180 'write':19 'x':309,323,329,344,349 'xtreg/reghdfe':104 'y':308,322,328,343,348,357,363,368,401,461,487 'year':476 
'zelig':159","prices":[{"id":"922e8a77-6383-4660-b2de-829658d8c216","listingId":"a73bc8b4-540d-43e1-9d62-345f36be24f1","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"brycewang-stanford","category":"Awesome-Agent-Skills-for-Empirical-Research","install_from":"skills.sh"},"createdAt":"2026-04-18T22:12:42.277Z"}],"sources":[{"listingId":"a73bc8b4-540d-43e1-9d62-345f36be24f1","source":"github","sourceId":"brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research/10-Jill0099-causal-inference-mixtape","sourceUrl":"https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research/tree/main/skills/10-Jill0099-causal-inference-mixtape","isPrimary":false,"firstSeenAt":"2026-04-18T22:12:42.277Z","lastSeenAt":"2026-05-02T12:52:55.996Z"}],"details":{"listingId":"a73bc8b4-540d-43e1-9d62-345f36be24f1","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"brycewang-stanford","slug":"10-Jill0099-causal-inference-mixtape","github":{"repo":"brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research","stars":598,"topics":["academic-research","agent-skills","ai-agent","awesome-list","communication","copaper","economics","education","empirical-research","international-relations","political-science","psychology","public-administration","reproducible-research","skills-library","social-science","sociology"],"license":"other","html_url":"https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research","pushed_at":"2026-04-30T20:01:12Z","description":"🔬 A curated collection of 23,000+ agent skills for empirical research across 8 social science disciplines. 
| A curated library of 23,000+ AI agent skills covering empirical research across 8 social science disciplines. CoPaper.AI produces a reproducible, standards-compliant empirical paper in 20 minutes and supports user-uploaded Skills. -- Maintained by CoPaper.AI from Stanford REAP.","skill_md_sha":"a8a4ec7892249432ee52a82846c81c87656ab6c6","skill_md_path":"skills/10-Jill0099-causal-inference-mixtape/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research/tree/main/skills/10-Jill0099-causal-inference-mixtape"},"layout":"multi","source":"github","category":"Awesome-Agent-Skills-for-Empirical-Research","frontmatter":{"name":"causal-inference-mixtape","description":"This skill should be used when the user asks to \"implement a DiD regression\", \"write a causal inference pipeline\", \"set up an event study\", \"implement instrumental variables\", \"run a regression discontinuity design\", \"build a synthetic control model\", \"implement propensity score matching\", \"write parallel trends test\", \"implement Bacon decomposition\", or needs code templates for causal inference methods in Python, R, or Stata. Based on Scott Cunningham's Causal Inference: The Mixtape."},"skills_sh_url":"https://skills.sh/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research/10-Jill0099-causal-inference-mixtape"},"updatedAt":"2026-05-02T12:52:55.996Z"}}