{"id":"ea43b68d-9459-4b15-bb3b-7f85c78e9943","shortId":"a65EZt","kind":"skill","title":"evaluating-candidates","tagline":"Make evidence-based hiring decisions: scorecards, work samples, reference checks. See also: conducting-interviews (run interviews).","description":"# Evaluating Candidates\n\n## Scope\n\n**Covers**\n- Defining an explicit **hiring bar** (what “great” means for this role at this company, right now)\n- Turning interviews, work samples/trials, and references into **evidence**, not vibes\n- Designing **job-relevant** work samples (and **paid trials** when appropriate)\n- Running high-signal **reference checks** and integrating them into the decision\n- Producing a decision-ready recommendation with clear risks and mitigations\n\n**When to use**\n- “Help me decide whether to hire this candidate.”\n- “Create a scorecard and decision memo based on interview notes + references.”\n- “Design a work sample / take-home (or paid trial) and a scoring rubric.”\n- “Plan and run reference checks; give me a summary and recommendation.”\n- “Calibrate our hiring bar for a <role> and compare candidates fairly.”\n\n**When NOT to use**\n- You need to define the role outcomes or write the job description (use `writing-job-descriptions`)\n- You need to design/run structured interviews and question maps (use `conducting-interviews`)\n- You need to negotiate an offer or close a candidate (use `negotiating-offers`)\n- You need to build a sales team hiring pipeline or GTM hiring strategy (use `building-sales-team`)\n- You need legal/HR compliance guidance or to adjudicate high-risk employment issues (this skill is not legal advice)\n- You need compensation/offer negotiation strategy (use `negotiating-offers`)\n\n## Inputs\n\n**Minimum required**\n- Role + level + function (e.g., “Senior PM”, “Founding AE”, “Staff ML Engineer”)\n- Company/team context and “what’s hard” (stage, constraints, velocity expectations)\n- Evaluation criteria (4–8 competencies) and any 
non-negotiables / red flags\n- Candidate materials available (resume/portfolio + interview notes, if already interviewed)\n- Which signals you want to include: interviews, work sample/take-home, paid trial, references\n- Constraints: timeline, confidentiality/PII rules, internal-only vs shareable output\n\n**Missing-info strategy**\n- Ask up to 5 questions from [references/INTAKE.md](references/INTAKE.md) (3–5 at a time).\n- If criteria or notes are missing, propose a **default criteria set** and clearly label assumptions.\n- Do not request secrets. If notes contain sensitive info, ask for **redacted excerpts** or summaries.\n\n## Outputs (deliverables)\n\nProduce a **Candidate Evaluation Decision Pack** in Markdown (in-chat; or as files if requested):\n\n1) **Evaluation brief** (role success definition, criteria, weights, red flags)\n2) **Scorecard** (rating anchors + evidence capture)\n3) **Signal log** (all signals normalized into one table with evidence)\n4) **Work sample / take-home / paid trial plan + rubric** (if used)\n5) **Reference check kit** (outreach, script, note form, summary)\n6) **Candidate comparison** (if multiple candidates)\n7) **Hiring decision memo** (recommendation + risks + mitigations)\n8) **Risks / Open questions / Next steps** (always included)\n\nTemplates: [references/TEMPLATES.md](references/TEMPLATES.md)  \nExpanded guidance: [references/WORKFLOW.md](references/WORKFLOW.md)\n\n## Workflow (7 steps)\n\n### 1) Intake + decision framing\n- **Inputs:** user context; [references/INTAKE.md](references/INTAKE.md).\n- **Actions:** Confirm role, level, must-haves, and the decision timeline. Identify which signals exist vs need to be created (work sample, trial, references). 
Record constraints (PII, internal-only, fairness).\n- **Outputs:** Context snapshot + assumptions/unknowns list.\n- **Checks:** The decision and decision date are explicit (who decides, by when, using which signals).\n\n### 2) Define the bar + criteria (don’t improvise later)\n- **Inputs:** role context; existing rubric/values (if any).\n- **Actions:** Choose 4–8 criteria; define what “strong / acceptable / weak” looks like with observable anchors. Add explicit red flags. Decide whether to prioritize **raw ability + drive** vs “years of experience” for this role.\n- **Outputs:** Evaluation brief + draft scorecard.\n- **Checks:** Every criterion is measurable via evidence; no criterion is “vibe” or “culture fit” without definition.\n\n### 3) Build the signal plan + evidence log\n- **Inputs:** existing notes; planned stages.\n- **Actions:** Decide what each signal is responsible for (interviews = behavioral evidence; work sample = in-context execution; references = longitudinal performance). Create a single signal log so you can compare apples-to-apples.\n- **Outputs:** Signal plan + signal log table (empty or partially filled).\n- **Checks:** No single signal dominates by default; reference checks and work samples have defined weight when used.\n\n### 4) Design (or evaluate) the work sample / take-home / paid trial\n- **Inputs:** role outputs; constraints; candidate seniority.\n- **Actions:** Create a job-relevant task with clear deliverables and scoring rubric. 
If the task is >2–3 hours or resembles real work, prefer a **paid** trial and clarify IP/confidentiality boundaries.\n- **Outputs:** Work sample/trial brief + scoring rubric.\n- **Checks:** Task predicts real performance, is fair across backgrounds, and has objective scoring anchors.\n\n### 5) Run reference checks (highest-signal when done well)\n- **Inputs:** reference targets; outreach constraints; question bank.\n- **Actions:** Prioritize references who worked with the candidate for extended periods and in similar contexts. Ask for specific examples, deltas over time, strengths/limits, and “how would you staff them?” Capture verbatim evidence and calibrate for bias.\n- **Outputs:** Reference notes + reference summary.\n- **Checks:** Summary contains concrete examples and clear hire/no-hire signal, not generic praise.\n\n### 6) Synthesize signals → recommendation + risk mitigation\n- **Inputs:** scorecard, signal log, work sample results, reference summary.\n- **Actions:** Write a decision memo that cites evidence, calls out disagreements/uncertainty, and proposes mitigations (onboarding plan, coaching, 30/60/90 checkpoints) if hiring.\n- **Outputs:** Hiring decision memo + candidate comparison (if applicable).\n- **Checks:** Recommendation matches the weighted evidence; red flags are explicitly addressed.\n\n### 7) Quality gate + calibration + finalize pack\n- **Inputs:** full draft pack.\n- **Actions:** Run [references/CHECKLISTS.md](references/CHECKLISTS.md) and score with [references/RUBRIC.md](references/RUBRIC.md). Add **Risks / Open questions / Next steps**. 
If uncertain, propose the smallest additional signal to resolve (targeted reference, scoped trial, specific follow-up interview).\n- **Outputs:** Final Candidate Evaluation Decision Pack.\n- **Checks:** Evidence is sufficient for the decision; limitations and fairness risks are explicit.\n\n## Quality gate (required)\n- Use [references/CHECKLISTS.md](references/CHECKLISTS.md) and [references/RUBRIC.md](references/RUBRIC.md).\n- Always include: **Risks**, **Open questions**, **Next steps**.\n\n## Examples\n\n**Example 1 (final decision):** “Here are interview notes for a Senior PM candidate. Create a scorecard, summarize signals, and write a hiring decision memo. Include risks and suggested mitigations.”  \nExpected: scorecard with anchors + evidence, signal log, decision memo with explicit risks.\n\n**Example 2 (work sample + references):** “We’re hiring a Founding Engineer. Design a 2-day paid trial task and rubric, plus a reference check script. Then show how we should combine those signals into a hire/no-hire decision.”  \nExpected: trial brief + rubric, reference kit, and a synthesis framework.\n\n**Boundary example (insufficient signal):** “Tell me if this person is good. I only have their resume.”\nResponse: require criteria + at least one high-signal input (structured interview notes, work sample plan/results, or references); propose a minimal evaluation plan and list assumptions/unknowns.\n\n**Boundary example (redirect to interviews):** “Design a structured interview loop with behavioral questions for this PM role.”\nResponse: redirect to `conducting-interviews` — this skill evaluates candidates after signals are collected, it does not design interview questions or scripts.\n\n**Boundary example (redirect to offers):** “We've decided to hire this candidate. 
Help me structure the offer and negotiate.”
Response: redirect to `negotiating-offers` — this skill produces the hire/no-hire recommendation, not the offer strategy.

## Anti-patterns (common failure modes)

1. **“Gut feel” disguised as process** — Having a scorecard but filling it in retrospectively to justify a decision already made. Evidence must be captured before the overall recommendation is written.
2. **Recency bias in signal weighting** — Over-weighting the most recent signal (e.g., a strong reference) while discounting earlier mixed interview signals. Use explicit weights defined before evaluation.
3. **Work sample as unpaid labor** — Designing a take-home that takes 8+ hours, uses real company data without compensation, or has unclear IP ownership. Keep tasks under 3 hours or pay for longer trials.
4. **Reference theater** — Accepting only candidate-provided references and treating generic praise (“great to work with”) as signal. Prioritize back-channel references and probe for specific examples + growth areas.
5. **Consensus-seeking over evidence** — Running debriefs where the loudest voice wins or where the group converges on a comfortable middle. 
Require independent scoring before group discussion.","tags":["evaluating","candidates","lenny","skills","plus","liqiongyu","agent-skills","ai-agents","automation","claude","codex","prompt-engineering"],"capabilities":["skill","source-liqiongyu","skill-evaluating-candidates","topic-agent-skills","topic-ai-agents","topic-automation","topic-claude","topic-codex","topic-prompt-engineering","topic-refoundai","topic-skillpack"],"categories":["lenny_skills_plus"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/liqiongyu/lenny_skills_plus/evaluating-candidates","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add liqiongyu/lenny_skills_plus","source_repo":"https://github.com/liqiongyu/lenny_skills_plus","install_from":"skills.sh"}},"qualityScore":"0.474","qualityRationale":"deterministic score 0.47 from registry signals: · indexed on github topic:agent-skills · 49 github stars · SKILL.md body (9,431 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-04-22T00:56:22.212Z","embedding":null,"createdAt":"2026-04-18T22:16:32.900Z","updatedAt":"2026-04-22T00:56:22.212Z","lastSeenAt":"2026-04-22T00:56:22.212Z","tsv":"'1':369,448,925,1134 '2':379,508,685,966,978,1164 '3':316,385,578,686,1193,1222 '30/60/90':822 '4':263,396,526,650,1229 '5':311,317,408,720,1260 '6':417,790 '7':423,446,845 '8':264,430,527,1206 'abil':548 'accept':532,1232 'across':713 'action':457,524,590,668,737,805,855 'add':539,864 'addit':875 'address':844 'adjud':216 'advic':227 'ae':247 'alreadi':280,1152 'also':16 'alway':436,916 'anchor':382,538,719,956 'anti':1129 'anti-pattern':1128 'appl':620,622 'apples-to-appl':619 'applic':833 'appropri':62 'area':1259 'ask':308,345,752 
'assumpt':335 'assumptions/unknowns':491,1053 'avail':275 'back':1250 'back-channel':1249 'background':714 'bank':736 'bar':30,136,511 'base':7,103 'behavior':599,1065 'bias':772,1166 'boundari':699,1012,1054,1093 'brief':371,559,703,1004 'build':194,206,579 'building-sales-team':205 'calibr':133,770,848 'call':813 'candid':3,23,96,141,186,273,355,418,422,666,744,830,890,936,1080,1104,1235 'candidate-provid':1234 'captur':384,766,1157 'channel':1251 'chat':363 'check':14,68,126,410,493,562,633,641,706,723,778,834,894,988 'checkpoint':823 'choos':525 'cite':811 'clarifi':697 'clear':82,333,676,784 'close':184 'coach':821 'collect':1084 'combin':995 'comfort':1280 'common':1131 'compani':39,1210 'company/team':251 'compar':140,618 'comparison':419,831 'compens':1213 'compensation/offer':230 'compet':265 'complianc':212 'concret':781 'conduct':18,175,1075 'conducting-interview':17,174,1074 'confidentiality/pii':296 'confirm':458 'consensus':1262 'consensus-seek':1261 'constraint':258,294,482,665,734 'contain':342,780 'context':252,454,489,519,605,751 'converg':1277 'cover':25 'creat':97,476,610,669,937 'criteria':262,322,330,375,512,528,1030 'criterion':564,570 'cultur':574 'data':1211 'date':498 'day':979 'debrief':1267 'decid':91,502,543,591,1100 'decis':9,74,78,101,357,425,450,466,495,497,808,828,892,900,927,946,960,1001,1151 'decision-readi':77 'default':329,639 'defin':26,150,509,529,646,1190 'definit':374,577 'deliver':352,677 'delta':756 'descript':158,163 'design':52,108,651,976,1059,1088,1199 'design/run':167 'disagreements/uncertainty':815 'discount':1182 'discuss':1287 'disguis':1137 'domin':637 'done':728 'draft':560,853 'drive':549 'e.g':243,1177 'earlier':1183 'employ':220 'empti':629 'engin':250,975 'evalu':2,22,261,356,370,558,653,891,1049,1079,1192 'evaluating-candid':1 'everi':563 'evid':6,49,383,395,568,583,600,768,812,839,895,957,1154,1265 'evidence-bas':5 'exampl':755,782,923,924,965,1013,1055,1094,1257 'excerpt':348 'execut':606 
'exist':471,520,586 'expand':441 'expect':260,953,1002 'experi':553 'explicit':28,500,540,843,906,963,1188 'extend':746 'failur':1132 'fair':142,487,712,903 'feel':1136 'file':366 'fill':632,1144 'final':849,889,926 'fit':575 'flag':272,378,542,841 'follow':885 'follow-up':884 'form':415 'found':246,974 'frame':451 'framework':1011 'full':852 'function':242 'gate':847,908 'generic':788,1240 'give':127 'good':1022 'great':32,1242 'group':1276,1286 'growth':1258 'gtm':201 'guidanc':213,442 'gut':1135 'hard':256 'have':463 'help':89,1105 'high':65,218,1035 'high-risk':217 'high-sign':64,1034 'highest':725 'highest-sign':724 'hire':8,29,94,135,198,202,424,825,827,945,972,1102 'hire/no-hire':785,1000,1122 'home':114,401,659,1203 'hour':687,1207,1223 'identifi':468 'improvis':515 'in-chat':361 'in-context':603 'includ':287,437,917,948 'independ':1283 'info':306,344 'input':237,452,517,585,662,730,796,851,1037 'insuffici':1014 'intak':449 'integr':70 'intern':299,485 'internal-on':298,484 'interview':19,21,43,105,169,176,277,281,288,598,887,930,1039,1058,1062,1076,1089,1185 'ip':1217 'ip/confidentiality':698 'issu':221 'job':54,157,162,672 'job-relev':53,671 'justifi':1149 'keep':1219 'kit':411,1007 'label':334 'labor':1198 'later':516 'least':1032 'legal':226 'legal/hr':211 'level':241,460 'like':535 'limit':901 'list':492,1052 'log':387,584,614,627,799,959 'longer':1227 'longitudin':608 'look':534 'loop':1063 'loudest':1270 'made':1153 'make':4 'map':172 'markdown':360 'match':836 'materi':274 'mean':33 'measur':566 'memo':102,426,809,829,947,961 'middl':1281 'minim':1048 'minimum':238 'miss':305,326 'missing-info':304 'mitig':85,429,795,818,952 'mix':1184 'ml':249 'mode':1133 'multipl':421 'must':462,1155 'must-hav':461 'need':148,165,178,192,210,229,473 'negoti':180,189,231,235,270,1111,1116 'negotiating-off':188,234,1115 'next':434,868,921 'non':269 'non-negoti':268 'normal':390 'note':106,278,324,341,414,587,775,931,1040 'object':717 'observ':537 
'offer':182,190,236,1097,1109,1117,1126 'onboard':819 'one':392,1033 'open':432,866,919 'outcom':153 'output':303,351,488,557,623,664,700,773,826,888 'outreach':412,733 'over-weight':1170 'overal':1160 'ownership':1218 'pack':358,850,854,893 'paid':59,116,291,402,660,694,980 'partial':631 'pattern':1130 'pay':1225 'perform':609,710 'period':747 'person':1020 'pii':483 'pipelin':199 'plan':122,404,582,588,625,820,1050 'plan/results':1043 'plus':985 'pm':245,935,1069 'prais':789,1241 'predict':708 'prefer':692 'priorit':546,738,1248 'probe':1254 'process':1139 'produc':75,353,1120 'propos':327,817,872,1046 'provid':1236 'qualiti':846,907 'question':171,312,433,735,867,920,1066,1090 'rate':381 'raw':547 're':971 'readi':79 'real':690,709,1209 'recenc':1165 'recent':1175 'recommend':80,132,427,793,835,1123,1161 'record':481 'red':271,377,541,840 'redact':347 'redirect':1056,1072,1095,1113 'refer':13,47,67,107,125,293,409,480,607,640,722,731,739,774,776,803,880,969,987,1006,1045,1180,1230,1237,1252 'references/checklists.md':857,858,911,912 'references/intake.md':314,315,455,456 'references/rubric.md':862,863,914,915 'references/templates.md':439,440 'references/workflow.md':443,444 'relev':55,673 'request':338,368 'requir':239,909,1029,1282 'resembl':689 'resolv':878 'respons':596,1028,1071,1112 'result':802 'resum':1027 'resume/portfolio':276 'retrospect':1147 'right':40 'risk':83,219,428,431,794,865,904,918,949,964 'role':36,152,240,372,459,518,556,663,1070 'rubric':121,405,680,705,984,1005 'rubric/values':521 'rule':297 'run':20,63,124,721,856,1266 'sale':196,207 'sampl':12,57,111,398,478,602,644,656,801,968,1042,1195 'sample/take-home':290 'sample/trial':702 'samples/trials':45 'scope':24,881 'score':120,679,704,718,860,1284 'scorecard':10,99,380,561,797,939,954,1142 'script':413,989,1092 'secret':339 'see':15 'seek':1263 'senior':244,667,934 'sensit':343 'set':331 'shareabl':302 'show':991 
'signal':66,283,386,389,470,507,581,594,613,624,626,636,726,786,792,798,876,941,958,997,1015,1036,1082,1168,1176,1186,1247 'similar':750 'singl':612,635 'skill':223,1078,1119 'skill-evaluating-candidates' 'smallest':874 'snapshot':490 'source-liqiongyu' 'specif':754,883,1256 'staff':248,764 'stage':257,589 'step':435,447,869,922 'strategi':203,232,307,1127 'strengths/limits':759 'strong':531,1179 'structur':168,1038,1061,1107 'success':373 'suffici':897 'suggest':951 'summar':940 'summari':130,350,416,777,779,804 'synthes':791 'synthesi':1010 'tabl':393,628 'take':113,400,658,1202,1205 'take-hom':112,399,657,1201 'target':732,879 'task':674,683,707,982,1220 'team':197,208 'tell':1016 'templat':438 'theater':1231 'time':320,758 'timelin':295,467 'topic-agent-skills' 'topic-ai-agents' 'topic-automation' 'topic-claude' 'topic-codex' 'topic-prompt-engineering' 'topic-refoundai' 'topic-skillpack' 'treat':1239 'trial':60,117,292,403,479,661,695,882,981,1003,1228 'turn':42 'uncertain':871 'unclear':1216 'unpaid':1197 'use':88,146,159,173,187,204,233,407,505,649,910,1187,1208 'user':453 've':1099 'veloc':259 'verbatim':767 'via':567 'vibe':51,572 'voic':1271 'vs':301,472,550 'want':285 'weak':533 'weight':376,647,838,1169,1172,1189 'well':729 'whether':92,544 'win':1272 'without':576,1212 'work':11,44,56,110,289,397,477,601,643,655,691,701,741,800,967,1041,1194,1244 'workflow':445 'would':762 'write':155,161,806,943 'writing-job-descript':160 'written':1163 
'year':551","prices":[{"id":"52eb200d-1d35-4db2-9389-15481b0c571f","listingId":"ea43b68d-9459-4b15-bb3b-7f85c78e9943","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"liqiongyu","category":"lenny_skills_plus","install_from":"skills.sh"},"createdAt":"2026-04-18T22:16:32.900Z"}],"sources":[{"listingId":"ea43b68d-9459-4b15-bb3b-7f85c78e9943","source":"github","sourceId":"liqiongyu/lenny_skills_plus/evaluating-candidates","sourceUrl":"https://github.com/liqiongyu/lenny_skills_plus/tree/main/skills/evaluating-candidates","isPrimary":false,"firstSeenAt":"2026-04-18T22:16:32.900Z","lastSeenAt":"2026-04-22T00:56:22.212Z"}],"details":{"listingId":"ea43b68d-9459-4b15-bb3b-7f85c78e9943","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"liqiongyu","slug":"evaluating-candidates","github":{"repo":"liqiongyu/lenny_skills_plus","stars":49,"topics":["agent-skills","ai-agents","automation","claude","codex","prompt-engineering","refoundai","skillpack"],"license":"apache-2.0","html_url":"https://github.com/liqiongyu/lenny_skills_plus","pushed_at":"2026-04-04T06:30:11Z","description":"86 agent-executable skill packs converted from RefoundAI’s Lenny skills (unofficial). Works with Codex + Claude Code.","skill_md_sha":"02346539c019dec044dc891b615850682e495a38","skill_md_path":"skills/evaluating-candidates/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/liqiongyu/lenny_skills_plus/tree/main/skills/evaluating-candidates"},"layout":"multi","source":"github","category":"lenny_skills_plus","frontmatter":{"name":"evaluating-candidates","description":"Make evidence-based hiring decisions: scorecards, work samples, reference checks. 
See also: conducting-interviews (run interviews)."},"skills_sh_url":"https://skills.sh/liqiongyu/lenny_skills_plus/evaluating-candidates"},"updatedAt":"2026-04-22T00:56:22.212Z"}}