
evaluating-candidates

Make evidence-based hiring decisions: scorecards, work samples, reference checks. See also: conducting-interviews (run interviews).

Price: free
Protocol: skill
Verified: no

What it does

Evaluating Candidates

Scope

Covers

  • Defining an explicit hiring bar (what “great” means for this role at this company, right now)
  • Turning interviews, work samples/trials, and references into evidence, not vibes
  • Designing job-relevant work samples (and paid trials when appropriate)
  • Running high-signal reference checks and integrating them into the decision
  • Producing a decision-ready recommendation with clear risks and mitigations

When to use

  • “Help me decide whether to hire this candidate.”
  • “Create a scorecard and decision memo based on interview notes + references.”
  • “Design a work sample / take-home (or paid trial) and a scoring rubric.”
  • “Plan and run reference checks; give me a summary and recommendation.”
  • “Calibrate our hiring bar for a <role> and compare candidates fairly.”

When NOT to use

  • You need to define the role outcomes or write the job description (use writing-job-descriptions)
  • You need to design/run structured interviews and question maps (use conducting-interviews)
  • You need offer strategy: structuring compensation, negotiating, or closing a candidate (use negotiating-offers)
  • You need to build a sales team hiring pipeline or GTM hiring strategy (use building-sales-team)
  • You need legal/HR compliance guidance or to adjudicate high-risk employment issues (this skill is not legal advice)

Inputs

Minimum required

  • Role + level + function (e.g., “Senior PM”, “Founding AE”, “Staff ML Engineer”)
  • Company/team context and “what’s hard” (stage, constraints, velocity expectations)
  • Evaluation criteria (4–8 competencies) and any non-negotiables / red flags
  • Candidate materials available (resume/portfolio + interview notes, if already interviewed)
  • Which signals you want to include: interviews, work sample/take-home, paid trial, references
  • Constraints: timeline, confidentiality/PII rules, internal-only vs shareable output

Missing-info strategy

  • Ask up to 5 questions from references/INTAKE.md (3–5 at a time).
  • If criteria or notes are missing, propose a default criteria set and clearly label assumptions.
  • Do not request secrets. If notes contain sensitive info, ask for redacted excerpts or summaries.

Outputs (deliverables)

Produce a Candidate Evaluation Decision Pack in Markdown (in-chat; or as files if requested):

  1. Evaluation brief (role success definition, criteria, weights, red flags)
  2. Scorecard (rating anchors + evidence capture)
  3. Signal log (all signals normalized into one table with evidence)
  4. Work sample / take-home / paid trial plan + rubric (if used)
  5. Reference check kit (outreach, script, note form, summary)
  6. Candidate comparison (if multiple candidates)
  7. Hiring decision memo (recommendation + risks + mitigations)
  8. Risks / Open questions / Next steps (always included)

Templates: references/TEMPLATES.md
Expanded guidance: references/WORKFLOW.md

Workflow (7 steps)

1) Intake + decision framing

  • Inputs: user context; references/INTAKE.md.
  • Actions: Confirm role, level, must-haves, and the decision timeline. Identify which signals exist vs need to be created (work sample, trial, references). Record constraints (PII, internal-only, fairness).
  • Outputs: Context snapshot + assumptions/unknowns list.
  • Checks: The decision and decision date are explicit (who decides, by when, using which signals).

2) Define the bar + criteria (don’t improvise later)

  • Inputs: role context; existing rubric/values (if any).
  • Actions: Choose 4–8 criteria; define what “strong / acceptable / weak” looks like with observable anchors (see the sketch after this list). Add explicit red flags. Decide whether to prioritize raw ability + drive vs “years of experience” for this role.
  • Outputs: Evaluation brief + draft scorecard.
  • Checks: Every criterion is measurable via evidence; no criterion is “vibe” or “culture fit” without definition.
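
A minimal sketch of what this step's outputs can look like as data, using the 1–4 anchor scale assumed throughout this pack. Criterion names, weights, anchor text, and red flags are illustrative examples, not defaults from references/TEMPLATES.md:

```python
# Hypothetical scorecard: weighted criteria with observable 1-4 anchors.
# All names, weights, and anchor text are examples, not skill defaults.
SCORECARD = {
    "product_judgment": {
        "weight": 0.40,
        "anchors": {
            4: "Reframes the problem; cites evidence for every trade-off",
            3: "Sound prioritization with explicit trade-offs",
            2: "Reasonable calls but little supporting evidence",
            1: "Decisions by opinion or deference to authority",
        },
    },
    "stakeholder_communication": {
        "weight": 0.35,
        "anchors": {
            4: "Concrete story of changing an exec's mind with data",
            3: "Clear, structured updates; handles pushback well",
            2: "Communicates adequately but avoids conflict",
            1: "Vague, defensive, or blame-shifting examples",
        },
    },
    "drive": {
        "weight": 0.25,
        "anchors": {
            4: "Repeatedly shipped beyond scope without being asked",
            3: "Owns outcomes end to end",
            2: "Delivers assigned work reliably",
            1: "Waits for direction; stalls on ambiguity",
        },
    },
}

RED_FLAGS = [
    "Cannot describe a failure or what changed afterward",
    "Takes credit for team outcomes without specifics",
]

# Fix weights before any candidate is scored (see anti-pattern 2).
assert abs(sum(c["weight"] for c in SCORECARD.values()) - 1.0) < 1e-9
```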

3) Build the signal plan + evidence log

  • Inputs: existing notes; planned stages.
  • Actions: Decide what each signal is responsible for (interviews = behavioral evidence; work sample = in-context execution; references = longitudinal performance). Create a single signal log so you can compare apples-to-apples (a sketch follows this list).
  • Outputs: Signal plan + signal log table (empty or partially filled).
  • Checks: No single signal dominates by default; reference checks and work samples have defined weight when used.
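
One way to keep signals comparable, sketched under the scorecard structure above. Field names here are assumptions, not the schema in references/TEMPLATES.md:

```python
# Illustrative signal log: one row per piece of evidence, normalized onto
# the shared 1-4 anchor scale so interviews, work samples, and references
# can be compared in a single table.
SIGNAL_LOG = [
    {
        "source": "interview:behavioral",  # interview | work_sample | reference
        "criterion": "product_judgment",
        "rating": 3,
        "evidence": "Walked through a launch-delay decision with usage data",
        "rater": "interviewer A",
    },
    {
        "source": "reference:former_manager",
        "criterion": "drive",
        "rating": 4,
        "evidence": "'Rebuilt the roadmap unprompted in the first month'",
        "rater": "hiring manager",
    },
]
```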

4) Design (or evaluate) the work sample / take-home / paid trial

  • Inputs: role outputs; constraints; candidate seniority.
  • Actions: Create a job-relevant task with clear deliverables and scoring rubric. If the task is >2–3 hours or resembles real work, prefer a paid trial and clarify IP/confidentiality boundaries (see the guardrail sketch after this list).
  • Outputs: Work sample/trial brief + scoring rubric.
  • Checks: Task predicts real performance, is fair across backgrounds, and has objective scoring anchors.
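
A hedged sketch of this step's guardrails: the 3-hour threshold mirrors anti-pattern 3, and the function and rubric names are hypothetical:

```python
# Convert long or real-work take-homes into paid trials with explicit
# IP/confidentiality terms, per the guidance above.
def requires_paid_trial(estimated_hours: float, resembles_real_work: bool) -> bool:
    return estimated_hours > 3 or resembles_real_work

# Score trial deliverables on the same 1-4 anchor scale as the interview
# scorecard so results merge cleanly into the signal log.
TRIAL_RUBRIC = {
    "problem_framing":   "4 = identified the core constraint unprompted",
    "execution_quality": "4 = shippable with minor review",
    "communication":     "4 = write-up a stakeholder could act on directly",
}
```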

5) Run reference checks (highest-signal when done well)

  • Inputs: reference targets; outreach constraints; question bank.
  • Actions: Prioritize references who worked with the candidate for extended periods and in similar contexts. Ask for specific examples, deltas over time, strengths/limits, and “how would you staff them?” Capture verbatim evidence and calibrate for bias.
  • Outputs: Reference notes + reference summary.
  • Checks: Summary contains concrete examples and clear hire/no-hire signal, not generic praise.

6) Synthesize signals → recommendation + risk mitigation

  • Inputs: scorecard, signal log, work sample results, reference summary.
  • Actions: Write a decision memo that cites evidence, calls out disagreements/uncertainty, and proposes mitigations (onboarding plan, coaching, 30/60/90 checkpoints) if hiring; a synthesis sketch follows this list.
  • Outputs: Hiring decision memo + candidate comparison (if applicable).
  • Checks: Recommendation matches the weighted evidence; red flags are explicitly addressed.
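
At its core, synthesis is a weighted roll-up of per-criterion evidence with disagreement surfaced rather than averaged away. A minimal sketch, assuming the SCORECARD and SIGNAL_LOG structures sketched in steps 2–3:

```python
from statistics import mean, pstdev

def synthesize(scorecard: dict, signal_log: list[dict]) -> dict:
    """Weighted overall score plus a per-criterion disagreement flag."""
    summary, overall = {}, 0.0
    for name, crit in scorecard.items():
        ratings = [row["rating"] for row in signal_log if row["criterion"] == name]
        if not ratings:
            # Missing evidence understates the weighted total; treat it as
            # an open question, not a zero.
            summary[name] = {"score": None, "note": "no evidence; collect more signal"}
            continue
        summary[name] = {
            "score": round(mean(ratings), 2),
            # High spread means raters disagree: discuss it, don't average it away.
            "disputed": pstdev(ratings) >= 1.0,
        }
        overall += crit["weight"] * mean(ratings)
    summary["overall_weighted"] = round(overall, 2)
    return summary
```

Disputed criteria are what the debrief should focus on, with independent scores captured before discussion (see anti-pattern 5).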

7) Quality gate + calibration + finalize pack

  • Inputs: full draft pack.
  • Actions: Run references/CHECKLISTS.md and score with references/RUBRIC.md. Add Risks / Open questions / Next steps. If uncertain, propose the smallest additional signal to resolve (targeted reference, scoped trial, specific follow-up interview).
  • Outputs: Final Candidate Evaluation Decision Pack.
  • Checks: Evidence is sufficient for the decision; limitations and fairness risks are explicit.

Quality gate (required)

Before delivering the pack, run references/CHECKLISTS.md and score the draft against references/RUBRIC.md (step 7). Do not finalize a recommendation until every criterion has cited evidence and red flags are explicitly addressed.

Examples

Example 1 (final decision): “Here are interview notes for a Senior PM candidate. Create a scorecard, summarize signals, and write a hiring decision memo. Include risks and suggested mitigations.”
Expected: scorecard with anchors + evidence, signal log, decision memo with explicit risks.

Example 2 (work sample + references): “We’re hiring a Founding Engineer. Design a 2-day paid trial task and rubric, plus a reference check script. Then show how we should combine those signals into a hire/no-hire decision.”
Expected: trial brief + rubric, reference kit, and a synthesis framework.

Boundary example (insufficient signal): “Tell me if this person is good. I only have their resume.” Response: require criteria + at least one high-signal input (structured interview notes, work sample plan/results, or references); propose a minimal evaluation plan and list assumptions/unknowns.

Boundary example (redirect to interviews): “Design a structured interview loop with behavioral questions for this PM role.” Response: redirect to conducting-interviews — this skill evaluates candidates after signals are collected; it does not design interview questions or scripts.

Boundary example (redirect to offers): “We've decided to hire this candidate. Help me structure the offer and negotiate.” Response: redirect to negotiating-offers — this skill produces the hire/no-hire recommendation, not the offer strategy.

Anti-patterns (common failure modes)

  1. “Gut feel” disguised as process — Having a scorecard but filling it in retrospectively to justify a decision already made. Evidence must be captured before the overall recommendation is written.
  2. Recency bias in signal weighting — Over-weighting the most recent signal (e.g., a strong reference) while discounting earlier mixed interview signals. Use explicit weights defined before evaluation.
  3. Work sample as unpaid labor — Designing a take-home that takes 8+ hours, uses real company data without compensation, or has unclear IP ownership. Keep tasks under 3 hours or pay for longer trials.
  4. Reference theater — Accepting only candidate-provided references and treating generic praise (“great to work with”) as signal. Prioritize back-channel references and probe for specific examples + growth areas.
  5. Consensus-seeking over evidence — Running debriefs where the loudest voice wins or where the group converges on a comfortable middle. Require independent scoring before group discussion.

Capabilities

skill · source-liqiongyu · skill-evaluating-candidates · topic-agent-skills · topic-ai-agents · topic-automation · topic-claude · topic-codex · topic-prompt-engineering · topic-refoundai · topic-skillpack


Quality

0.47 / 1.00

Deterministic score 0.47 from registry signals: indexed on GitHub (topic:agent-skills) · 49 GitHub stars · SKILL.md body (9,431 chars)

Provenance

Indexed from: github
Enriched: 2026-04-22 00:56:22Z · deterministic:skill-github:v1 · v1
First seen: 2026-04-18
Last seen: 2026-04-22
