{"id":"b8d50dc1-7233-44bf-8549-7018f0b90d1c","shortId":"2Vnz42","kind":"skill","title":"evaluating-new-technology","tagline":"Create a Technology Evaluation Pack (problem framing, options matrix, build vs buy, pilot plan, decision memo). See also: evaluating-trade-offs (general decisions).","description":"# Evaluating New Technology\n\n## Scope\n\n**Covers**\n- Evaluating a new tool/platform/vendor (including AI products) for adoption\n- Emerging tech “should we use this?” decisions\n- Build vs buy decisions and tech stack changes\n- Running a proof-of-value pilot and capturing evidence\n- First-pass risk review (security/privacy/compliance, vendor claims, operational readiness)\n\n**When to use**\n- “Evaluate this new AI tool/vendor for our team.”\n- “Should we build this in-house or buy a vendor?”\n- “We’re considering changing our analytics/experimentation stack—make a recommendation.”\n- “Create a technology evaluation doc with a pilot plan, risks, and decision memo.”\n\n**When NOT to use**\n- You don’t have a real problem/job to solve yet (use `problem-definition` first).\n- You need a full product strategy/roadmap (use `ai-product-strategy`).\n- You’re designing how to build an LLM system (use `building-with-llms`).\n- You need a formal security assessment / penetration testing (engage security; this skill produces a structured first pass).\n- You are weighing trade-offs within an existing design or architecture decision (use `evaluating-trade-offs`).\n- You need to manage or pay down existing technical debt rather than adopt something new (use `managing-tech-debt`).\n- You already chose the technology and need an implementation/migration plan (use `managing-tech-debt` or `platform-infrastructure`).\n\n## Inputs\n\n**Minimum required**\n- Candidate technology (what it is, vendor/build option, links if available)\n- Problem/workflow to improve + who it’s for\n- Current approach/stack and what’s not working\n- Constraints: data sensitivity, privacy/compliance, budget, timeline, regions, deployment model (SaaS/on-prem)\n- Decision context: who decides, adoption scope, risk tolerance\n\n**Missing-info strategy**\n- Ask up to 5 questions from [references/INTAKE.md](references/INTAKE.md) (3–5 at a time).\n- If still missing, proceed with explicit assumptions and present 2–3 options (e.g., buy vs build vs defer).\n- Do not request secrets. If asked to run tools, change production systems, or sign up for vendors, require explicit confirmation.\n\n## Outputs (deliverables)\n\nProduce a **Technology Evaluation Pack** (in chat; or as files if requested), in this order:\n\n1) **Evaluation brief** (problem, stakeholders, decision, constraints, non-goals, assumptions)\n2) **Options & criteria matrix** (status quo + alternatives, criteria, scoring, notes)\n3) **Build vs buy analysis** (bandwidth/TCO, core competency, opportunity cost, lock-in)\n4) **Pilot (proof-of-value) plan** (hypotheses, scope, metrics, timeline, exit criteria)\n5) **Risk & guardrails review** (security/privacy/compliance, vendor claims, mitigations)\n6) **Decision memo** (recommendation, rationale, trade-offs, adoption/rollback plan)\n7) **Risks / Open questions / Next steps** (always included)\n\nTemplates: [references/TEMPLATES.md](references/TEMPLATES.md)\n\n## Workflow (8 steps)\n\n### 1) Start with the problem (avoid tool bias)\n- **Inputs:** Candidate tech, target workflow/users, current pain.\n- **Actions:** Write a one-sentence problem statement and “who feels it.” List 3–5 symptoms and 3–5 non-goals.\n- **Outputs:** Draft **Evaluation brief** (problem + non-goals).\n- **Checks:** You can explain the decision without naming the tool.\n\n### 2) Define “good” and hard constraints\n- **Inputs:** Success metrics, constraints, risk tolerance, decision deadline.\n- **Actions:** Define success metrics (leading + lagging) and must-have constraints (privacy, compliance, security, uptime, latency/cost if relevant). Capture “deal breakers.”\n- **Outputs:** **Evaluation brief** (success + constraints + deal breakers).\n- **Checks:** A stakeholder can say what would make this a clear “yes” or “no.”\n\n### 3) Map options and evaluation criteria (workflows → ROI)\n- **Inputs:** Current stack, alternatives, stakeholders.\n- **Actions:** List options: status quo, 1–3 vendors, build, hybrid. Define criteria anchored to workflows enabled and ROI (time saved, revenue impact, risk reduction), not feature checklists.\n- **Outputs:** **Options & criteria matrix**.\n- **Checks:** Every criterion is measurable or at least falsifiable in a pilot.\n\n### 4) Fast reality check: integration + data fit\n- **Inputs:** Architecture constraints, data sources, integration points.\n- **Actions:** Identify required integrations (SSO, data pipelines, APIs, logs). Note migration complexity, data ownership, and export/exit path. For PLG/growth tools, sanity-check the stack layers (data hub → analytics → lifecycle).\n- **Outputs:** Notes added to **Options & criteria matrix** (integration complexity + stack fit).\n- **Checks:** You can describe the end-to-end data/control flow in 5–10 bullets.\n\n### 5) Build vs buy with “bandwidth” as a first-class cost\n- **Inputs:** Engineering capacity, core competencies, opportunity cost.\n- **Actions:** Compare build vs buy using a bandwidth/TCO ledger (build time, maintenance, on-call, upgrades, vendor management). Prefer building only when it’s a core differentiator or the vendor market is immature/unacceptable.\n- **Outputs:** **Build vs buy analysis**.\n- **Checks:** The analysis includes opportunity cost and who would maintain the system 12 months from now.\n\n### 6) Risk & guardrails review (be skeptical of “100% safe” claims)\n- **Inputs:** Data sensitivity, threat model, vendor posture, deployment model.\n- **Actions:** Identify key risks (security, privacy, compliance, reliability, lock-in). For AI vendors: treat “guardrails catch everything” claims as marketing; assume determined attackers exist and design defense-in-depth (permissions, logging, human approval points, eval/red-team).\n- **Outputs:** **Risk & guardrails review**.\n- **Checks:** Each top risk has an owner and a mitigation or a “blocker” label.\n\n### 7) Plan a proof-of-value pilot (or document why you can skip it)\n- **Inputs:** Criteria, risks, timeline, stakeholders.\n- **Actions:** Define pilot hypotheses, scope, success metrics, test dataset, and evaluation method. Specify timeline, resourcing, and exit criteria (adopt / iterate / reject). Include rollback and data deletion requirements.\n- **Outputs:** **Pilot plan**.\n- **Checks:** A team can run the pilot without extra meetings; success/failure is unambiguous.\n\n### 8) Decide, communicate, and quality-gate\n- **Inputs:** Completed pack drafts.\n- **Actions:** Write the **Decision memo** with recommendation, trade-offs, and adoption plan. Run [references/CHECKLISTS.md](references/CHECKLISTS.md) and score with [references/RUBRIC.md](references/RUBRIC.md). Always include **Risks / Open questions / Next steps**.\n- **Outputs:** Final **Technology Evaluation Pack**.\n- **Checks:** Decision is actionable (owner, date, next actions) and reversible where possible.\n\n## Anti-patterns (common failure modes)\n\n1. **\"Shiny object\" framing** — Starting from \"we should use X\" instead of \"we need to solve Y.\" The evaluation becomes a justification exercise rather than a genuine comparison. Always start with the problem statement.\n2. **Feature-checklist scoring** — Evaluating options by counting features rather than mapping to workflows and ROI. Result: the tool with the most checkboxes wins even if it solves the wrong problem.\n3. **Skipping the status quo option** — Failing to include \"do nothing / improve current approach\" as a baseline. Result: you cannot prove the new tool is actually better than incremental improvement.\n4. **Vendor demo as evidence** — Treating a curated vendor demo as proof of production fitness. Result: the pilot discovers integration, performance, or data-quality issues that the demo hid.\n5. **No exit plan** — Adopting a tool without evaluating lock-in, data export, or migration cost. Result: switching cost grows silently until the team is trapped.\n\n## Quality gate (required)\n- Use [references/CHECKLISTS.md](references/CHECKLISTS.md) and [references/RUBRIC.md](references/RUBRIC.md).\n- Always include: **Risks**, **Open questions**, **Next steps**.\n\n## Examples\n\n**Example 1 (AI vendor):** “Use `evaluating-new-technology` to evaluate an AI ‘prompt guardrails’ vendor for our support agent. Constraints: SOC2 required, PII present, must support SSO, budget $50k/yr, decision in 3 weeks.”  \nExpected: evaluation pack that treats guardrail claims skeptically and proposes defense-in-depth + a measurable pilot.\n\n**Example 2 (analytics stack):** “Use `evaluating-new-technology` to choose between PostHog and Amplitude for our PLG product. Current stack: Segment + data warehouse; goal is faster iteration on onboarding and activation.”  \nExpected: options matrix + pilot plan tied to workflows (experiments, funnels, lifecycle triggers) and migration effort.\n\n**Boundary example (redirect — no problem defined):** “What’s the best new AI tool we should adopt?”\nResponse: Out of scope without a problem/workflow. Ask intake questions and/or propose running `problem-definition` first to identify the job-to-be-done before evaluating tools.\n\n**Boundary example (redirect — neighbor skill):** “We already picked Datadog; now help us plan the migration from our current monitoring stack.”\nResponse: This is a migration/tech-debt execution task, not an evaluation. Redirect to `managing-tech-debt` for a migration plan with milestones, rollback, and decommission steps.","tags":["evaluating","new","technology","lenny","skills","plus","liqiongyu","agent-skills","ai-agents","automation","claude","codex"],"capabilities":["skill","source-liqiongyu","skill-evaluating-new-technology","topic-agent-skills","topic-ai-agents","topic-automation","topic-claude","topic-codex","topic-prompt-engineering","topic-refoundai","topic-skillpack"],"categories":["lenny_skills_plus"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/liqiongyu/lenny_skills_plus/evaluating-new-technology","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add liqiongyu/lenny_skills_plus","source_repo":"https://github.com/liqiongyu/lenny_skills_plus","install_from":"skills.sh"}},"qualityScore":"0.474","qualityRationale":"deterministic score 0.47 from registry signals: · indexed on github topic:agent-skills · 49 github stars · SKILL.md body (9,321 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-04-22T00:56:22.322Z","embedding":null,"createdAt":"2026-04-18T22:16:33.646Z","updatedAt":"2026-04-22T00:56:22.322Z","lastSeenAt":"2026-04-22T00:56:22.322Z","tsv":"'1':358,437,566,946,1117 '10':672 '100':754 '12':743 '2':312,369,492,980,1168 '3':298,313,379,465,469,548,567,1012,1148 '4':392,604,1042 '5':293,299,405,466,470,671,674,1072 '50k/yr':1145 '6':413,747 '7':423,821 '8':435,884 'action':452,506,561,618,693,766,841,895,931,935 'activ':1198 'actual':1037 'ad':650 'adopt':42,214,282,859,906,1076,1229 'adoption/rollback':421 'agent':1135 'ai':39,84,150,778,1118,1128,1225 'ai-product-strategi':149 'alreadi':223,1264 'also':22 'altern':375,559 'alway':429,916,974,1108 'amplitud':1181 'analysi':383,730,733 'analyt':646,1169 'analytics/experimentation':105 'anchor':573 'and/or':1240 'anti':941 'anti-pattern':940 'api':625 'approach':1025 'approach/stack':262 'approv':800 'architectur':195,612 'ask':290,326,1237 'assess':172 'assum':787 'assumpt':309,368 'attack':789 'avail':253 'avoid':442 'bandwidth':679 'bandwidth/tco':384,700 'baselin':1028 'becom':965 'best':1223 'better':1038 'bias':444 'blocker':819 'boundari':1214,1258 'breaker':526,533 'brief':360,477,529 'budget':272,1144 'build':14,50,91,158,164,318,380,569,675,695,702,712,727 'building-with-llm':163 'bullet':673 'buy':16,52,97,316,382,677,697,729 'call':707 'candid':244,446 'cannot':1031 'capac':688 'captur':66,524 'catch':782 'chang':57,103,330 'chat':349 'check':482,534,592,607,640,659,731,807,871,928 'checkbox':1003 'checklist':587,983 'choos':1177 'chose':224 'claim':75,411,756,784,1156 'class':684 'clear':544 'common':943 'communic':886 'compar':694 'comparison':973 'compet':386,690 'complet':892 'complex':629,656 'complianc':518,772 'confirm':340 'consid':102 'constraint':268,364,497,501,516,531,613,1136 'context':279 'core':385,689,718 'cost':388,685,692,736,1088,1091 'count':988 'cover':33 'creat':5,110 'criteria':371,376,404,553,572,590,653,837,858 'criterion':594 'curat':1049 'current':261,450,557,1024,1186,1275 'data':269,609,614,623,630,644,758,865,1065,1084,1189 'data-qu':1064 'data/control':668 'datadog':1266 'dataset':849 'date':933 'deadlin':505 'deal':525,532 'debt':211,221,236,1293 'decid':281,885 'decis':19,28,49,53,121,196,278,363,414,487,504,898,929,1146 'decommiss':1302 'defens':794,1161 'defense-in-depth':793,1160 'defer':320 'defin':493,507,571,842,1219 'definit':140,1245 'delet':866 'deliver':342 'demo':1044,1051,1070 'deploy':275,764 'depth':796,1163 'describ':662 'design':155,193,792 'determin':788 'differenti':719 'discov':1060 'doc':114 'document':830 'done':1254 'draft':475,894 'e.g':315 'effort':1213 'emerg':43 'enabl':576 'end':665,667 'end-to-end':664 'engag':175 'engin':687 'eval/red-team':802 'evalu':2,8,24,29,34,81,113,199,346,359,476,528,552,851,926,964,985,1080,1122,1126,1151,1173,1256,1287 'evaluating-new-technolog':1,1121,1172 'evaluating-trade-off':23,198 'even':1005 'everi':593 'everyth':783 'evid':67,1046 'exampl':1115,1116,1167,1215,1259 'execut':1283 'exercis':968 'exist':192,209,790 'exit':403,857,1074 'expect':1150,1199 'experi':1207 'explain':485 'explicit':308,339 'export':1085 'export/exit':633 'extra':879 'fail':1018 'failur':944 'falsifi':600 'fast':605 'faster':1193 'featur':586,982,989 'feature-checklist':981 'feel':462 'file':352 'final':924 'first':69,141,182,683,1246 'first-class':682 'first-pass':68 'fit':610,658,1056 'flow':669 'formal':170 'frame':11,949 'full':145 'funnel':1208 'gate':890,1100 'general':27 'genuin':972 'goal':367,473,481,1191 'good':494 'grow':1092 'guardrail':407,749,781,805,1130,1155 'hard':496 'help':1268 'hid':1071 'hous':95 'hub':645 'human':799 'hybrid':570 'hypothes':399,844 'identifi':619,767,1248 'immature/unacceptable':725 'impact':582 'implementation/migration':230 'improv':256,1023,1041 'in-hous':93 'includ':38,430,734,862,917,1020,1109 'increment':1040 'info':288 'infrastructur':240 'input':241,445,498,556,611,686,757,836,891 'instead':956 'intak':1238 'integr':608,616,621,655,1061 'issu':1067 'iter':860,1194 'job':1251 'job-to-be-don':1250 'justif':967 'key':768 'label':820 'lag':511 'latency/cost':521 'layer':643 'lead':510 'least':599 'ledger':701 'lifecycl':647,1209 'link':251 'list':464,562 'llm':160 'llms':166 'lock':390,775,1082 'lock-in':389,774,1081 'log':626,798 'maintain':740 'mainten':704 'make':107,541 'manag':205,219,234,710,1291 'managing-tech-debt':218,233,1290 'map':549,992 'market':723,786 'matrix':13,372,591,654,1201 'measur':596,1165 'meet':880 'memo':20,122,415,899 'method':852 'metric':401,500,509,847 'migrat':628,1087,1212,1272,1296 'migration/tech-debt':1282 'mileston':1299 'minimum':242 'miss':287,305 'missing-info':286 'mitig':412,816 'mode':945 'model':276,761,765 'monitor':1276 'month':744 'must':514,1141 'must-hav':513 'name':489 'need':143,168,203,228,959 'neighbor':1261 'new':3,30,36,83,216,1034,1123,1174,1224 'next':427,921,934,1113 'non':366,472,480 'non-goal':365,471,479 'note':378,627,649 'noth':1022 'object':948 'off':26,189,201,420,904 'on-cal':705 'onboard':1196 'one':456 'one-sent':455 'open':425,919,1111 'oper':76 'opportun':387,691,735 'option':12,250,314,370,550,563,589,652,986,1017,1200 'order':357 'output':341,474,527,588,648,726,803,868,923 'owner':813,932 'ownership':631 'pack':9,347,893,927,1152 'pain':451 'pass':70,183 'path':634 'pattern':942 'pay':207 'penetr':173 'perform':1062 'permiss':797 'pick':1265 'pii':1139 'pilot':17,64,117,393,603,828,843,869,877,1059,1166,1202 'pipelin':624 'plan':18,118,231,398,422,822,870,907,1075,1203,1270,1297 'platform':239 'platform-infrastructur':238 'plg':1184 'plg/growth':636 'point':617,801 'possibl':939 'posthog':1179 'postur':763 'prefer':711 'present':311,1140 'privaci':517,771 'privacy/compliance':271 'problem':10,139,361,441,458,478,978,1011,1218,1244 'problem-definit':138,1243 'problem/job':133 'problem/workflow':254,1236 'proceed':306 'produc':179,343 'product':40,146,151,331,1055,1185 'prompt':1129 'proof':61,395,825,1053 'proof-of-valu':60,394,824 'propos':1159,1241 'prove':1032 'qualiti':889,1066,1099 'quality-g':888 'question':294,426,920,1112,1239 'quo':374,565,1016 'rather':212,969,990 'rational':417 're':101,154 'readi':77 'real':132 'realiti':606 'recommend':109,416,901 'redirect':1216,1260,1288 'reduct':584 'references/checklists.md':909,910,1103,1104 'references/intake.md':296,297 'references/rubric.md':914,915,1106,1107 'references/templates.md':432,433 'region':274 'reject':861 'relev':523 'reliabl':773 'request':323,354 'requir':243,338,620,867,1101,1138 'resourc':855 'respons':1230,1278 'result':997,1029,1057,1089 'revenu':581 'revers':937 'review':72,408,750,806 'risk':71,119,284,406,424,502,583,748,769,804,810,838,918,1110 'roi':555,578,996 'rollback':863,1300 'run':58,328,875,908,1242 'saas/on-prem':277 'safe':755 'saniti':639 'sanity-check':638 'save':580 'say':538 'scope':32,283,400,845,1233 'score':377,912,984 'secret':324 'secur':171,176,519,770 'security/privacy/compliance':73,409 'see':21 'segment':1188 'sensit':270,759 'sentenc':457 'shini':947 'sign':334 'silent':1093 'skeptic':752,1157 'skill':178,1262 'skill-evaluating-new-technology' 'skip':834,1013 'soc2':1137 'solv':135,961,1008 'someth':215 'sourc':615 'source-liqiongyu' 'specifi':853 'sso':622,1143 'stack':56,106,558,642,657,1170,1187,1277 'stakehold':362,536,560,840 'start':438,950,975 'statement':459,979 'status':373,564,1015 'step':428,436,922,1114,1303 'still':304 'strategi':152,289 'strategy/roadmap':147 'structur':181 'success':499,508,530,846 'success/failure':881 'support':1134,1142 'switch':1090 'symptom':467 'system':161,332,742 'target':448 'task':1284 'team':88,873,1096 'tech':44,55,220,235,447,1292 'technic':210 'technolog':4,7,31,112,226,245,345,925,1124,1175 'templat':431 'test':174,848 'threat':760 'tie':1204 'time':302,579,703 'timelin':273,402,839,854 'toler':285,503 'tool':329,443,491,637,999,1035,1078,1226,1257 'tool/platform/vendor':37 'tool/vendor':85 'top':809 'topic-agent-skills' 'topic-ai-agents' 'topic-automation' 'topic-claude' 'topic-codex' 'topic-prompt-engineering' 'topic-refoundai' 'topic-skillpack' 'trade':25,188,200,419,903 'trade-off':187,418,902 'trap':1098 'treat':780,1047,1154 'trigger':1210 'unambigu':883 'upgrad':708 'uptim':520 'us':1269 'use':47,80,126,137,148,162,197,217,232,698,954,1102,1120,1171 'valu':63,397,827 'vendor':74,99,337,410,568,709,722,762,779,1043,1050,1119,1131 'vendor/build':249 'vs':15,51,317,319,381,676,696,728 'warehous':1190 'week':1149 'weigh':186 'win':1004 'within':190 'without':488,878,1079,1234 'work':267 'workflow':434,554,575,994,1206 'workflow/users':449 'would':540,739 'write':453,896 'wrong':1010 'x':955 'y':962 'yes':545 'yet':136","prices":[{"id":"0b3f108b-7687-4381-af65-373c3af473ff","listingId":"b8d50dc1-7233-44bf-8549-7018f0b90d1c","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"liqiongyu","category":"lenny_skills_plus","install_from":"skills.sh"},"createdAt":"2026-04-18T22:16:33.646Z"}],"sources":[{"listingId":"b8d50dc1-7233-44bf-8549-7018f0b90d1c","source":"github","sourceId":"liqiongyu/lenny_skills_plus/evaluating-new-technology","sourceUrl":"https://github.com/liqiongyu/lenny_skills_plus/tree/main/skills/evaluating-new-technology","isPrimary":false,"firstSeenAt":"2026-04-18T22:16:33.646Z","lastSeenAt":"2026-04-22T00:56:22.322Z"}],"details":{"listingId":"b8d50dc1-7233-44bf-8549-7018f0b90d1c","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"liqiongyu","slug":"evaluating-new-technology","github":{"repo":"liqiongyu/lenny_skills_plus","stars":49,"topics":["agent-skills","ai-agents","automation","claude","codex","prompt-engineering","refoundai","skillpack"],"license":"apache-2.0","html_url":"https://github.com/liqiongyu/lenny_skills_plus","pushed_at":"2026-04-04T06:30:11Z","description":"86 agent-executable skill packs converted from RefoundAI’s Lenny skills (unofficial). Works with Codex + Claude Code.","skill_md_sha":"6daf3ede3bf2d6f406043b7e9d5a3245573558af","skill_md_path":"skills/evaluating-new-technology/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/liqiongyu/lenny_skills_plus/tree/main/skills/evaluating-new-technology"},"layout":"multi","source":"github","category":"lenny_skills_plus","frontmatter":{"name":"evaluating-new-technology","description":"Create a Technology Evaluation Pack (problem framing, options matrix, build vs buy, pilot plan, decision memo). See also: evaluating-trade-offs (general decisions)."},"skills_sh_url":"https://skills.sh/liqiongyu/lenny_skills_plus/evaluating-new-technology"},"updatedAt":"2026-04-22T00:56:22.322Z"}}