{"id":"24a3cd9a-b027-478a-8ec6-1006efec8b5f","shortId":"3mEtVc","kind":"skill","title":"technical-pm","tagline":"Structured technical PM framework for AI product roles. Covers: RLHF, evals, RAG, LLM deployment, system design, API design.","description":"# Technical PM Skill\n\nApply a structured framework to technical PM questions targeting AI product roles.\n\n## When to Use\n- User asks about RLHF, fine-tuning, evals, inference, model architecture\n- User asks \"Design a system that uses LLMs to X\"\n- User asks \"How would you build a RAG system for X\"\n- User asks about technical trade-offs in AI/ML systems\n- User asks about API design for AI products\n- User says `/technical-pm` followed by a question\n- Any question requiring ML/AI technical depth from a PM perspective\n\n## Context\n- **Tuned for**: AI product roles at frontier AI companies\n- **What matters**: Going deep with researchers and engineers. You don't need to implement, but you need to understand the technical landscape well enough to make informed product decisions.\n- **Common pitfall**: Hand-waving on technical details. Be specific about architectures, trade-offs, and constraints.\n\n## Framework: AI PM Technical Method (6 Sections)\n\n### Section 1: Technical Clarifications & Constraints\nBefore designing anything, scope the technical problem:\n- **Capability Assumptions**: What model capabilities are available? (reasoning, multimodal, tool use, code gen)\n- **Scale**: How many users/queries? What latency requirements?\n- **Infrastructure**: Cloud vs. on-prem? What compute budget?\n- **Data**: What training/eval data exists? Privacy constraints?\n- **Integration**: What systems does this need to plug into?\n- **Timeline**: MVP vs. production-grade?\n\n### Section 2: Users (Developer & End-User Personas)\nFor technical products, think about two user layers:\n- **Developers/Engineers**: Who builds on this? What's their skill level? What do they expect?\n- **End Users**: Who consumes the output? What quality bar do they need?\n\nFor each persona: current workflow, technical sophistication, key frustrations.\n\n### Section 3: High-Level System Design\nDraw the system architecture (describe it clearly):\n- **Data Pipeline**: How does data flow in? (user input → preprocessing → model → postprocessing → output)\n- **Model Layer**: Which model(s)? Foundation model + fine-tuned? Routing? Ensemble?\n- **Orchestration**: How are multi-step workflows managed? (agents, chains, state machines)\n- **Storage**: What needs to be persisted? (conversation history, embeddings, user preferences, model artifacts)\n- **Serving**: How is inference served? (batch vs. real-time, edge vs. cloud)\n\n**For RAG systems specifically:**\n- Document ingestion pipeline (chunking strategy, embedding model, vector DB)\n- Retrieval (similarity search, reranking, hybrid search)\n- Generation (context window management, prompt engineering, citation)\n- Evaluation (relevance, faithfulness, answer quality)\n\n**For Agent systems specifically:**\n- Tool/function calling architecture\n- Planning and reasoning loop\n- Memory (short-term working memory vs. long-term)\n- Safety/sandboxing (what can the agent actually do?)\n\n### Section 4: Deep Dive & Trade-offs\nThe interviewer will pick an area to go deep. Be prepared for:\n\n**The Latency-Cost-Quality Triangle:**\nEvery AI system has this fundamental trade-off:\n- **Latency** <-> **Quality**: Faster responses = less reasoning time, fewer model calls\n- **Cost** <-> **Quality**: Cheaper inference = smaller models, less compute per query\n- **Latency** <-> **Cost**: Real-time serving = more provisioned capacity, higher cost\n\nDiscuss specific techniques for each trade-off:\n- Latency: Streaming, caching, speculative decoding, model distillation, edge deployment\n- Cost: Batching, model routing (small model for easy queries, large for hard), quantization, spot instances\n- Quality: Chain-of-thought, self-consistency, retrieval augmentation, fine-tuning, human-in-the-loop\n\n**RLHF Pipeline** (know this end-to-end):\n1. Supervised Fine-Tuning (SFT) on high-quality demonstrations\n2. Reward Model training from human preference comparisons\n3. PPO optimization against the reward model with KL penalty\n4. RLHF alternatives: DPO (Direct Preference Optimization), RLAIF, Constitutional AI\n\n**Evals** (increasingly critical for AI PMs):\n- **What to eval**: Accuracy, safety, instruction-following, hallucination, code correctness\n- **How to eval**: Human eval, LLM-as-judge, automated benchmarks, A/B testing in production\n- **Eval pitfalls**: Benchmark contamination, Goodhart's law, distributional shift\n- **Building eval sets**: Golden datasets, adversarial examples, edge cases, domain-specific\n\n**Context Windows & Memory:**\n- Trade-offs of larger context: Cost (quadratic attention), latency, lost-in-the-middle\n- Strategies: Summarization, RAG, hierarchical memory, sliding window\n- When to use fine-tuning vs. in-context learning vs. RAG\n\n**Hallucination Detection & Mitigation:**\n- Detection: Confidence calibration, self-consistency checks, retrieval verification, citation validation\n- Mitigation: Grounding in retrieved facts, chain-of-thought transparency, abstention (model says \"I don't know\")\n- Measurement: Factual accuracy benchmarks, human annotation, automated fact-checking\n\n### Section 5: API Design & Developer Experience\nFor platform/API products, design the interface:\n- **API surface**: REST vs. streaming vs. SDK. Key endpoints.\n- **Developer journey**: Sign up → first API call → production integration\n- **Documentation**: What developers need to succeed\n- **Pricing**: Per-token, per-request, tiered, seat-based\n- **Rate limiting & quotas**: Fair usage, abuse prevention\n- **Versioning**: How to ship improvements without breaking existing users\n\n### Section 6: Metrics (Technical + Product)\n**Technical metrics:**\n- Time to First Token (TTFT)\n- Tokens Per Second (TPS)\n- Error rate (4xx, 5xx, timeout)\n- Cost per 1K tokens (input/output)\n- Model accuracy on eval suite\n- Hallucination rate\n- Safety violation rate\n\n**Product metrics:**\n- Developer activation (first API call within 7 days)\n- API adoption (monthly active developers, production integrations)\n- Quality satisfaction (developer NPS, support ticket volume)\n- Revenue (API spend, conversion to paid tiers)\n\n## Key Technical Topics to Know\n\n### Transformers & Attention\n- Self-attention mechanism, positional encoding\n- Scaling laws (Chinchilla, compute-optimal training)\n- Multi-head attention, KV cache\n\n### Training Pipeline\n- Pre-training (next token prediction on massive corpus)\n- Supervised Fine-Tuning (SFT)\n- RLHF / DPO / Constitutional AI\n- Mixture of Experts (MoE) architectures\n\n### Inference Optimization\n- Quantization (INT8, INT4, GPTQ, AWQ)\n- Speculative decoding\n- KV cache optimization\n- Batching strategies (continuous batching)\n- Model distillation (larger → smaller model)\n\n### Safety & Alignment\n- Constitutional AI\n- Red teaming and adversarial testing\n- Content filtering and classifiers\n- Responsible scaling policies\n\n### Multimodal\n- Vision-language models (image understanding)\n- Speech/audio models\n- Video understanding\n- Cross-modal retrieval\n\n## Output Format\nStructure as a technical walkthrough. Be technical but accessible — translate between researchers, engineers, and product. Whiteboard-style system diagrams described in text. Aim for ~2500 words.\n\n## Research-First Workflow\nBefore generating the answer:\n1. **Research** — Use web search to find latest technical thinking from AI leaders, engineering blogs from major AI labs, papers, benchmarks. Do 5-10 searches.\n2. **Cite sources** — Include `[linked source](url)` inline for technical claims and architecture decisions.\n3. **Display** the complete structured answer.\n\n## What Good Looks Like\n- Starts with technical scoping questions (constraints, scale, data)\n- System design is coherent and production-aware (not just academic)\n- Understands the Latency-Cost-Quality triangle deeply\n- Can explain RLHF, evals, RAG without hand-waving\n- Shows awareness of what's hard (hallucination, eval, safety)\n- Trade-off analysis is specific and quantitative\n- Connects technical decisions back to user/product impact","tags":["technical","skills","aroyburman-codes","agent-skills","claude-code","claude-skills","frameworks","metrics","pm-tools","product-management","product-strategy"],"capabilities":["skill","source-aroyburman-codes","skill-technical-pm","topic-agent-skills","topic-claude-code","topic-claude-skills","topic-frameworks","topic-metrics","topic-pm-tools","topic-product-management","topic-product-strategy"],"categories":["pm-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/aroyburman-codes/pm-skills/technical-pm","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add aroyburman-codes/pm-skills","source_repo":"https://github.com/aroyburman-codes/pm-skills","install_from":"skills.sh"}},"qualityScore":"0.453","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 6 github stars · SKILL.md body (7,915 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:14:48.206Z","embedding":null,"createdAt":"2026-05-18T13:22:17.409Z","updatedAt":"2026-05-18T19:14:48.206Z","lastSeenAt":"2026-05-18T19:14:48.206Z","tsv":"'-10':1007 '/technical-pm':92 '1':171,543,984 '1k':800 '2':234,554,1009 '2500':974 '3':285,562,1023 '4':421,572 '4xx':795 '5':715,1006 '5xx':796 '6':168,778 '7':821 'a/b':610 'abstent':697 'abus':766 'academ':1051 'access':957 'accuraci':591,706,804 'activ':816,826 'actual':418 'adopt':824 'adversari':628,923 'agent':331,393,417 'ai':9,34,88,110,115,164,446,581,586,889,919,995,1001 'ai/ml':80 'aim':972 'align':917 'altern':574 'analysi':1081 'annot':709 'answer':390,983,1028 'anyth':177 'api':20,85,716,726,740,818,823,838 'appli':25 'architectur':50,157,294,398,894,1021 'area':432 'artifact':347 'ask':41,52,62,73,83 'assumpt':183 'attent':646,850,853,867 'augment':526 'autom':608,710 'avail':188 'awar':1048,1070 'awq':901 'back':1089 'bar':271 'base':760 'batch':353,503,907,910 'benchmark':609,616,707,1004 'blog':998 'break':774 'budget':210 'build':66,251,623 'cach':495,869,905 'calibr':678 'call':397,463,741,819 'capabl':182,186 'capac':482 'case':631 'chain':332,519,693 'chain-of-thought':518,692 'cheaper':466 'check':682,713 'chinchilla':859 'chunk':368 'citat':386,685 'cite':1010 'claim':1019 'clarif':173 'classifi':928 'clear':297 'cloud':203,360 'code':193,597 'coher':1044 'common':146 'compani':116 'comparison':561 'complet':1026 'comput':209,471,861 'compute-optim':860 'confid':677 'connect':1086 'consist':524,681 'constitut':580,888,918 'constraint':162,174,217,1038 'consum':266 'contamin':617 'content':925 'context':107,381,635,643,669 'continu':909 'convers':341,840 'corpus':880 'correct':598 'cost':442,464,475,484,502,644,798,1056 'cover':12 'critic':584 'cross':944 'cross-mod':943 'current':278 'data':211,214,298,302,1040 'dataset':627 'day':822 'db':373 'decis':145,1022,1088 'decod':497,903 'deep':120,422,435 'deepli':1059 'demonstr':553 'deploy':17,501 'depth':102 'describ':295,969 'design':19,21,53,86,176,290,717,723,1042 'detail':153 'detect':674,676 'develop':236,718,735,746,815,827,832 'developers/engineers':249 'diagram':968 'direct':576 'discuss':485 'display':1024 'distil':499,912 'distribut':621 'dive':423 'document':365,744 'domain':633 'domain-specif':632 'dpo':575,887 'draw':291 'easi':509 'edg':358,500,630 'embed':343,370 'encod':856 'end':238,263,540,542 'end-to-end':539 'end-us':237 'endpoint':734 'engin':124,385,961,997 'enough':140 'ensembl':322 'error':793 'eval':14,47,582,590,601,603,614,624,806,1063,1076 'evalu':387 'everi':445 'exampl':629 'exist':215,775 'expect':262 'experi':719 'expert':892 'explain':1061 'fact':691,712 'fact-check':711 'factual':705 'fair':764 'faith':389 'faster':456 'fewer':461 'filter':926 'find':990 'fine':45,319,528,546,664,883 'fine-tun':44,318,527,545,663,882 'first':739,786,817,978 'flow':303 'follow':93,595 'format':948 'foundat':316 'framework':7,28,163 'frontier':114 'frustrat':283 'fundament':450 'gen':194 'generat':380,981 'go':119,434 'golden':626 'good':1030 'goodhart':618 'gptq':900 'grade':232 'ground':688 'hallucin':596,673,808,1075 'hand':149,1067 'hand-wav':148,1066 'hard':513,1074 'head':866 'hierarch':656 'high':287,551 'high-level':286 'high-qual':550 'higher':483 'histori':342 'human':531,559,602,708 'human-in-the-loop':530 'hybrid':378 'imag':937 'impact':1092 'implement':130 'improv':772 'in-context':667 'includ':1012 'increas':583 'infer':48,351,467,895 'inform':143 'infrastructur':202 'ingest':366 'inlin':1016 'input':306 'input/output':802 'instanc':516 'instruct':594 'instruction-follow':593 'int4':899 'int8':898 'integr':218,743,829 'interfac':725 'interview':428 'journey':736 'judg':607 'key':282,733,844 'kl':570 'know':537,703,848 'kv':868,904 'lab':1002 'landscap':138 'languag':935 'larg':511 'larger':642,913 'latenc':200,441,454,474,493,647,1055 'latency-cost-qu':440,1054 'latest':991 'law':620,858 'layer':248,312 'leader':996 'learn':670 'less':458,470 'level':258,288 'like':1032 'limit':762 'link':1013 'llm':16,605 'llm-as-judg':604 'llms':58 'long':411 'long-term':410 'look':1031 'loop':402,534 'lost':649 'lost-in-the-middl':648 'machin':334 'major':1000 'make':142 'manag':330,383 'mani':197 'massiv':879 'matter':118 'measur':704 'mechan':854 'memori':403,408,637,657 'method':167 'metric':779,783,814 'middl':652 'mitig':675,687 'mixtur':890 'ml/ai':100 'modal':945 'model':49,185,308,311,314,317,346,371,462,469,498,504,507,556,568,698,803,911,915,936,940 'moe':893 'month':825 'multi':327,865 'multi-head':864 'multi-step':326 'multimod':190,932 'mvp':228 'need':128,133,223,274,337,747 'next':875 'nps':833 'off':78,160,426,640 'on-prem':205 'optim':564,578,862,896,906 'orchestr':323 'output':268,310,947 'paid':842 'paper':1003 'penalti':571 'per':472,752,755,790,799 'per-request':754 'per-token':751 'persist':340 'persona':240,277 'perspect':106 'pick':430 'pipelin':299,367,536,871 'pitfal':147,615 'plan':399 'platform/api':721 'plug':225 'pm':3,6,23,31,105,165 'pms':587 'polici':931 'posit':855 'postprocess':309 'ppo':563 'pre':873 'pre-train':872 'predict':877 'prefer':345,560,577 'prem':207 'prepar':437 'preprocess':307 'prevent':767 'price':750 'privaci':216 'problem':181 'product':10,35,89,111,144,231,243,613,722,742,781,813,828,963,1047 'production-awar':1046 'production-grad':230 'prompt':384 'provis':481 'quadrat':645 'qualiti':270,391,443,455,465,517,552,830,1057 'quantit':1085 'quantiz':514,897 'queri':473,510 'question':32,96,98,1037 'quota':763 'rag':15,68,362,655,672,1064 'rate':761,794,809,812 'real':356,477 'real-tim':355,476 'reason':189,401,459 'red':920 'relev':388 'request':756 'requir':99,201 'rerank':377 'research':122,960,977,985 'research-first':976 'respons':457,929 'rest':728 'retriev':374,525,683,690,946 'revenu':837 'reward':555,567 'rlaif':579 'rlhf':13,43,535,573,886,1062 'role':11,36,112 'rout':321,505 'safeti':592,810,916,1077 'safety/sandboxing':413 'satisfact':831 'say':91,699 'scale':195,857,930,1039 'scope':178,1036 'sdk':732 'search':376,379,988,1008 'seat':759 'seat-bas':758 'second':791 'section':169,170,233,284,420,714,777 'self':523,680,852 'self-attent':851 'self-consist':522,679 'serv':348,352,479 'set':625 'sft':548,885 'shift':622 'ship':771 'short':405 'short-term':404 'show':1069 'sign':737 'similar':375 'skill':24,257 'skill-technical-pm' 'slide':658 'small':506 'smaller':468,914 'sophist':281 'sourc':1011,1014 'source-aroyburman-codes' 'specif':155,364,395,486,634,1083 'specul':496,902 'speech/audio':939 'spend':839 'spot':515 'start':1033 'state':333 'step':328 'storag':335 'strategi':369,653,908 'stream':494,730 'structur':4,27,949,1027 'style':966 'succeed':749 'suit':807 'summar':654 'supervis':544,881 'support':834 'surfac':727 'system':18,55,69,81,220,289,293,363,394,447,967,1041 'target':33 'team':921 'technic':2,5,22,30,75,101,137,152,166,172,180,242,280,780,782,845,952,955,992,1018,1035,1087 'technical-pm':1 'techniqu':487 'term':406,412 'test':611,924 'text':971 'think':244,993 'thought':521,695 'ticket':835 'tier':757,843 'time':357,460,478,784 'timelin':227 'timeout':797 'token':753,787,789,801,876 'tool':191 'tool/function':396 'topic':846 'topic-agent-skills' 'topic-claude-code' 'topic-claude-skills' 'topic-frameworks' 'topic-metrics' 'topic-pm-tools' 'topic-product-management' 'topic-product-strategy' 'tps':792 'trade':77,159,425,452,491,639,1079 'trade-off':76,158,424,451,490,638,1078 'train':557,863,870,874 'training/eval':213 'transform':849 'translat':958 'transpar':696 'triangl':444,1058 'ttft':788 'tune':46,108,320,529,547,665,884 'two':246 'understand':135,938,942,1052 'url':1015 'usag':765 'use':39,57,192,662,986 'user':40,51,61,72,82,90,235,239,247,264,305,344,776 'user/product':1091 'users/queries':198 'valid':686 'vector':372 'verif':684 'version':768 'video':941 'violat':811 'vision':934 'vision-languag':933 'volum':836 'vs':204,229,354,359,409,666,671,729,731 'walkthrough':953 'wave':150,1068 'web':987 'well':139 'whiteboard':965 'whiteboard-styl':964 'window':382,636,659 'within':820 'without':773,1065 'word':975 'work':407 'workflow':279,329,979 'would':64 'x':60,71","prices":[{"id":"51f7f8fc-e770-41b9-b33a-0d5e45c5b9cc","listingId":"24a3cd9a-b027-478a-8ec6-1006efec8b5f","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"aroyburman-codes","category":"pm-skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:22:17.409Z"}],"sources":[{"listingId":"24a3cd9a-b027-478a-8ec6-1006efec8b5f","source":"github","sourceId":"aroyburman-codes/pm-skills/technical-pm","sourceUrl":"https://github.com/aroyburman-codes/pm-skills/tree/main/skills/technical-pm","isPrimary":false,"firstSeenAt":"2026-05-18T13:22:17.409Z","lastSeenAt":"2026-05-18T19:14:48.206Z"}],"details":{"listingId":"24a3cd9a-b027-478a-8ec6-1006efec8b5f","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"aroyburman-codes","slug":"technical-pm","github":{"repo":"aroyburman-codes/pm-skills","stars":6,"topics":["agent-skills","ai","claude-code","claude-skills","frameworks","metrics","pm-tools","product-management","product-strategy"],"license":"mit","html_url":"https://github.com/aroyburman-codes/pm-skills","pushed_at":"2026-02-17T06:52:03Z","description":"PM workflow and product thinking skills for AI product managers. 17 structured frameworks for PRDs, metrics, strategy, writing, prioritization, and more.","skill_md_sha":"473901129e94c80a1c06f287ea716cbd1322990d","skill_md_path":"skills/technical-pm/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/aroyburman-codes/pm-skills/tree/main/skills/technical-pm"},"layout":"multi","source":"github","category":"pm-skills","frontmatter":{"name":"technical-pm","description":"Structured technical PM framework for AI product roles. Covers: RLHF, evals, RAG, LLM deployment, system design, API design."},"skills_sh_url":"https://skills.sh/aroyburman-codes/pm-skills/technical-pm"},"updatedAt":"2026-05-18T19:14:48.206Z"}}