{"id":"40c1f5a0-abe3-4a75-8a32-6934047cb7c0","shortId":"rZU53y","kind":"skill","title":"review-and-promote-traces","tagline":"Judge flagged trace outputs and promote judged items back to datasets. Use when an evaluation run requires human judgment or when review queue items need to be judged for promotion into the dataset.","description":"# Review and Promote Traces\n\nUse this skill for the judgment and promotion loop after trace evaluation.\n\n## Interactive Q&A protocol (mandatory)\n\n<HARD-GATE>\nBEFORE the first scoping question, search for a structured question tool (e.g., `AskUserQuestion` or similar interactive widget) and load it. Use that tool for EVERY scoping question. Fall back to plain-text lettered options ONLY if no such tool exists in the environment.\n</HARD-GATE>\n\nIf context is unclear, ask one question at a time using the structured question tool (loaded per the HARD-GATE above).\n\nExample question structure:\n\n```\nWhat do you want to review first?\nA) One specific run\nB) Pending queue triage across runs\n```\n\nRules:\n- Ask one question per message.\n- Use the structured question tool for every question. Structure each with a short header, 2-4 options with labels and descriptions, and place the recommended option first. Do not add \"(Recommended)\" or similar annotations to option labels.\n- Ask one follow-up when needed, then continue.\n\n## Workflow\n\n1. If needed, flag evaluated run:\n   - Use `flag_review_item` with `run_id`.\n2. Retrieve review items:\n   - Use `list_review_items` with pagination.\n3. Collect human judgments:\n   - Present review items to the user and ask for judgment one item at a time when needed.\n   - Capture `judgment_value` and optional `notes` for each item.\n   - Use the structured question tool (loaded per the HARD-GATE above) to present options that match the live evaluation setup:\n     - Binary example:\n       - A) Pass\n       - B) Fail\n     - Categorical example:\n       - A) <option 1>\n       - B) <option 2>\n       - C) <option 3>\n     - Continuous example:\n       - A) Enter numeric value (within configured range)\n       - B) Skip this item for now\n   - If valid options are unclear, look them up before asking:\n     1) Inspect `list_review_items` payload for item result details and expected value hints.\n     2) Use `get_result(run_id)` for run-level context and chain outputs.\n     3) If evaluation id is available in run metadata, call `get_evaluation(evaluation_id)` and read config/judgment criteria to derive valid judgment options.\n   - When options remain ambiguous after lookup, ask one clarification question using the structured question tool before proceeding.\n4. Submit judgments:\n   - For each target item, call `judge_review_item`.\n   - Pass the user-provided `judgment_value` and optional `notes` into `judge_review_item`.\n   - Include notes when judgment context matters.\n5. Promote judged outputs:\n   - Use `add_reviewed_items_to_dataset` with `run_id`.\n6. Report result:\n   - number of items judged\n   - promotion status and row counts\n   - any skipped or blocked items\n\n## Queue-wide triage guidance\n\n- Group by run and status first.\n- Prioritize high-impact runs or oldest pending runs.\n- Keep an audit trail of judgment rationale in notes.\n\n## Scopes reference\n\n- `list_review_items` requires `review:read`\n- `flag_review_item`, `judge_review_item`, and `add_reviewed_items_to_dataset` require `review:write`\n\nIf a scope error occurs, ask the user to create a key with the missing scope in Truesight Settings.","tags":["review","and","promote","traces","truesight","mcp","skills","goodeye-labs","agent-skills","ai-evaluation","chatgpt","claude"],"capabilities":["skill","source-goodeye-labs","skill-review-and-promote-traces","topic-agent-skills","topic-ai-evaluation","topic-chatgpt","topic-claude","topic-cursor","topic-llm","topic-mcp","topic-truesight","topic-vscode","topic-windsurf"],"categories":["truesight-mcp-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/Goodeye-Labs/truesight-mcp-skills/review-and-promote-traces","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add Goodeye-Labs/truesight-mcp-skills","source_repo":"https://github.com/Goodeye-Labs/truesight-mcp-skills","install_from":"skills.sh"}},"qualityScore":"0.453","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 6 github stars · SKILL.md body (3,154 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T13:22:57.533Z","embedding":null,"createdAt":"2026-05-18T13:22:57.533Z","updatedAt":"2026-05-18T13:22:57.533Z","lastSeenAt":"2026-05-18T13:22:57.533Z","tsv":"'-4':167 '1':199,309 '2':166,212,323 '3':222,337 '4':377 '5':408 '6':421 'across':144 'add':181,413,482 'ambigu':363 'annot':185 'ask':108,147,189,233,308,366,495 'askuserquest':72 'audit':460 'avail':342 'b':140,277,282,293 'back':14,88 'binari':273 'block':436 'c':283 'call':346,384 'captur':243 'categor':279 'chain':335 'clarif':368 'collect':223 'config/judgment':353 'configur':291 'context':105,333,406 'continu':197,284 'count':432 'creat':499 'criteria':354 'dataset':16,38,417,486 'deriv':356 'descript':172 'detail':318 'e.g':71 'enter':287 'environ':103 'error':493 'evalu':20,54,203,271,339,348,349 'everi':84,158 'exampl':126,274,280,285 'exist':100 'expect':320 'fail':278 'fall':87 'first':62,135,178,448 'flag':7,202,206,475 'follow':192 'follow-up':191 'gate':124,262 'get':325,347 'group':443 'guidanc':442 'hard':123,261 'hard-gat':122,260 'header':165 'high':451 'high-impact':450 'hint':322 'human':23,224 'id':211,328,340,350,420 'impact':452 'includ':402 'inspect':310 'interact':55,75 'item':13,29,208,215,219,228,237,251,296,313,316,383,387,401,415,426,437,471,477,480,484 'judg':6,12,33,385,399,410,427,478 'judgment':24,48,225,235,244,358,379,393,405,463 'keep':458 'key':501 'label':170,188 'letter':93 'level':332 'list':217,311,469 'live':270 'load':78,119,257 'look':304 'lookup':365 'loop':51 'mandatori':59 'match':268 'matter':407 'messag':151 'metadata':345 'miss':504 'need':30,195,201,242 'note':248,397,403,466 'number':424 'numer':288 'occur':494 'oldest':455 'one':109,137,148,190,236,367 'option':94,168,177,187,247,266,301,359,361,396 'output':9,336,411 'pagin':221 'pass':276,388 'payload':314 'pend':141,456 'per':120,150,258 'place':174 'plain':91 'plain-text':90 'present':226,265 'priorit':449 'proceed':376 'promot':4,11,35,41,50,409,428 'protocol':58 'provid':392 'q':56 'question':64,69,86,110,117,127,149,155,159,255,369,373 'queue':28,142,439 'queue-wid':438 'rang':292 'rational':464 'read':352,474 'recommend':176,182 'refer':468 'remain':362 'report':422 'requir':22,472,487 'result':317,326,423 'retriev':213 'review':2,27,39,134,207,214,218,227,312,386,400,414,470,473,476,479,483,488 'review-and-promote-trac':1 'row':431 'rule':146 'run':21,139,145,204,210,327,331,344,419,445,453,457 'run-level':330 'scope':63,85,467,492,505 'search':65 'set':508 'setup':272 'short':164 'similar':74,184 'skill':45 'skill-review-and-promote-traces' 'skip':294,434 'source-goodeye-labs' 'specif':138 'status':429,447 'structur':68,116,128,154,160,254,372 'submit':378 'target':382 'text':92 'time':113,240 'tool':70,82,99,118,156,256,374 'topic-agent-skills' 'topic-ai-evaluation' 'topic-chatgpt' 'topic-claude' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-truesight' 'topic-vscode' 'topic-windsurf' 'trace':5,8,42,53 'trail':461 'triag':143,441 'truesight':507 'unclear':107,303 'use':17,43,80,114,152,205,216,252,324,370,412 'user':231,391,497 'user-provid':390 'valid':300,357 'valu':245,289,321,394 'want':132 'wide':440 'widget':76 'within':290 'workflow':198 'write':489","prices":[{"id":"61ff9036-5198-458a-8a35-9ed5ef2c467e","listingId":"40c1f5a0-abe3-4a75-8a32-6934047cb7c0","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"Goodeye-Labs","category":"truesight-mcp-skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:22:57.533Z"}],"sources":[{"listingId":"40c1f5a0-abe3-4a75-8a32-6934047cb7c0","source":"github","sourceId":"Goodeye-Labs/truesight-mcp-skills/review-and-promote-traces","sourceUrl":"https://github.com/Goodeye-Labs/truesight-mcp-skills/tree/main/skills/review-and-promote-traces","isPrimary":false,"firstSeenAt":"2026-05-18T13:22:57.533Z","lastSeenAt":"2026-05-18T13:22:57.533Z"}],"details":{"listingId":"40c1f5a0-abe3-4a75-8a32-6934047cb7c0","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"Goodeye-Labs","slug":"review-and-promote-traces","github":{"repo":"Goodeye-Labs/truesight-mcp-skills","stars":6,"topics":["agent-skills","ai-evaluation","chatgpt","claude","cursor","llm","mcp","truesight","vscode","windsurf"],"license":"mit","html_url":"https://github.com/Goodeye-Labs/truesight-mcp-skills","pushed_at":"2026-03-26T06:15:56Z","description":"Agent skills for the Truesight MCP. Step-by-step workflow playbooks for scoring inputs, building live evaluations, error analysis, and the review loop. Works with Claude Code, Cursor, ChatGPT, VS Code, Windsurf, and any client that supports the agent skills standard.","skill_md_sha":"63dd10037e423fd0bb5b771a4ea7508e734ab802","skill_md_path":"skills/review-and-promote-traces/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/Goodeye-Labs/truesight-mcp-skills/tree/main/skills/review-and-promote-traces"},"layout":"multi","source":"github","category":"truesight-mcp-skills","frontmatter":{"name":"review-and-promote-traces","description":"Judge flagged trace outputs and promote judged items back to datasets. Use when an evaluation run requires human judgment or when review queue items need to be judged for promotion into the dataset."},"skills_sh_url":"https://skills.sh/Goodeye-Labs/truesight-mcp-skills/review-and-promote-traces"},"updatedAt":"2026-05-18T13:22:57.533Z"}}