{"id":"46549551-c332-4dc7-a335-4af91541edab","shortId":"LH7qG5","kind":"skill","title":"ds-eval","tagline":"Internal development tool that tests whether skill descriptions trigger correctly for different user inputs. Reads test cases from a YAML file and evaluates each one by matching the input against all skill descriptions. Use when the user says \"run triggering eval\", \"test skill de","description":"# Triggering accuracy eval (ds-eval)\n\nYou are a QA evaluator for Claude Code skill descriptions. Your job is to\ndetermine whether the right skill would trigger for a given user input,\nbased solely on the description field in each skill's frontmatter.\n\n## Process\n\n### Step 1 — Load test cases and descriptions\n\nRead the test file:\n!`cat \"${CLAUDE_SKILL_DIR}/eval/triggering-tests.yaml\" 2>/dev/null || echo \"No test file found.\"`\n\nRead all skill descriptions by loading each SKILL.md frontmatter from\nthe sibling skill directories. Extract only the `name` and `description`\nfields from each.\n\nIf the user passed a filter as argument, only run tests for: $ARGUMENTS\n\n### Step 2 — Evaluate each test case\n\nFor each test case in the YAML file:\n\n1. Read the `input` phrase\n2. Compare it against ALL skill descriptions\n3. Determine which skill's description is the **best match** for that input\n4. Check:\n   - Does the best match equal `expected_skill`? → PASS\n   - Does the best match appear in `should_not_trigger`? → FAIL\n   - Is it ambiguous (two descriptions match equally well)? → AMBIGUOUS\n\n**Matching criteria** — A description \"matches\" an input when:\n- The input contains words or phrases explicitly listed in the description\n- The input's intent aligns with the skill's stated purpose\n- The description uses \"when the user says\" followed by a phrase that\n  semantically matches the input\n\n**Do NOT match based on:**\n- General topic overlap (e.g., \"organic\" doesn't auto-match all SEO skills)\n- The body of the SKILL.md — only the description field matters for triggering\n\n### Step 3 — Report results\n\nPresent results in this format:\n\n---\n\n### Triggering eval results — [date]\n\n**Summary:** X/Y passed | Z failed | W ambiguous\n\n---\n\n#### Passes\n\n| Input | Expected | Matched | Result |\n|-------|----------|---------|--------|\n| ...   | ...      | ...     | PASS   |\n\n#### Failures\n\nFor each failure, explain:\n- What input was tested\n- Which skill was expected\n- Which skill matched instead (and why)\n- Suggested description edit to fix the mismatch\n\n#### Ambiguous cases\n\nFor each ambiguous case:\n- Which two skills competed\n- Why both descriptions match\n- Suggested edit to disambiguate\n\n---\n\n### Step 4 — Suggest improvements\n\nIf any failures or ambiguous cases exist, write specific description\nedits that would fix them. Show the exact text to add or remove from\neach affected description.\n\n## Rules\n\n- Only evaluate based on the `description` frontmatter field, not the\n  full body of the SKILL.md.\n- Be strict: if a phrase is not in the description (or semantically\n  very close to one), it should not count as a match.\n- When two descriptions both match, mark as AMBIGUOUS rather than\n  picking one — the goal is to find overlap.\n- Write in the same language the user is using.","tags":["eval","marketing","skills","dataslayer-ai","agent-skills","analytics","claude-code","mcp","paid-media","seo"],"capabilities":["skill","source-dataslayer-ai","skill-ds-eval","topic-agent-skills","topic-analytics","topic-claude-code","topic-marketing","topic-mcp","topic-paid-media","topic-seo"],"categories":["Marketing-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/Dataslayer-AI/Marketing-skills/ds-eval","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add Dataslayer-AI/Marketing-skills","source_repo":"https://github.com/Dataslayer-AI/Marketing-skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 9 github stars · SKILL.md body (2,788 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-04-24T01:03:50.262Z","embedding":null,"createdAt":"2026-04-23T13:04:25.394Z","updatedAt":"2026-04-24T01:03:50.262Z","lastSeenAt":"2026-04-24T01:03:50.262Z","tsv":"'/dev/null':109 '/eval/triggering-tests.yaml':107 '1':93,165 '2':108,152,170 '3':177,296 '4':190,366 'accuraci':49 'add':389 'affect':394 'align':242 'ambigu':212,218,314,347,351,373,442 'appear':204 'argument':145,150 'auto':278 'auto-match':277 'base':80,268,399 'best':185,194,202 'bodi':284,408 'case':20,96,156,160,348,352,374 'cat':103 'check':191 'claud':60,104 'close':425 'code':61 'compar':171 'compet':356 'contain':229 'correct':13 'count':431 'criteria':220 'date':307 'de':47 'descript':11,36,63,84,98,118,134,176,182,214,222,237,250,290,341,359,378,395,402,421,437 'determin':68,178 'develop':5 'differ':15 'dir':106 'directori':128 'disambigu':364 'doesn':275 'ds':2,52 'ds-eval':1,51 'e.g':273 'echo':110 'edit':342,362,379 'equal':196,216 'eval':3,44,50,53,305 'evalu':26,58,153,398 'exact':386 'exist':375 'expect':197,317,333 'explain':325 'explicit':233 'extract':129 'fail':209,312 'failur':321,324,371 'field':85,135,291,404 'file':24,102,113,164 'filter':143 'find':451 'fix':344,382 'follow':256 'format':303 'found':114 'frontmatt':90,123,403 'full':407 'general':270 'given':77 'goal':448 'improv':368 'input':17,32,79,168,189,225,228,239,264,316,327 'instead':337 'intent':241 'intern':4 'job':65 'languag':457 'list':234 'load':94,120 'mark':440 'match':30,186,195,203,215,219,223,262,267,279,318,336,360,434,439 'matter':292 'mismatch':346 'name':132 'one':28,427,446 'organ':274 'overlap':272,452 'pass':141,199,310,315,320 'phrase':169,232,259,416 'pick':445 'present':299 'process':91 'purpos':248 'qa':57 'rather':443 'read':18,99,115,166 'remov':391 'report':297 'result':298,300,306,319 'right':71 'rule':396 'run':42,147 'say':41,255 'semant':261,423 'seo':281 'show':384 'sibl':126 'skill':10,35,46,62,72,88,105,117,127,175,180,198,245,282,331,335,355 'skill-ds-eval' 'skill.md':122,287,411 'sole':81 'source-dataslayer-ai' 'specif':377 'state':247 'step':92,151,295,365 'strict':413 'suggest':340,361,367 'summari':308 'test':8,19,45,95,101,112,148,155,159,329 'text':387 'tool':6 'topic':271 'topic-agent-skills' 'topic-analytics' 'topic-claude-code' 'topic-marketing' 'topic-mcp' 'topic-paid-media' 'topic-seo' 'trigger':12,43,48,74,208,294,304 'two':213,354,436 'use':37,251,461 'user':16,40,78,140,254,459 'w':313 'well':217 'whether':9,69 'word':230 'would':73,381 'write':376,453 'x/y':309 'yaml':23,163 'z':311","prices":[{"id":"f86e1d2f-f517-484c-ab09-6334293aea62","listingId":"46549551-c332-4dc7-a335-4af91541edab","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"Dataslayer-AI","category":"Marketing-skills","install_from":"skills.sh"},"createdAt":"2026-04-23T13:04:25.394Z"}],"sources":[{"listingId":"46549551-c332-4dc7-a335-4af91541edab","source":"github","sourceId":"Dataslayer-AI/Marketing-skills/ds-eval","sourceUrl":"https://github.com/Dataslayer-AI/Marketing-skills/tree/main/skills/ds-eval","isPrimary":false,"firstSeenAt":"2026-04-23T13:04:25.394Z","lastSeenAt":"2026-04-24T01:03:50.262Z"}],"details":{"listingId":"46549551-c332-4dc7-a335-4af91541edab","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"Dataslayer-AI","slug":"ds-eval","github":{"repo":"Dataslayer-AI/Marketing-skills","stars":9,"topics":["agent-skills","analytics","claude-code","marketing","mcp","paid-media","seo"],"license":"mit","html_url":"https://github.com/Dataslayer-AI/Marketing-skills","pushed_at":"2026-03-23T15:50:29Z","description":"Marketing agent skills powered by real data. Connect Claude Code to your actual Google Ads, GA4, Search Console, Meta Ads, LinkedIn Ads and 50+ platforms via Dataslayer MCP — no copy-pasting required.","skill_md_sha":"021791faa27069720eda7b282729adc62cf29e95","skill_md_path":"skills/ds-eval/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/Dataslayer-AI/Marketing-skills/tree/main/skills/ds-eval"},"layout":"multi","source":"github","category":"Marketing-skills","frontmatter":{"name":"ds-eval","description":"Internal development tool that tests whether skill descriptions trigger correctly for different user inputs. Reads test cases from a YAML file and evaluates each one by matching the input against all skill descriptions. Use when the user says \"run triggering eval\", \"test skill descriptions\", \"check triggering accuracy\", \"eval skills\", or after editing a skill description to verify it still triggers correctly."},"skills_sh_url":"https://skills.sh/Dataslayer-AI/Marketing-skills/ds-eval"},"updatedAt":"2026-04-24T01:03:50.262Z"}}