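{"id":"aaf07b7a-9f58-4899-b422-eff627639f34","shortId":"E2ShLT","kind":"skill","title":"skill-scorer","tagline":"Evaluates Agent Skills (Cursor / Claude / OpenClaw compatible) and produces a quantitative, rubric-based score with actionable improvement suggestions. Use when the user asks to review, rate, audit, grade, lint, or improve a SKILL.md file, a skill folder, or a skill archive, or s","description":"# skill-scorer\n\nA \"Skill that evaluates Skills\". It accepts the source files of any Agent Skill and must use this repository's official scoring entry point and `rubric/rubric.yaml`\nto produce a 100-point score across 5 pillars, a grade, evidence citations, and improvement suggestions. The rubric differentiates three built-in skill types (atomic / pipeline / composite); sub-dimensions are enabled automatically according to the skill's structure, controlled by the `applies_to` field. It is also compatible with all three specs: **Cursor / Claude / OpenClaw**.\n\n## When to use\n\n- The user provides a `SKILL.md`, a skill folder, a `.zip`, or a GitHub URL and asks for a score, an audit, or improvement suggestions.\n- The user asks things like \"how good is my skill?\", \"how can I improve my skill's quality?\", or \"help me align with official best practices\".\n- **Not applicable to**: evaluating non-Skill documents (ordinary READMEs, blog posts, prompt templates).\n\n## Code Agent Quick Start\n\nIf you are Cursor, WorkBuddy, Hermes, 小龙虾, or a similar code agent, read `USAGE.md` first.\n\nThe recommended first step is to run the CLI wizard, which lets the user choose between the general evaluation and the finance expert edition; if the finance expert edition is chosen, the wizard goes on to confirm the finance sub-scenario and prints the official follow-up commands:\n\n```bash\npython3 skills/skill-scorer/scripts/score.py --agent-wizard <path-to-skill-zip-dir-or-SKILL.md>\n```\n\nRule-score preview:\n\n```bash\npython3 skills/skill-scorer/scripts/score.py <path-to-skill-zip-dir-or-SKILL.md>\n```\n\nFull agent-side Deep Review (uses the code agent's own model plan and does not consume the SkillLens server-side key):\n\n```bash\npython3 skills/skill-scorer/scripts/score.py --agent-prompt <path-to-skill-zip-dir-or-SKILL.md> > agent-deep-review-prompt.md\n# Pass agent-deep-review-prompt.md in full to the current code agent's model and save the strict JSON as agent-llm-results.json\npython3 skills/skill-scorer/scripts/score.py --llm-results agent-llm-results.json <path-to-skill-zip-dir-or-SKILL.md>\n```\n\nDo not generate an ad hoc custom scoring script to replace the official CLI; the final score must come from the official CLI output of the last step.\n\nThe finance expert edition (optional) should preferably be selected through `--agent-wizard`; when running manually, add the same `--domain finance --scenario <scenario-id>` to both the `--agent-prompt` and `--llm-results` steps. See `USAGE.md` for the supported scenarios.\n\n## Inputs\n\n- The text of a `SKILL.md`, or\n- a skill directory (containing `scripts/`, `references/`, `assets/`, etc.), or\n- a `.zip`-packaged 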
skill, or\n- a GitHub URL pointing to a skill repository or subdirectory (supported by the Web tool).\n\n## Outputs\n\n```json\n{\n  \"spec\": \"claude | openclaw\",\n  \"language\": \"zh | en\",\n  \"score\": 0-100,\n  \"grade\": \"S | A | B | C | D\",\n  \"pillars\": [\n    {\n      \"id\": \"business_value\",\n      \"score\": 0-25,\n      \"dimensions\": [\n        {\n          \"id\": \"...\",\n          \"checks\": [\n            {\n              \"id\": \"...\",\n              \"status\": \"pass|partial|fail|n_a\",\n              \"evidence\":    \"<primary-language alias>\",\n              \"evidence_zh\": \"中文现状\",\n              \"evidence_en\": \"English diagnosis\",\n              \"fix\":    \"<primary-language alias>\",\n              \"fix_zh\": \"中文改法\",\n              \"fix_en\": \"English fix\"\n            }\n          ]\n        }\n      ]\n    }\n  ],\n  \"bonus\": 0-5,\n  \"suggestions\": [\n    {\n      \"title\": \"Top 改进项\",\n      \"title_zh\": \"中文 Top 改进项\",\n      \"title_en\": \"English Top Improvement\",\n      \"why\":    \"现状\",\n      \"why_zh\": \"中文现状\",\n      \"why_en\": \"English why\",\n      \"how\":    \"改法\",\n      \"how_zh\": \"中文改法\",\n      \"how_en\": \"English how\"\n    }\n  ],\n  \"deepReviewCertificate\": {\n    \"status\": \"verified\"\n  }\n}\n```\n\n`evidence_zh` + `evidence_en` (and `fix_zh` + `fix_en`, `why_zh` + `why_en`, `how_zh` + `how_en`, `title_zh` + `title_en`) are the canonical bilingual fields from engineVersion 0.4.1 onward. The unsuffixed `evidence` / `fix` / `why` / `how` / `title` are preserved as back-compat aliases pointing at the primary language so older readers keep working. The HTML report's ZH/EN toggle uses the suffixed fields to switch body content; it falls back to the bare field when the JSON predates the bilingual schema.\n\n## Workflow\n\n1. **Locate SkillLens root**: First locate the SkillLens repository root that contains `skills/skill-scorer/rubric/rubric.yaml`.\n2. 
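**Run official scorer**: Run the official CLI; do not generate an ad hoc replacement scoring script:\n\n   ```bash\n   python3 skills/skill-scorer/scripts/score.py <path-to-skill-zip-dir-or-SKILL.md>\n   ```\n\n3. **Choose review mode**: Prefer running `--agent-wizard`. When running manually, confirm whether to enable the domain expert edition; the current MVP supports `finance`, and the specific `--scenario` must also be confirmed.\n4. **Agent-side Deep Review when requested**: For a full deep review, first run `--agent-prompt` to generate the official prompt, have the current code agent's model return strict JSON, then run `--llm-results` to merge. The domain expert edition must pass the same `--domain` / `--scenario` in both commands.\n5. **Use official JSON only**: The total score, grade, and pillar/dimension/check scores must come from the official CLI's final JSON output; the Agent must not recompute them or top them up itself.\n6. **Verify certificate**: A full Deep Review must include `deepReviewCertificate.status=\"verified\"`; the finance expert edition must also include `domainExpert` and `deepReviewCertificate.domain`; without a certificate, the result may only be called a rule-score preview or an unofficial result.\n7. **Render**: Render the report in the user's reading language (zh / en) using the bilingual fields from the JSON (`evidence_zh` + `evidence_en`, `fix_zh` + `fix_en`, `why_zh` + `why_en`, `how_zh` + `how_en`); the Top improvements must come from the JSON's `suggestions`; legacy single-language JSON may fall back to `evidence` / `fix` / `why` / `how`.\n\n## Official Tool Contract\n\n- **MUST** call `skills/skill-scorer/scripts/score.py` for local tool use, or call the deployed SkillLens Web/API endpoint when the user explicitly provides that service address.\n- **SHOULD** start with `--agent-wizard` for agent-side Deep Review so the user explicitly chooses general vs. 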
finance expert review.\n- **MUST** use the official `--agent-prompt` → model JSON → `--llm-results` flow for agent-side Deep Review.\n- **MUST** ask before enabling domain expert review when not using the wizard; for finance, pass the same `--domain finance --scenario <scenario-id>` in prompt generation and merge.\n- **MUST NOT** paste or synthesize a new `python3 <<'PYEOF' ...` scoring script to replace the official scorer.\n- **MUST NOT** claim \"comprehensive check\", \"Deep Review complete\", \"all 43 checks passed\", or \"100/100\" unless those exact values appear in official SkillLens output.\n- **MUST NOT** call a result official full Deep Review unless `deepReviewCertificate.status` is exactly `verified`.\n- **MUST** preserve `llmComplete=false` / `llmCoverage` in the rendered report. If LLM checks are skipped, say so clearly.\n- **MUST** include the scoring source in every report, for example: `source: official SkillLens CLI` or `source: SkillLens Web Deep Review`.\n- **MUST** treat `rubric/rubric.yaml` as read-only scoring data. Do not alter weights, thresholds, or pass/partial/fail mapping during evaluation.\n\n## Guardrails\n\n- Rule scores must be **deterministic** and **consistent across languages** (the TS frontend and the Python CLI behave equivalently).\n- LLM review applies only to `type: llm` checks and **must not override or rewrite** rule-score results.\n- The report language always follows the primary language of the skill under evaluation, unless the user switches it manually in the Web UI.\n- Do not echo in the report any strings from the original skill that may be keys or credentials.\n- If the official CLI cannot be run and the official Web/API cannot be reached, stop and explain why; do not fall back to a homemade scorer.\n\n## Files\n\n- `rubric/rubric.yaml`: scoring rubric (**the single source of truth shared by the Web UI and the CLI**)\n- `domains/finance/rubric.yaml`: finance expert edition rubric (an additional expert report on top of the general score)\n- `scripts/score.py`: official local CLI scoring script (rule-score preview; does not fabricate an LLM Deep Review)\n- `USAGE.md`: the official invocation contract for code agents such as Cursor / WorkBuddy / Hermes / 小龙虾\n- `references/best-practices.md`: best practices for writing Skills (for LLM few-shot use and human reading)\n\n## Report Rendering Rules\n\nRender the official JSON into a concise report. Do not use a fixed sample score. 
Use this shape:\n\n```markdown\n# SkillLens Report\n\nsource: official SkillLens CLI | SkillLens Web Deep Review\nmode: rule-only preview | full deep review\nllmComplete: true | false\n\n**Total**: <score from JSON> / 100 · **Grade**: <grade from JSON>\n\n## Pillars\n| Pillar | Score | LLM coverage |\n|---|---:|---:|\n| <pillar.name_zh/name_en> | <pillar.score>/<pillar.weight> | <evaluated>/<total> |\n\n## Top Improvements\n1. <suggestion.title from JSON>\n   - 现状/Why: <suggestion.why>\n   - 改法/How: <suggestion.how>\n```\n\nIf the CLI output says `llmComplete=false`, explicitly call the result a rule-only preview. Never upgrade it to a full deep review.","tags":["skill","scorer","skilllens","andrewnggirl","agent-skills","ai-agents","claude","claude-code","cursor","developer-tools","llm","nextjs"],"capabilities":["skill","source-andrewnggirl","skill-skill-scorer","topic-agent-skills","topic-ai-agents","topic-claude","topic-claude-code","topic-cursor","topic-developer-tools","topic-llm","topic-nextjs","topic-openclaw","topic-rubric","topic-self-hosted","topic-skill"],"categories":["SkillLens"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/AndrewNgGirl/SkillLens/skill-scorer","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add AndrewNgGirl/SkillLens","source_repo":"https://github.com/AndrewNgGirl/SkillLens","install_from":"skills.sh"}},"qualityScore":"0.478","qualityRationale":"deterministic score 0.48 from registry signals: · indexed on github topic:agent-skills · 56 github stars · SKILL.md body (6,892 
chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T18:57:38.158Z","embedding":null,"createdAt":"2026-05-09T01:05:21.062Z","updatedAt":"2026-05-18T18:57:38.158Z","lastSeenAt":"2026-05-18T18:57:38.158Z","tsv":"'-100':246 '-25':259 '-5':288 '/how':855 '/why':853 '0':245,258,287 '0.4.1':351 '1':404,851 '100':65,840 '100/100':646 '2':413 '3':423 '4':439 '43':643 '5':63,466 '6':482 '7':496 'action':20 'agent':5,57,116,125,139,147,153,163,170,191,197,429,441,450,455,480,558,562,581,591,785 'agent-deep-review-prompt.md':165,167 'agent-llm-results.json':175,181 'agent-prompt':162,196,449,580 'agent-sid':146,440,561,590 'agent-wizard':138,190,428,557 'alias':365 'alter':718 'appear':651 'appli':78 'archiv':45 'ask':27,596 'asset':220 'atom':71 'audit':31 'b':250 'back':363,391 'back-compat':362 'bare':394 'base':17 'bash':135,142,159,420 'bilingu':348,401 'bodi':388 'bonus':286 'busi':255 'c':251 'call':537,544,658,864 'canon':347 'certif':484 'check':262,681 'choos':424,570 'claim':638 'claud':8,83,239 'clear':686 'cli':129,183,185,418,475,700,734,754,764,771,823,858 'code':115,124,152,169,454,784 'compat':10,364 'composit':73 'concis':805 'content':389 'contract':535 'coverag':846 'cursor':7,82,120,780 'd':252 'data':715 'deep':149,443,486,564,593,640,663,705,776,826,834,878 'deepreviewcertif':321 'deepreviewcertificate.domain':494 'deepreviewcertificate.status':489,666 'deploy':546 'diagnosi':277 'dimens':260 'domain':204,464,599,612 'domainexpert':492 'domains/finance/rubric.yaml':766 'en':243,275,283,299,309,318,327,332,336,340,344,500,507,511,515,519 'enabl':598 'endpoint':549 'enginevers':350 'english':276,284,300,310,319 'evalu':4,725 'everi':693 'evid':270,271,274,324,326,354,504,506,529 
'exact':649,668 'exampl':696 'expert':574,600 'explicit':569,863 'explicitly提供该服务地址':553 'fail':267 'fall':390 'fals':673,838,862 'few-shot':792 'field':349,385,395 'file':38,759 'financ':205,436,573,608,613 'fix':278,279,282,285,329,331,355,508,510,530,811 'flow':588 'folder':41 'full':662,833,877 'general':571 'generat':617 'github':94,232 'grade':32,247,841 'guardrail':726 'herm':122,782 'html':377 'id':254,261,263 'improv':21,35,302,850 'includ':688 'input':209 'json':173,237,398,457,469,477,502,523,527,584,802 'keep':374 'key':158 'languag':241,370 'lint':33 'llm':179,201,460,586,680,736,739,775,791,845 'llm-result':178,200,459,585 'llmcomplet':672,836,861 'llmcoverag':674 'local':540 'locat':405 'map':723 'markdown':817 'merg':619 'mode':426,828 'model':583 'must':536,576,595,620,636,656,670,687,707 'mvp':434 'n':268 'never':872 'new':626 'offici':415,468,533,579,634,653,661,698,801,821 'older':372 'openclaw':9,84,240 'output':236,655,859 'partial':266 'pass':265,609 'pass/partial/fail':722 'past':622 'pillar':253,842,843 'pillar.name':847 'pillar/dimension/check':473 'pipelin':72 'point':366 'predat':399 'preserv':360,671 'preview':832,871 'primari':369 'produc':12 'prompt':113,164,198,451,582,616 'pyeof':628 'python':733 'python3':136,143,160,176,421,627 'quantit':14 'quick':117 'rate':30 'read':712 'read-on':711 'reader':373 'readm':111 'refer':219 'references/best-practices.md':787 'render':497,677,797,799 'replac':632 'report':378,678,694,796,806,819 'request':446 'result':180,202,461,587,660,866 'review':29,150,425,444,487,565,575,594,601,641,664,706,777,827,835,879 'root':407 'rubric':16,69 'rubric-bas':15 'rubric/rubric.yaml':61,709,760 'rule':798,830,869 'rule-on':829,868 'run':414 'sampl':812 'say':684,860 'scenario':206,438,465,614 'schema':402 'score':18,244,257,629,690,714,813,844 'scorer':3,50,416,635 'script':218,630 'scripts/score.py':769 'shape':816 'shot':794 'side':148,442,563,592 
'skill':2,6,40,44,49,53,55,58,75,91,100,103,108,215,226,229,744,750,788 'skill-scor':1,48 'skill-skill-scorer' 'skill.md':37,90,211 'skilllen':156,406,411,547,654,699,703,818,822,824 'skills/skill-scorer/rubric/rubric.yaml':409 'skills/skill-scorer/scripts/score.py':137,144,161,177,422,538 'skip':683 'sourc':691,697,702,820 'source-andrewnggirl' 'spec':238 'start':118,555 'status':264,322 'suffix':384 'suggest':22,289,525 'switch':387 'synthes':624 'threshold':720 'titl':290,293,298,341,343,358 'toggl':381 'tool':534,541 'top':291,296,301,521,849 'topic-agent-skills' 'topic-ai-agents' 'topic-claude' 'topic-claude-code' 'topic-cursor' 'topic-developer-tools' 'topic-llm' 'topic-nextjs' 'topic-openclaw' 'topic-rubric' 'topic-self-hosted' 'topic-skill' 'total':839 'treat':708 'true':837 'ts':731 'type':738 'unless':647,665 'unsuffix':353 'upgrad':873 'url':95,233 'usage.md':127,208,778 'use':23,88,382,467,542,577,604,809,814 'user':26,552,568 'valu':256,650 'verifi':323,483,490,669 'vs':572 'web':234,704,747,762,825 'web/api':548,756 'weight':719 'wizard':140,192,430,559,606 'work':375 'workbuddi':121,781 'workflow':403 'zh':242,272,280,294,306,315,325,330,334,338,342,499,505,509,513,517 'zh/en':380 'zh/name_en':848 'zip':93,224 '一个':51,210,214,223 '一个指向':228 '三套规范':85 '不会伪造':774 '不在报告中回显原':749 '不得临时生成替代评分脚本':419 '不得临时生成自定义评分脚本替代官方':182 '不得覆盖或改写':741 '不得退回到自制评分器':758 '不消耗':155 '不能由':479 '不适用于':106 '与人类阅读':795 '且':729 '两步都加入相同的':203 '中可能的密钥':751 '中文':295 '中文改法':281,316 '中文现状':273,307 '为':174 '从':501 '仓库':230 '仓库根目录':412 '优先运行':427 '使用':151 '供':790 '保存严格':172 '先定位包含':408 '先读':126 '全面检测':639 '共用的单一事实源':765 '内置三类型差异化':70 '再运行':458 '写作最佳实践':789 '写得怎么样':101 '凭证字符串':752 '分制评分':66 '分数必须来自官方':474 '前端与':732 '博客':112 '取双语字段':503 '可回退到':528 '可选':188 '合并':462 '同时兼容':81 '向导':130 '向导会继续确认金融子场景':133 '含':217 '和':199,493 '大支柱的':64 '如手动执行':431 '如果你是':119 '如果无法运行官方':753 '如果选择金融专家版':132 '如需完整深度评测':447 '子目录的':231 '子维度数随':74 '字段控制':80 '完成':642 '完整':145,485 '完整交给当前':168 '官方本地':770 
'审计或改进建议':97 '将':166 '小龙虾或类似':123 '小龙虾等':783 '工具侧支持':235 '帮我对齐官方最佳实践':105 '并必须确认具体':437 '并请求评分':96 '并输出后续官方命令':134 '应优先通过':189 '当前':433 '必须依据本仓库的官方评分入口和':60 '必须停止并说明原因':757 '必须先运行':448 '必须包含':488 '必须在':195 '必须确认是否启用领域专家版':432 '怎么提升我的':102 '总分':471 '我这个':99 '或':213,222,227,645 '或访问官方':755 '手动执行时':194 '打分脚本':772 '打包的':225 '报告语言始终跟随被测':743 '按用户阅读语言':498 '接收任意':56 '推荐先运行':128 '支持':435 '支持的场景详见':207 '改法':313,854 '改进项':292,297 '改进项必须来自':522 '文件夹':92 '文本':212 '旧版单语':526 '普通':110 '最终':476 '最终分数必须来自最后一步官方':184 '服务端':157 '模板':114 '没有证书只能称为规则分预览或非官方结果':495 '渲染报告':520 '现状':304,852 '生成官方提示词':452 '用当前':453 '用户提供':89 '用户询问':98 '由':77 '的':54,410,524 '的主语言':745 '的官方调用契约':786 '的模型':171 '的模型返回严格':456 '的源文件':59 '的细则':740 '目录':216 '确定性':728 '端与':763 '端手动切换':748 '等':221 '等级':67,472 '类文档':109 '结构自动启用':76 '给':779 '给出':62 '自己的模型套餐':154 '自己重算或补满':481 '行为等价':735 '规则分必须':727 '规则分结果':742 '规则分预览':141,773 '让用户选择通用评测或金融专家版':131 '证据引用与改进建议':68 '评价非':107 '评分细则':761 '评审仅用于':737 '评测':52 '质量':104 '跨语言一致':730 '输出':186,478 '运行官方':417 '选择':193 '通用分之外的附加专家报告':768 '金融专家版':187 '金融专家版评分细则':767 '金融专家版还必须包含':491 '除非用户在':746 '项全部通过':644 
'领域专家版必须在两步命令都带上相同的':463","prices":[{"id":"c2dea2e3-6f96-4813-9518-983c7c814146","listingId":"aaf07b7a-9f58-4899-b422-eff627639f34","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"AndrewNgGirl","category":"SkillLens","install_from":"skills.sh"},"createdAt":"2026-05-09T01:05:21.062Z"}],"sources":[{"listingId":"aaf07b7a-9f58-4899-b422-eff627639f34","source":"github","sourceId":"AndrewNgGirl/SkillLens/skill-scorer","sourceUrl":"https://github.com/AndrewNgGirl/SkillLens/tree/main/skills/skill-scorer","isPrimary":false,"firstSeenAt":"2026-05-09T01:05:21.062Z","lastSeenAt":"2026-05-18T18:57:38.158Z"}],"details":{"listingId":"aaf07b7a-9f58-4899-b422-eff627639f34","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"AndrewNgGirl","slug":"skill-scorer","github":{"repo":"AndrewNgGirl/SkillLens","stars":56,"topics":["agent-skills","ai-agents","claude","claude-code","cursor","developer-tools","llm","nextjs","openclaw","rubric","self-hosted","skill","skill-evaluation","skills","typescript"],"license":"mit","html_url":"https://github.com/AndrewNgGirl/SkillLens","pushed_at":"2026-05-17T08:32:13Z","description":"Open-source self-hosted web tool for evaluating Agent Skills with rubric scores, Deep Review, and improvement suggestions.","skill_md_sha":"28ea2700c17b5b54966fb11ecad3d24a714f6f79","skill_md_path":"skills/skill-scorer/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/AndrewNgGirl/SkillLens/tree/main/skills/skill-scorer"},"layout":"multi","source":"github","category":"SkillLens","frontmatter":{"name":"skill-scorer","license":"MIT","description":"Evaluates Agent Skills (Cursor / Claude / OpenClaw compatible) and produces a quantitative, rubric-based score with actionable improvement suggestions. 
Use when the user asks to review, rate, audit, grade, lint, or improve a SKILL.md file, a skill folder, or a skill archive, or says things like \"给这个 skill 打分\", \"评估一下 skill 质量\", \"audit this skill\", \"rate my agent skill\"."},"skills_sh_url":"https://skills.sh/AndrewNgGirl/SkillLens/skill-scorer"},"updatedAt":"2026-05-18T18:57:38.158Z"}}