{"id":"905fc81a-7225-49b4-a597-b0a60812f7c1","shortId":"4QUE6C","kind":"skill","title":"paper-review","tagline":"Review academic papers for correctness, quality, and novelty using OpenJudge's multi-stage pipeline. Supports PDF files and LaTeX source packages (.tar.gz/.zip). Covers 10 disciplines: cs, medicine, physics, chemistry, biology, economics, psychology, environmental_science, mathem","description":"# Paper Review Skill\n\nMulti-stage academic paper review using the OpenJudge `PaperReviewPipeline`:\n\n1. **Safety check** — jailbreak detection + format validation\n2. **Correctness** — objective errors (math, logic, data inconsistencies)\n3. **Review** — quality, novelty, significance (score 1–6)\n4. **Criticality** — severity of correctness issues\n5. **BibTeX verification** — cross-checks references against CrossRef/arXiv/DBLP\n\n## Prerequisites\n\n```bash\n# Install OpenJudge\npip install py-openjudge\n\n# Extra dependency for paper_review\npip install litellm\npip install pypdfium2  # only if using vision mode (use_vision_for_pdf=True)\n```\n\n## Gather from user before running\n\n| Info | Required? | Notes |\n|------|-----------|-------|\n| Paper file path | Yes | PDF or .tar.gz/.zip TeX package |\n| API key | Yes | Env var preferred: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc. |\n| Model name | No | `gpt-5.2`, `anthropic/claude-opus-4-6`, `dashscope/qwen-vl-plus`. See **Model selection** below |\n| Discipline | No | If not given, uses general CS/ML-oriented prompts |\n| Venue | No | e.g. `\"NeurIPS 2025\"`, `\"The Lancet\"` |\n| Instructions | No | Free-form reviewer guidance, e.g. 
`\"Focus on experimental design\"` |\n| Language | No | `\"en\"` (default) or `\"zh\"` for Simplified Chinese output |\n| BibTeX file | No | Required only for reference verification |\n| CrossRef email | No | Improves API rate limits for BibTeX verification |\n\n## Quick start\n\nFile type is auto-detected: `.pdf` → PDF review, `.tar.gz`/`.zip` → TeX review, `.bib` → BibTeX verification.\n\n```bash\n# Basic PDF review\npython -m cookbooks.paper_review paper.pdf\n\n# With discipline and venue\npython -m cookbooks.paper_review paper.pdf \\\n  --discipline cs --venue \"NeurIPS 2025\"\n\n# Chinese output\npython -m cookbooks.paper_review paper.pdf --language zh\n\n# Custom reviewer instructions\npython -m cookbooks.paper_review paper.pdf \\\n  --instructions \"Focus on experimental design and reproducibility\"\n\n# PDF + BibTeX verification\npython -m cookbooks.paper_review paper.pdf \\\n  --bib references.bib --email your@email.com\n\n# Vision mode (for models that prefer images over text extraction)\npython -m cookbooks.paper_review paper.pdf \\\n  --vision --vision_max_pages 30 --format_vision_max_pages 10\n\n# TeX source package\npython -m cookbooks.paper_review paper_source.tar.gz \\\n  --discipline biology --email your@email.com\n\n# TeX source package with Chinese output and custom instructions\npython -m cookbooks.paper_review paper_source.tar.gz \\\n  --language zh --instructions \"This is a short paper, be concise\"\n\n# Verify a standalone BibTeX file\npython -m cookbooks.paper_review --bib_only references.bib --email your@email.com\n```\n\n## All options\n\n| Flag | Default | Description |\n|------|---------|-------------|\n| `input` (positional) | — | Path to PDF, TeX package, or .bib file |\n| `--bib_only` | — | Path to .bib file for standalone verification (no review) |\n| `--model` | `gpt-4o` | Model name |\n| `--api_key` | env var | API key |\n| `--base_url` | — | Custom API endpoint — must end at `/v1`, **not** `/v1/chat/completions` (litellm appends 
the path automatically) |\n| `--discipline` | — | Academic discipline |\n| `--venue` | — | Target conference/journal |\n| `--instructions` | — | Free-form reviewer guidance |\n| `--language` | `en` | Output language: `en` or `zh` |\n| `--bib` | — | Path to .bib file (for PDF review + reference verification) |\n| `--email` | — | CrossRef mailto for BibTeX check |\n| `--paper_name` | filename stem | Paper title in report |\n| `--output` | auto | Output .md report path |\n| `--no_safety` | off | Skip safety checks |\n| `--no_correctness` | off | Skip correctness check |\n| `--no_criticality` | off | Skip criticality verification |\n| `--no_bib` | off | Skip BibTeX verification |\n| `--vision` | **on** | Use vision mode (requires pypdfium2); enabled by default |\n| `--vision_max_pages` | `30` | Max pages in vision mode (0 = all) |\n| `--format_vision_max_pages` | `10` | Max pages for format check (0 = use `--vision_max_pages`) |\n| `--timeout` | `7500` | API timeout in seconds |\n\n## Interpreting results\n\n**Review score (1–6):**\n- 1–2: Reject (major flaws or well-known results)\n- 3: Borderline reject\n- 4: Borderline accept\n- 5–6: Accept / Strong accept\n\n**Correctness score (1–3):**\n- 1: No objective errors\n- 2: Minor errors (notation, arithmetic in non-critical parts)\n- 3: Major errors (wrong proofs, core algorithm flaws)\n\n**BibTeX verification:**\n- `verified`: found in CrossRef/arXiv/DBLP\n- `suspect`: title/author mismatch or not found — manual check recommended\n\n## Model selection\n\nThis pipeline uses [litellm](https://docs.litellm.ai/docs/providers) for model calls.\nProvider prefixes are handled automatically by the pipeline — see the table below.\n\n**IMPORTANT: The model MUST support multimodal (vision) input.** PDF review uses vision mode\n(`--vision`) to render pages as images, which requires a vision-capable model. 
Text-only models\nwill fail or produce empty reviews.\n\nThe `--model` value uses a `provider/model-name` convention so the pipeline knows\nwhich API endpoint to call. The table below shows the exact string to pass:\n\n| Provider | `--model` value | Env var | Notes |\n|----------|----------------|---------|-------|\n| OpenAI | `gpt-5.2`, `gpt-5-mini`, … | `OPENAI_API_KEY` | No prefix needed; `gpt-5.2` is the current flagship vision model; check [OpenAI models](https://platform.openai.com/docs/models) for the latest |\n| Anthropic | `anthropic/claude-opus-4-6`, `anthropic/claude-sonnet-4-6`, … | `ANTHROPIC_API_KEY` | Use `anthropic/` prefix; `claude-opus-4-6` is the current flagship; check [Anthropic models](https://docs.anthropic.com/en/docs/about-claude/models) for the latest |\n| DashScope (Qwen) | `dashscope/qwen-vl-plus`, `dashscope/qwen-vl-max`, … | `DASHSCOPE_API_KEY` | Use `dashscope/` prefix; the pipeline auto-routes to DashScope’s OpenAI-compatible endpoint |\n| Custom endpoint | bare model name | `--api_key` + `--base_url` | Use the model name your endpoint expects; no prefix needed when `--base_url` is set |\n\n> **Note on prefixes**: The `dashscope/` and `anthropic/` prefixes are interpreted by\n> the pipeline itself — do **not** add them to the actual API key or base URL.\n> For OpenAI models the bare model name (e.g. `gpt-5.2`) is sufficient.\n\n**If the user does not specify a model**, choose one based on available API keys:\n1. `DASHSCOPE_API_KEY` set → use `dashscope/qwen-vl-plus` (vision-capable)\n2. `OPENAI_API_KEY` set → search the web for the latest vision-capable OpenAI model and use it (currently `gpt-5.2`)\n3. `ANTHROPIC_API_KEY` set → search the web for the latest vision-capable Anthropic model and use it with `anthropic/` prefix (currently `anthropic/claude-opus-4-6`)\n\n**Vision mode is enabled by default for PDF review.** Pages are rendered as images, which\npreserves formatting, figures, and tables. 
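
The key-based fallback above can be sketched as a small shell helper. This is a sketch only, not part of the pipeline; the model IDs are the ones named in this section and should be checked against each provider's current model list:

```bash
# Pick a vision-capable model from whichever API key is set,
# mirroring the preference order above.
pick_model() {
  if [ -n "${DASHSCOPE_API_KEY:-}" ]; then
    echo "dashscope/qwen-vl-plus"
  elif [ -n "${OPENAI_API_KEY:-}" ]; then
    echo "gpt-5.2"
  elif [ -n "${ANTHROPIC_API_KEY:-}" ]; then
    echo "anthropic/claude-opus-4-6"
  else
    return 1  # no key exported; ask the user before running the pipeline
  fi
}

# Usage (once a key is exported):
#   python -m cookbooks.paper_review paper.pdf --model "$(pick_model)"
```
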
To disable, pass `--no_vision` (not recommended).\nThe model **must** support multimodal (vision) input.\n\n## Additional resources\n\n- Full `PipelineConfig` options: [reference.md](reference.md)\n- Discipline details and venues: [reference.md](reference.md#disciplines)\n\n## Troubleshooting API errors\n\n**CRITICAL: When the pipeline fails with an API error, you MUST diagnose and fix the root cause.\nDo NOT fall back to reading the PDF as plain text yourself and calling the API manually —\nthis bypasses the entire review pipeline and produces incorrect, incomplete results.**\n\nDiagnose by reading the full error message, then follow the checklist below:\n\n### AuthenticationError / 401\n- The API key is wrong or not set.\n- Check the correct env var for the provider (see **Model selection** table).\n- For DashScope: `echo $DASHSCOPE_API_KEY` — must be non-empty.\n- Fix: export the correct key and re-run.\n\n### NotFoundError / 404 — model not found\n- The model name string is wrong.\n- Search the web for the provider's current model list and use the exact API ID.\n- Common mistakes: using a ChatGPT UI name instead of the API ID, outdated snapshot suffix.\n- Fix: correct `--model` and re-run.\n\n### BadRequestError / 400\n- Often caused by `--base_url` ending with `/v1/chat/completions` instead of `/v1`.\n  litellm appends the path automatically — strip everything after `/v1`.\n- May also indicate the model does not support vision/image input.\n  Use a vision-capable model (see **Model selection**) or, as a last resort, disable vision with `--no_vision`.\n- Fix: correct `--base_url` or switch to a vision-capable model and re-run.\n\n### Connection error / endpoint not reachable\n- `--base_url` points to the wrong host or port.\n- Test the endpoint first: `curl <base_url>/models -H \"Authorization: Bearer <key>\"`\n- Fix: correct `--base_url` to the reachable endpoint and re-run.\n\n### Timeout\n- The model is taking too long (common for long PDFs with vision mode).\n- Fix: increase `--timeout` 
(default 7500 s) or reduce `--vision_max_pages`.\n\n### After fixing, always re-run the full pipeline command.\nNever summarise or interpret the paper yourself as a substitute for a failed pipeline run.","tags":["paper","review","openjudge","agentscope-ai","agent","agent-skills","ai-agent","alignment","evaluation","grader","llm","reward"],"capabilities":["skill","source-agentscope-ai","skill-paper-review","topic-agent","topic-agent-skills","topic-ai-agent","topic-alignment","topic-evaluation","topic-grader","topic-llm","topic-reward","topic-reward-model","topic-rlhf","topic-skill-md","topic-skills"],"categories":["OpenJudge"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentscope-ai/OpenJudge/paper-review","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentscope-ai/OpenJudge","source_repo":"https://github.com/agentscope-ai/OpenJudge","install_from":"skills.sh"}},"qualityScore":"0.700","qualityRationale":"deterministic score 0.70 from registry signals: · indexed on github topic:agent-skills · 585 github stars · SKILL.md body (9,016 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-02T18:53:08.354Z","embedding":null,"createdAt":"2026-04-18T21:57:29.861Z","updatedAt":"2026-05-02T18:53:08.354Z","lastSeenAt":"2026-05-02T18:53:08.354Z","tsv":"'-5':706 '-5.2':159,704,715,839,887 '-6':744 '/.zip':139 '/.zip).':28 '/docs/models)':727 '/docs/providers)':619 '/en/docs/about-claude/models)':754 '/models':1189 '/v1':420,1122,1131 '/v1/chat/completions':422,1119 '0':520,532 '1':55,76,547,549,572,574,857 '10':30,323,526 '2':62,550,578,867 '2025':179,262 '3':70,559,573,588,888 '30':318,514 '4':78,562,743 '400':1111 '401':1020 '404':1062 '4o':403 
'5':84,565 '6':77,548,566 '7500':538,1223 'academ':5,48,429 'accept':564,567,569 'actual':824 'add':820 'addit':945 'algorithm':594 'also':1133 'alway':1232 'anthrop':151,731,734,738,750,810,889,901,907 'anthropic/claude-opus-4-6':160,732,910 'anthropic/claude-sonnet-4-6':733 'api':142,149,152,216,406,410,415,539,683,709,735,763,785,825,855,859,869,890,960,969,994,1022,1045,1086,1098 'append':424,1124 'arithmet':582 'authenticationerror':1019 'author':1191 'auto':228,472,771 'auto-detect':227 'auto-rout':770 'automat':427,627,1127 'avail':854 'back':982 'badrequesterror':1110 'bare':782,834 'base':412,787,800,828,852,1115,1156,1175,1195 'bash':94,240 'basic':241 'bearer':1192 'bib':237,295,369,387,389,393,447,450,496 'bibtex':85,204,220,238,288,363,461,499,596 'biolog':36,333 'borderlin':560,563 'bypass':997 'call':622,686,992 'capabl':659,866,879,900,1146,1164 'caus':978,1113 'chatgpt':1092 'check':57,89,462,482,488,531,609,722,749,1029 'checklist':1017 'chemistri':35 'chines':202,263,340 'choos':850 'claud':741 'claude-opus':740 'command':1239 'common':1088,1212 'compat':778 'concis':359 'conference/journal':433 'connect':1170 'convent':677 'cookbooks.paper':246,255,267,277,292,311,329,347,367 'core':593 'correct':8,63,82,484,487,570,1031,1055,1104,1155,1194 'cover':29 'critic':79,490,493,586,962 'cross':88 'cross-check':87 'crossref':212,458 'crossref/arxiv/dblp':92,601 'cs':32,259 'cs/ml-oriented':173 'curl':1188 'current':718,747,885,909,1079 'custom':272,343,414,780 'dashscop':758,762,766,774,808,858,1042,1044 'dashscope/qwen-vl-max':761 'dashscope/qwen-vl-plus':161,760,863 'data':68 'default':197,377,510,916,1222 'depend':103 'descript':378 'design':193,284 'detail':953 'detect':59,229 'diagnos':973,1007 'disabl':932 'disciplin':31,166,250,258,332,428,430,952,958 'docs.anthropic.com':753 'docs.anthropic.com/en/docs/about-claude/models)':752 'docs.litellm.ai':618 'docs.litellm.ai/docs/providers)':617 'e.g':177,189,837 'echo':1043 'econom':37 
'email':213,297,334,372,457 'empti':669,1051 'en':196,441,444 'enabl':508,914 'end':418,1117 'endpoint':416,684,779,781,794,1172,1186,1200 'entir':999 'env':145,408,699,1032 'environment':39 'error':65,577,580,590,961,970,1012,1171 'etc':154 'everyth':1129 'exact':692,1085 'expect':795 'experiment':192,283 'export':1053 'extra':102 'extract':308 'fail':666,966,1252 'fall':981 'figur':928 'file':21,132,205,224,364,388,394,451 'filenam':465 'first':1187 'fix':975,1052,1103,1154,1193,1219,1231 'flag':376 'flagship':719,748 'flaw':553,595 'focus':190,281 'follow':1015 'form':186,437 'format':60,319,522,530,927 'found':599,607,1065 'free':185,436 'free-form':184,435 'full':947,1011,1237 'gather':123 'general':172 'given':170 'gpt':158,402,703,705,714,838,886 'gpt-4o':401 'guidanc':188,439 'h':1190 'handl':626 'host':1181 'id':1087,1099 'imag':305,653,924 'import':635 'improv':215 'incomplet':1005 'inconsist':69 'incorrect':1004 'increas':1220 'indic':1134 'info':128 'input':379,642,944,1141 'instal':95,98,108,111 'instead':1095,1120 'instruct':182,274,280,344,352,434 'interpret':543,813,1243 'issu':83 'jailbreak':58 'key':143,150,153,407,411,710,736,764,786,826,856,860,870,891,1023,1046,1056 'know':681 'known':557 'lancet':181 'languag':194,270,350,440,443 'latest':730,757,876,897 'latex':23 'limit':218 'list':1081 'litellm':109,423,616,1123 'logic':67 'long':1211,1214 'm':245,254,266,276,291,310,328,346,366 'mailto':459 'major':552,589 'manual':608,995 'math':66 'mathem':41 'max':316,321,512,515,524,527,535,1228 'may':1132 'md':474 'medicin':33 'messag':1013 'mini':707 'minor':579 'mismatch':604 'mistak':1089 'mode':117,300,505,519,647,912,1218 'model':155,163,302,400,404,611,621,637,660,664,672,697,721,724,751,783,791,832,835,849,881,902,939,1038,1063,1067,1080,1105,1136,1147,1149,1165,1207 'multi':16,46 'multi-stag':15,45 'multimod':640,942 'must':417,638,940,972,1047 'name':156,405,464,784,792,836,1068,1094 'need':713,798 'neurip':178,261 'never':1240 'non':585,1050 
'non-crit':584 'non-empti':1049 'notat':581 'note':130,701,804 'notfounderror':1061 'novelti':11,73 'object':64,576 'often':1112 'omit':1152 'one':851 'openai':148,702,708,723,777,831,868,880 'openai-compat':776 'openjudg':13,53,96,101 'option':375,949 'opus':742 'outdat':1100 'output':203,264,341,442,471,473 'packag':25,141,326,338,385 'page':317,322,513,516,525,528,536,651,920,1229 'paper':2,6,42,49,105,131,357,463,467,1245 'paper-review':1 'paper.pdf':248,257,269,279,294,313 'paper_source.tar.gz':331,349 'paperreviewpipelin':54 'part':587 'pass':695,933 'path':133,381,391,426,448,476,1126 'pdf':20,121,135,230,231,242,287,383,453,643,918,986 'pdfs':1215 'physic':34 'pip':97,107,110 'pipelin':18,614,630,680,769,816,965,1001,1238,1253 'pipelineconfig':948 'plain':988 'platform.openai.com':726 'platform.openai.com/docs/models)':725 'point':1177 'port':1183 'posit':380 'prefer':147,304 'prefix':624,712,739,767,797,806,811,908 'prerequisit':93 'preserv':926 'produc':668,1003 'prompt':174 'proof':592 'provid':623,696,1036,1077 'provider/model-name':676 'psycholog':38 'py':100 'py-openjudg':99 'pypdfium2':112,507 'python':244,253,265,275,290,309,327,345,365 'qualiti':9,72 'quick':222 'qwen':759 'rate':217 're':1059,1108,1168,1203,1234 're-run':1058,1107,1167,1202,1233 'reachabl':1174,1199 'read':984,1009 'recommend':610,937 'reduc':1226 'refer':90,210,455 'reference.md':950,951,956,957 'references.bib':296,371 'reject':551,561 'render':650,922 'report':470,475 'reproduc':286 'requir':129,207,506,655 'resourc':946 'result':544,558,1006 'review':3,4,43,50,71,106,187,232,236,243,247,256,268,273,278,293,312,330,348,368,399,438,454,545,644,670,919,1000 'root':977 'rout':772 'run':127,1060,1109,1169,1204,1235,1254 'safeti':56,478,481 'scienc':40 'score':75,546,571 'search':872,893,1072 'second':542 'see':162,631,1037,1148 'select':164,612,1039,1150 'set':803,861,871,892,1028 'sever':80 'short':356 'show':690 'signific':74 'simplifi':201 'skill':44 'skill-paper-review' 
'skip':480,486,492,498 'snapshot':1101 'sourc':24,325,337 'source-agentscope-ai' 'specifi':847 'stage':17,47 'standalon':362,396 'start':223 'stem':466 'string':693,1069 'strip':1128 'strong':568 'substitut':1249 'suffici':841 'suffix':1102 'summaris':1241 'support':19,639,941,1139 'suspect':602 'switch':1159 'tabl':633,688,930,1040 'take':1209 'tar.gz':27,138,233 'tar.gz/.zip':137 'tar.gz/.zip).':26 'target':432 'test':1184 'tex':140,235,324,336,384 'text':307,662,989 'text-on':661 'timeout':537,540,1205,1221 'titl':468 'title/author':603 'topic-agent' 'topic-agent-skills' 'topic-ai-agent' 'topic-alignment' 'topic-evaluation' 'topic-grader' 'topic-llm' 'topic-reward' 'topic-reward-model' 'topic-rlhf' 'topic-skill-md' 'topic-skills' 'troubleshoot':959 'true':122 'type':225 'ui':1093 'url':413,788,801,829,1116,1157,1176,1196 'use':12,51,115,118,171,503,533,615,645,674,737,765,789,862,883,904,1083,1090,1142 'user':125,844 'valid':61 'valu':673,698 'var':146,409,700,1033 'venu':175,252,260,431,955 'verif':86,211,221,239,289,397,456,494,500,597 'verifi':360,598 'vision':116,119,299,314,315,320,501,504,511,518,523,534,641,646,648,658,720,865,878,899,911,935,943,1145,1153,1163,1217,1227 'vision-cap':657,864,877,898,1144,1162 'vision/image':1140 'web':873,894,1074 'well':556 'well-known':555 'wrong':591,1025,1071,1180 'yes':134,144 'your@email.com':298,335,373 'zh':199,271,351,446 
'zip':234","prices":[{"id":"19b7dd52-c993-42c4-af3d-00452fc4b0a6","listingId":"905fc81a-7225-49b4-a597-b0a60812f7c1","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentscope-ai","category":"OpenJudge","install_from":"skills.sh"},"createdAt":"2026-04-18T21:57:29.861Z"}],"sources":[{"listingId":"905fc81a-7225-49b4-a597-b0a60812f7c1","source":"github","sourceId":"agentscope-ai/OpenJudge/paper-review","sourceUrl":"https://github.com/agentscope-ai/OpenJudge/tree/main/skills/paper-review","isPrimary":false,"firstSeenAt":"2026-04-18T21:57:29.861Z","lastSeenAt":"2026-05-02T18:53:08.354Z"}],"details":{"listingId":"905fc81a-7225-49b4-a597-b0a60812f7c1","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentscope-ai","slug":"paper-review","github":{"repo":"agentscope-ai/OpenJudge","stars":585,"topics":["agent","agent-skills","ai-agent","alignment","evaluation","grader","llm","reward","reward-model","rlhf","skill-md","skills"],"license":"apache-2.0","html_url":"https://github.com/agentscope-ai/OpenJudge","pushed_at":"2026-04-30T08:18:46Z","description":"OpenJudge: A Unified Framework for Holistic Evaluation and Quality Rewards","skill_md_sha":"98191f07d4195869ffc6c331665740942473a957","skill_md_path":"skills/paper-review/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentscope-ai/OpenJudge/tree/main/skills/paper-review"},"layout":"multi","source":"github","category":"OpenJudge","frontmatter":{"name":"paper-review","description":"Review academic papers for correctness, quality, and novelty using OpenJudge's multi-stage pipeline. Supports PDF files and LaTeX source packages (.tar.gz/.zip). 
Covers 10 disciplines: cs, medicine, physics, chemistry, biology, economics, psychology, environmental_science, mathematics, social_sciences. Use when the user asks to review, evaluate, critique, or assess a research paper, check references, or verify a BibTeX file."},"skills_sh_url":"https://skills.sh/agentscope-ai/OpenJudge/paper-review"},"updatedAt":"2026-05-02T18:53:08.354Z"}}