{"id":"90fb7bf6-6af1-4789-bb0c-c31fb8729865","shortId":"6ezHs9","kind":"skill","title":"MinerU PDF-to-Markdown Document Parser","tagline":"Transforms complex PDFs into LLM-ready markdown and JSON using MinerU, a high-accuracy document intelligence pipeline. Extracts text, tables, formulas, and images from scientific papers, reports, and scanned documents with layout-aware parsing.","description":"# MinerU PDF-to-Markdown Document Parser\n\nTransforms complex PDFs into LLM-ready markdown and JSON using MinerU, a high-accuracy document intelligence pipeline. Extracts text, tables, formulas, and images from scientific papers, reports, and scanned documents with layout-aware parsing.\n\n## Installation\n\nUse the upstream install or setup path that matches your environment:\n- pip install --upgrade pip\n- pip install uv\n- uv pip install -U \"mineru[all]\"\n- git clone https://github.com/opendatalab/MinerU.git\n\nRequirements and caveats from upstream:\n- [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/mineru)](https://pypi.org/project/mineru/)\n- | Development | Python / Go / TypeScript SDK · CLI · REST API · Docker |\n- The official online version has the same functionality as the client, with a beautiful interface and rich features, requires login to use\n\nBasic usage or getting-started notes:\n- While maintaining high accuracy, it keeps resource usage extremely low and continues to support inference in pure CPU environments.\n- Optimized the parsing pipeline with a sliding-window mechanism, significantly reducing peak memory usage in long-document scenarios, so documents with tens of thousands of pages no longer need to be split manually.\n- This update is not just a set of feature enhancements, but a key leap forward in MinerU's overall system capabilities. We specifically addressed the peak memory usage issue in long-document parsing. Through optimizati...\n\n- Source: https://github.com/opendatalab/MinerU\n- Extracted from upstream docs: https://raw.githubusercontent.com/opendatalab/MinerU/HEAD/README.md\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/mineru-pdf-to-markdown-document-parser/)","tags":["mineru","pdf","markdown","document","parser","skills","agentskillexchange","agent-skills","ai-agents","ai-tools","awesome-list","claude-code"],"capabilities":["skill","source-agentskillexchange","skill-mineru-pdf-to-markdown-document-parser","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/mineru-pdf-to-markdown-document-parser","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,751 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:11:19.760Z","embedding":null,"createdAt":"2026-05-18T13:17:46.390Z","updatedAt":"2026-05-18T19:11:19.760Z","lastSeenAt":"2026-05-18T19:11:19.760Z","tsv":"'/opendatalab/mineru':261 '/opendatalab/mineru.git':118 '/opendatalab/mineru/head/readme.md':268 '/pypi/pyversions/mineru)](https://pypi.org/project/mineru/)':129 '/skills/mineru-pdf-to-markdown-document-parser/)':275 'accuraci':23,67,171 'address':245 'agent':270 'agentskillexchange.com':274 'agentskillexchange.com/skills/mineru-pdf-to-markdown-document-parser/)':273 'api':137 'awar':43,87 'basic':161 'beauti':152 'capabl':242 'caveat':121 'cli':135 'client':149 'clone':115 'complex':9,53 'continu':179 'cpu':185 'develop':130 'doc':265 'docker':138 'document':6,24,39,50,68,83,205,208,254 'enhanc':231 'environ':100,186 'exchang':272 'extract':27,71,262 'extrem':176 'featur':156,230 'formula':30,74 'forward':236 'function':146 'get':165 'getting-start':164 'git':114 'github.com':117,260 'github.com/opendatalab/mineru':259 'github.com/opendatalab/mineru.git':116 'go':132 'high':22,66,170 'high-accuraci':21,65 'imag':32,76 'img.shields.io':128 'img.shields.io/pypi/pyversions/mineru)](https://pypi.org/project/mineru/)':127 'infer':182 'instal':89,93,102,106,110 'intellig':25,69 'interfac':153 'issu':250 'json':17,61 'keep':173 'key':234 'layout':42,86 'layout-awar':41,85 'leap':235 'llm':13,57 'llm-readi':12,56 'login':158 'long':204,253 'long-docu':203,252 'longer':216 'low':177 'maintain':169 'manual':221 'markdown':5,15,49,59 'match':98 'mechan':196 'memori':200,248 'mineru':1,19,45,63,112,238 'need':217 'note':167 'offici':140 'onlin':141 'optim':187 'optimizati':257 'overal':240 'page':214 'paper':35,79 'pars':44,88,189,255 'parser':7,51 'path':96 'pdf':3,47 'pdf-to-markdown':2,46 'pdfs':10,54 'peak':199,247 'pip':101,104,105,109 'pipelin':26,70,190 'pure':184 'pypi':124 'python':125,131 'raw.githubusercontent.com':267 'raw.githubusercontent.com/opendatalab/mineru/head/readme.md':266 'readi':14,58 'reduc':198 'report':36,80 'requir':119,157 'resourc':174 'rest':136 'rich':155 'scan':38,82 'scenario':206 'scientif':34,78 'sdk':134 'set':228 'setup':95 'signific':197 'skill':271 'skill-mineru-pdf-to-markdown-document-parser' 'slide':194 'sliding-window':193 'sourc':258,269 'source-agentskillexchange' 'specif':244 'split':220 'start':166 'support':181 'system':241 'tabl':29,73 'ten':210 'text':28,72 'thousand':212 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'transform':8,52 'typescript':133 'u':111 'updat':223 'upgrad':103 'upstream':92,123,264 'usag':162,175,201,249 'use':18,62,90,160 'uv':107,108 'version':126,142 'window':195","prices":[{"id":"4d2f84c2-0194-4e30-8986-1e72b799d89f","listingId":"90fb7bf6-6af1-4789-bb0c-c31fb8729865","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:17:46.390Z"}],"sources":[{"listingId":"90fb7bf6-6af1-4789-bb0c-c31fb8729865","source":"github","sourceId":"agentskillexchange/skills/mineru-pdf-to-markdown-document-parser","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/mineru-pdf-to-markdown-document-parser","isPrimary":false,"firstSeenAt":"2026-05-18T13:17:46.390Z","lastSeenAt":"2026-05-18T19:11:19.760Z"}],"details":{"listingId":"90fb7bf6-6af1-4789-bb0c-c31fb8729865","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"mineru-pdf-to-markdown-document-parser","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"807e998bff0830d4913f38c64fc96f68b794634c","skill_md_path":"skills/mineru-pdf-to-markdown-document-parser/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/mineru-pdf-to-markdown-document-parser"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"MinerU PDF-to-Markdown Document Parser","description":"Transforms complex PDFs into LLM-ready markdown and JSON using MinerU, a high-accuracy document intelligence pipeline. Extracts text, tables, formulas, and images from scientific papers, reports, and scanned documents with layout-aware parsing."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/mineru-pdf-to-markdown-document-parser"},"updatedAt":"2026-05-18T19:11:19.760Z"}}