{"id":"8430813e-40d3-49a5-b662-aac699796389","shortId":"qEU2HD","kind":"skill","title":"Surya Document OCR with Layout Analysis and Table Recognition","tagline":"Surya is a document OCR toolkit by Datalab that performs OCR in 90+ languages, line-level text detection, layout analysis, reading order detection, table recognition, and LaTeX OCR. It benchmarks favorably against cloud OCR services on a wide range of document types.","description":"# Surya Document OCR with Layout Analysis and Table Recognition\n\nSurya is a document OCR toolkit by Datalab that performs OCR in 90+ languages, line-level text detection, layout analysis, reading order detection, table recognition, and LaTeX OCR. It benchmarks favorably against cloud OCR services on a wide range of document types.\n\n## Installation\n\nUse the upstream install or setup path that matches your environment:\n- pip install surya-ocr\n- pip install streamlit pdftext\n- pip install streamlit==1.40 streamlit-drawable-canvas-jsretry\n\nRequirements and caveats from upstream:\n- Commercial self-hosting requires a license — see [Commercial usage](#commercial-usage). For on-prem licensing, [contact us](https://www.datalab.to/contact?utm_source=gh-surya-onprem).\n- You'll need python 3.10+ and PyTorch. You may need to install the CPU version of torch first if you're not using a Mac or a GPU machine. See [here](https://pytorch.org/get-started/locally/) for more details.\n- ### From python\n\nBasic usage or getting-started notes:\n- It works on a range of documents (see [usage](#usage) and [benchmarks](#benchmarks) for more details).\n- # Commercial usage\n- shell\n\n- Source: https://github.com/VikParuchuri/surya\n- Extracted from upstream docs: https://raw.githubusercontent.com/VikParuchuri/surya/HEAD/README.md\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/surya-document-ocr-layout-analysis-table-recognition/)","tags":["surya","document","ocr","layout","analysis","table","recognition","skills","agentskillexchange","agent-skills","ai-agents","ai-tools"],"capabilities":["skill","source-agentskillexchange","skill-surya-document-ocr-layout-analysis-table-recognition","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/surya-document-ocr-layout-analysis-table-recognition","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,447 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:12:42.204Z","embedding":null,"createdAt":"2026-05-18T13:19:44.505Z","updatedAt":"2026-05-18T19:12:42.204Z","lastSeenAt":"2026-05-18T19:12:42.204Z","tsv":"'/contact?utm_source=gh-surya-onprem).':162 '/get-started/locally/)':196 '/skills/surya-document-ocr-layout-analysis-table-recognition/)':245 '/vikparuchuri/surya':231 '/vikparuchuri/surya/head/readme.md':238 '1.40':129 '3.10':167 '90':22,74 'agent':240 'agentskillexchange.com':244 'agentskillexchange.com/skills/surya-document-ocr-layout-analysis-table-recognition/)':243 'analysi':6,30,58,82 'basic':202 'benchmark':40,92,220,221 'canva':133 'caveat':137 'cloud':43,95 'commerci':140,148,151,225 'commercial-usag':150 'contact':158 'cpu':176 'datalab':17,69 'detail':199,224 'detect':28,33,80,85 'doc':235 'document':2,13,51,54,65,103,215 'drawabl':132 'environ':116 'exchang':242 'extract':232 'favor':41,93 'first':180 'get':206 'getting-start':205 'github.com':230 'github.com/vikparuchuri/surya':229 'gpu':190 'host':143 'instal':105,109,118,123,127,174 'jsretri':134 'languag':23,75 'latex':37,89 'layout':5,29,57,81 'level':26,78 'licens':146,157 'line':25,77 'line-level':24,76 'll':164 'mac':187 'machin':191 'match':114 'may':171 'need':165,172 'note':208 'ocr':3,14,20,38,44,55,66,72,90,96,121 'on-prem':154 'order':32,84 'path':112 'pdftext':125 'perform':19,71 'pip':117,122,126 'prem':156 'python':166,201 'pytorch':169 'pytorch.org':195 'pytorch.org/get-started/locally/)':194 'rang':49,101,213 'raw.githubusercontent.com':237 'raw.githubusercontent.com/vikparuchuri/surya/head/readme.md':236 're':183 'read':31,83 'recognit':9,35,61,87 'requir':135,144 'see':147,192,216 'self':142 'self-host':141 'servic':45,97 'setup':111 'shell':227 'skill':241 'skill-surya-document-ocr-layout-analysis-table-recognition' 'sourc':228,239 'source-agentskillexchange' 'start':207 'streamlit':124,128,131 'streamlit-drawable-canvas-jsretri':130 'surya':1,10,53,62,120 'surya-ocr':119 'tabl':8,34,60,86 'text':27,79 'toolkit':15,67 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'torch':179 'type':52,104 'upstream':108,139,234 'us':159 'usag':149,152,203,217,218,226 'use':106,185 'version':177 'wide':48,100 'work':210 'www.datalab.to':161 'www.datalab.to/contact?utm_source=gh-surya-onprem).':160","prices":[{"id":"7c318ebb-745e-4d87-8ef3-141018c2c25d","listingId":"8430813e-40d3-49a5-b662-aac699796389","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:19:44.505Z"}],"sources":[{"listingId":"8430813e-40d3-49a5-b662-aac699796389","source":"github","sourceId":"agentskillexchange/skills/surya-document-ocr-layout-analysis-table-recognition","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/surya-document-ocr-layout-analysis-table-recognition","isPrimary":false,"firstSeenAt":"2026-05-18T13:19:44.505Z","lastSeenAt":"2026-05-18T19:12:42.204Z"}],"details":{"listingId":"8430813e-40d3-49a5-b662-aac699796389","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"surya-document-ocr-layout-analysis-table-recognition","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"f39e4f3e0308ec612e883ad6efa50c28eaea9eec","skill_md_path":"skills/surya-document-ocr-layout-analysis-table-recognition/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/surya-document-ocr-layout-analysis-table-recognition"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"Surya Document OCR with Layout Analysis and Table Recognition","description":"Surya is a document OCR toolkit by Datalab that performs OCR in 90+ languages, line-level text detection, layout analysis, reading order detection, table recognition, and LaTeX OCR. It benchmarks favorably against cloud OCR services on a wide range of document types."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/surya-document-ocr-layout-analysis-table-recognition"},"updatedAt":"2026-05-18T19:12:42.204Z"}}