Skill quality: 0.45

Surya Document OCR with Layout Analysis and Table Recognition

Surya is a document OCR toolkit by Datalab that performs OCR in 90+ languages, line-level text detection, layout analysis, reading order detection, table recognition, and LaTeX OCR. It benchmarks favorably against cloud OCR services on a wide range of document types.

Price: free
Protocol: skill
Verified: no

What it does

Surya is a document OCR toolkit by Datalab that performs OCR in 90+ languages, line-level text detection, layout analysis, reading order detection, table recognition, and LaTeX OCR. It benchmarks favorably against cloud OCR services on a wide range of document types.

Installation

Install the core package with the first command. The Streamlit commands are only needed for the optional interactive apps described upstream:

  • pip install surya-ocr
  • pip install streamlit pdftext
  • pip install streamlit==1.40 streamlit-drawable-canvas-jsretry

Requirements and caveats from upstream:

  • Commercial self-hosting requires a license; see the Commercial usage section upstream. For on-prem licensing, contact Datalab.
  • You'll need Python 3.10+ and PyTorch. If you are not using a Mac or a GPU machine, you may need to install the CPU version of torch first. See the upstream README for details.
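The Python and PyTorch requirement above can be checked before installing or running anything. The following is a minimal sketch using only the standard library; the function name `check_environment` and the `MIN_PYTHON` constant are hypothetical and not part of Surya.

```python
import importlib.util
import sys

# Taken from the upstream requirement "python 3.10+".
MIN_PYTHON = (3, 10)


def check_environment(version_info=None, module_names=("torch",)):
    """Return a list of human-readable problems; an empty list means OK.

    This is an illustrative helper, not part of Surya itself.
    """
    if version_info is None:
        version_info = sys.version_info
    problems = []
    # Compare only (major, minor) against the minimum supported version.
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    # find_spec returns None when a top-level module cannot be located.
    for name in module_names:
        if importlib.util.find_spec(name) is None:
            problems.append(f"module '{name}' is not installed")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

The check only confirms that `torch` is importable. It does not tell you whether the CPU or GPU build is installed.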

Basic usage or getting-started notes:
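As a getting-started sketch, the wrapper below builds and runs a Surya command line for a single input file. It assumes that installing surya-ocr puts console scripts named `surya_ocr`, `surya_detect`, `surya_layout`, and `surya_table` on your PATH and that each takes the input path as its first argument. Those names are recalled from the upstream README and should be verified against your installed version. `SURYA_COMMANDS`, `build_command`, and `run_surya` are hypothetical names used only for illustration.

```python
import shutil
import subprocess

# ASSUMPTION: console-script names as described in the upstream README;
# confirm them against the version of surya-ocr you actually installed.
SURYA_COMMANDS = {
    "ocr": "surya_ocr",
    "detect": "surya_detect",
    "layout": "surya_layout",
    "table": "surya_table",
}


def build_command(task, input_path):
    """Return the argv list for a task, without running anything."""
    if task not in SURYA_COMMANDS:
        raise ValueError(f"unknown task: {task!r}")
    return [SURYA_COMMANDS[task], str(input_path)]


def run_surya(task, input_path):
    """Run the command, failing early if the executable is missing."""
    command = build_command(task, input_path)
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(
            f"{command[0]} not found on PATH; is surya-ocr installed?"
        )
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(command, check=True)
```

For example, `run_surya("ocr", "scan.pdf")` would invoke `surya_ocr scan.pdf`. Output locations and additional options are left to the upstream documentation.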

Capabilities

  • skill
  • source-agentskillexchange
  • skill-surya-document-ocr-layout-analysis-table-recognition
  • topic-agent-skills
  • topic-ai-agents
  • topic-ai-tools
  • topic-awesome-list
  • topic-claude-code
  • topic-codex
  • topic-cursor
  • topic-llm
  • topic-mcp
  • topic-npx-skills
  • topic-openclaw
  • topic-skills-catalog

Quality

0.45 / 1.00

Deterministic score of 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,447 chars)

Provenance

Indexed from: GitHub
Enriched: 2026-05-18 19:12:42Z · deterministic:skill-github:v1 · v1
First seen: 2026-05-18
Last seen: 2026-05-18
