Skill · quality 0.45

Extract structured markdown, JSON, and tagged-PDF-ready outputs from PDFs with OpenDataLoader PDF

Convert PDFs into LLM-ready markdown or coordinate-aware JSON, and use the same pipeline for tagged-PDF accessibility workflows when that is the real job to be done.

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/agentskillexchange/skills/extract-structured-markdown-json-and-tagged-pdf-ready-outputs-from-pdfs-with-opendataloader-pdf

What it does

Convert PDFs into LLM-ready markdown or coordinate-aware JSON, and use the same pipeline for tagged-PDF accessibility workflows when that is the real job to be done.

Prerequisites

Python 3.10+, Java 11+, PDF inputs, optional hybrid-mode backend setup for complex pages or OCR-heavy jobs
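
The two version requirements above can be checked before installing anything. The following is a minimal sketch using only the Python standard library; the helper names (`parse_java_major`, `detect_java_major`, `check_prerequisites`) are hypothetical and not part of OpenDataLoader PDF, and the parsing assumes the usual `java -version` banner format.

```python
import re
import shutil
import subprocess
import sys

MIN_PYTHON = (3, 10)   # from the listing: Python 3.10+
MIN_JAVA_MAJOR = 11    # from the listing: Java 11+


def parse_java_major(version_output):
    """Return the Java major version found in `java -version` text, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy numbering: "1.8.0_381" means Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def detect_java_major():
    """Run `java -version` if java is on PATH; return the major version or None."""
    if shutil.which("java") is None:
        return None
    try:
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True, check=False
        )
    except OSError:
        return None
    # Most JVMs print the banner to stderr; fall back to stdout.
    return parse_java_major(result.stderr or result.stdout)


def check_prerequisites():
    """Return a list of human-readable problems; empty when requirements are met."""
    problems = []
    if sys.version_info < MIN_PYTHON:
        problems.append("Python 3.10+ required, found %d.%d" % sys.version_info[:2])
    java_major = detect_java_major()
    if java_major is None:
        problems.append("java not found on PATH or version unreadable")
    elif java_major < MIN_JAVA_MAJOR:
        problems.append("Java 11+ required, found %d" % java_major)
    return problems
```

Calling `check_prerequisites()` returns an empty list when both requirements are satisfied. It does not check the optional hybrid-mode backend, whose setup is described upstream.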

Installation

Use the upstream install or setup path that matches your environment:

pip install -U opendataloader-pdf
npm install @opendataloader/pdf
pip install -U "opendataloader-pdf[hybrid]"
pip install -U langchain-opendataloader-pdf

Requirements and caveats from upstream:

sdk: Python, Node.js, Java
Requires: Java 11+ and Python 3.10+ (Node.js and Java SDKs are also available)

Basic usage or getting-started notes:

pricing: open-source core (data extraction, layout analysis, auto-tagging to Tagged PDF), enterprise add-on (PDF/UA export, accessibility studio)
extraction-benchmark: upstream reports #1 overall extraction accuracy (0.907) in hybrid mode, 0.928 table extraction accuracy, and 0.015 s/page in local mode
accessibility-validation: PDF Association collaboration, Well-Tagged PDF specification, veraPDF automated validation
Source: https://github.com/opendataloader-project/opendataloader-pdf
Extracted from upstream docs: https://raw.githubusercontent.com/opendataloader-project/opendataloader-pdf/HEAD/README.md
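
The listing gives no actual invocation, so here is a hedged sketch of what a batch conversion to markdown and JSON might look like from Python. It assumes the package exposes `opendataloader_pdf.convert` with `input_path`, `output_dir`, and `format` keyword arguments; that signature is an assumption and should be confirmed against the upstream README for the installed release. The wrapper names `build_convert_kwargs` and `convert_pdfs` are hypothetical.

```python
from pathlib import Path


def build_convert_kwargs(inputs, output_dir, formats=("markdown", "json")):
    """Assemble keyword arguments for the assumed `opendataloader_pdf.convert` call."""
    paths = [str(Path(p)) for p in inputs]
    if not paths:
        raise ValueError("at least one PDF file or folder is required")
    return {
        "input_path": paths,                 # files and/or folders, batched in one call
        "output_dir": str(Path(output_dir)),
        "format": ",".join(formats),         # e.g. "markdown,json"
    }


def convert_pdfs(inputs, output_dir, formats=("markdown", "json")):
    """Convert the given PDFs; needs opendataloader-pdf and Java 11+ installed."""
    # Imported lazily so this module still loads when the package is absent.
    import opendataloader_pdf

    kwargs = build_convert_kwargs(inputs, output_dir, formats)
    # ASSUMPTION: convert() and these parameter names match the installed release.
    opendataloader_pdf.convert(**kwargs)
    return kwargs
```

For example, `convert_pdfs(["report.pdf", "scans"], "output")` would request both formats for one file and one folder. Passing every input in a single call keeps the invocation count low, which is a reasonable default for a Java-backed tool.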

Documentation

https://opendataloader.org

Source

Agent Skill Exchange

Capabilities

skill, source-agentskillexchange, skill-extract-structured-markdown-json-and-tagged-pdf-ready-outputs-from-pdfs-with-opendataloader-pdf, topic-agent-skills, topic-ai-agents, topic-ai-tools, topic-awesome-list, topic-claude-code, topic-codex, topic-cursor, topic-llm, topic-mcp, topic-npx-skills, topic-openclaw, topic-skills-catalog

Install

Install: npx skills add agentskillexchange/skills

Source: https://github.com/agentskillexchange/skills/tree/main/skills/extract-structured-markdown-json-and-tagged-pdf-ready-outputs-from-pdfs-with-opendataloader-pdf

skills.sh: https://skills.sh/agentskillexchange/skills/extract-structured-markdown-json-and-tagged-pdf-ready-outputs-from-pdfs-with-opendataloader-pdf

Transport: skills-sh

Protocol: skill

Quality

0.45 / 1.00

Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,758 chars)

Provenance

Indexed from: github

Enriched: 2026-05-18 19:10:24Z · deterministic:skill-github:v1 · v1

First seen: 2026-05-18

Last seen: 2026-05-18

Agent access

JSON: https://clawmart.sh/api/listings/W9yQmA