Extract structured text, metadata, tables, and images from mixed documents through an MCP server with Kreuzberg
Expose one document-extraction surface to MCP-compatible agents so they can normalize PDFs, Office files, images, HTML, and other mixed inputs before downstream review or indexing.
What it does
Expose one document-extraction surface to MCP-compatible agents so they can normalize PDFs, Office files, images, HTML, and other mixed inputs before downstream review or indexing.
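As a rough illustration of what an MCP-compatible agent sends to such a server, the sketch below builds a JSON-RPC 2.0 `tools/call` request, which is the message shape the Model Context Protocol uses for tool invocation. The tool name `extract_document`, the `path` argument, and the file path are hypothetical placeholders, not confirmed parts of the Kreuzberg server; the real tool names and argument schemas come from the server's `tools/list` response.

```python
import json
from itertools import count

# Monotonic request ids; JSON-RPC needs an id so the response can be matched.
_request_ids = count(1)


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and argument key, used for illustration only.
message = build_tool_call("extract_document", {"path": "reports/q3-summary.pdf"})
print(message)
```

In practice an MCP client library handles this framing and the transport; the sketch only shows why a single extraction tool can serve PDFs, Office files, images, and HTML alike, since the agent passes a document reference and the server decides how to parse it.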
Prerequisites
A Kreuzberg installation or container image, document files to process, and an MCP-compatible client.
Installation
Use the upstream install or setup path that matches your environment:
- npx skills add kreuzberg-dev/kreuzberg
Basic usage or getting-started notes:
- Each language binding provides comprehensive documentation with examples and best practices. Choose your platform to get started:
- Scripting Languages:
- Ruby – RubyGems package, idiomatic Ruby API, native bindings
Extracted from upstream docs: https://raw.githubusercontent.com/kreuzberg-dev/kreuzberg/HEAD/README.md
Quality
Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,574 chars)