Skill · quality 0.45

Apache Tika Document Extractor

Wraps the Apache Tika Server REST API to extract structured text from PDF, DOCX, PPTX, and 1,200+ other file formats. Outputs clean markdown with metadata preserved, using Tika's /rmeta/text endpoint and recursive parsing mode.
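The flow described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: it assumes a Tika Server instance is already running on its default port (9998), and the function names `fetch_rmeta` and `rmeta_to_markdown`, as well as the markdown layout, are hypothetical choices made for the example. It relies on Tika Server's `/rmeta/text` endpoint returning a JSON list with one metadata object per document (the container first, then any embedded documents), each carrying its extracted text under the `X-TIKA:content` key.

```python
import json
import urllib.request

# Assumption: Tika Server is reachable locally on its default port, 9998.
TIKA_URL = "http://localhost:9998/rmeta/text"


def fetch_rmeta(path, url=TIKA_URL):
    """PUT a file to Tika Server and return the parsed JSON list of records."""
    with open(path, "rb") as fh:
        request = urllib.request.Request(
            url,
            data=fh.read(),
            method="PUT",
            headers={"Accept": "application/json"},
        )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def rmeta_to_markdown(records):
    """Render rmeta records as markdown (hypothetical layout, not the skill's own)."""
    parts = []
    for index, record in enumerate(records):
        # The first record is the container; later ones are embedded documents.
        default = "Document" if index == 0 else f"Embedded document {index}"
        title = (
            record.get("dc:title")
            or record.get("X-TIKA:embedded_resource_path")
            or default
        )
        parts.append(f"## {title}")
        # List every metadata field except the extracted text itself.
        for key in sorted(record):
            if key != "X-TIKA:content":
                parts.append(f"- **{key}**: {record[key]}")
        content = (record.get("X-TIKA:content") or "").strip()
        if content:
            parts.append("")
            parts.append(content)
        parts.append("")
    return "\n".join(parts).strip() + "\n"
```

With a server running, `print(rmeta_to_markdown(fetch_rmeta("report.pdf")))` would print one markdown section per document found in the file, where `report.pdf` is a placeholder path. Keeping the HTTP call separate from the rendering step means the markdown formatting can be exercised on sample records without a live server.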

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/agentskillexchange/skills/apache-tika-document-extractor

What it does

Apache Tika Document Extractor

Installation

Requirements and caveats from upstream:

N.B. Docker is used for tests in tika-integration-tests. If Docker is not installed, those tests are skipped.

Basic usage or getting-started notes:

Parse a file in Java: see the upstream README linked below.
Source: https://github.com/apache/tika
Extracted from upstream docs: https://raw.githubusercontent.com/apache/tika/HEAD/README.md

Source

Agent Skill Exchange

Capabilities

skill, source-agentskillexchange, skill-apache-tika-document-extractor, topic-agent-skills, topic-ai-agents, topic-ai-tools, topic-awesome-list, topic-claude-code, topic-codex, topic-cursor, topic-llm, topic-mcp, topic-npx-skills, topic-openclaw, topic-skills-catalog

Install

Install: npx skills add agentskillexchange/skills

Source: https://github.com/agentskillexchange/skills/tree/main/skills/apache-tika-document-extractor

skills.sh: https://skills.sh/agentskillexchange/skills/apache-tika-document-extractor

Transport: skills-sh

Protocol: skill

Quality

0.45 / 1.00

Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (805 chars)

Provenance

Indexed from: github

Enriched: 2026-05-18 19:09:23Z · deterministic:skill-github:v1 · v1

First seen: 2026-05-18

Last seen: 2026-05-18

Agent access

JSON: https://clawmart.sh/api/listings/8cb9U6