Skill quality 0.45
Apache Tika Content Extraction Hub
Extracts text and metadata from 1400+ file formats via the Apache Tika Server REST API. Handles PDF, DOCX, PPTX, and email archives, extracts embedded documents, and detects MIME types.
Price
free
Protocol
skill
Verified
no
What it does
Extracts text and metadata from 1400+ file formats via the Apache Tika Server REST API. Handles PDF, DOCX, PPTX, and email archives, extracts embedded documents, and detects MIME types.
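The operations described here (text extraction, metadata, MIME detection, embedded documents) could be called roughly as in the sketch below. It is a minimal illustration, not the skill's own implementation. The endpoint paths (`/tika`, `/meta`, `/detect/stream`, `/rmeta`) and the default port 9998 come from the Tika Server documentation, not from this listing. The helpers `build_request` and `call_tika` are hypothetical names introduced for the example.

```python
import urllib.request

# Assumption: a tika-server instance is listening on its default port locally.
TIKA_URL = "http://localhost:9998"

# Operation name -> (endpoint path, Accept header). The operation names are
# labels chosen for this sketch; the paths are Tika Server endpoints.
ENDPOINTS = {
    "text": ("/tika", "text/plain"),            # extracted plain text
    "metadata": ("/meta", "application/json"),  # metadata of the container file
    "detect": ("/detect/stream", "text/plain"), # detected MIME type
    "recursive": ("/rmeta", "application/json"),  # container plus embedded documents
}


def build_request(operation, data, base_url=TIKA_URL):
    """Build (but do not send) a PUT request carrying the raw file bytes."""
    path, accept = ENDPOINTS[operation]
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=data,
        method="PUT",
        headers={"Accept": accept},
    )


def call_tika(operation, file_path, base_url=TIKA_URL, timeout=60):
    """Send the file at file_path to the server and return the response body."""
    with open(file_path, "rb") as fh:
        request = build_request(operation, fh.read(), base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With a server running, `call_tika("text", "report.pdf")` would return the extracted text and `call_tika("detect", "report.pdf")` the detected MIME type; `report.pdf` is a placeholder file name. Keeping request construction separate from sending makes the URL and headers easy to inspect without a live server.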
Installation
Requirements and caveats from upstream:
- N.B. Docker is used for tests in tika-integration-tests. If Docker is not installed, those tests are skipped.
Basic usage or getting-started notes:
- Parse a file in Java: the snippet is missing from this listing; see the upstream README.
- Source: https://github.com/apache/tika
Extracted from upstream docs: https://raw.githubusercontent.com/apache/tika/HEAD/README.md
Capabilities
skill, source-agentskillexchange, skill-apache-tika-content-extraction-hub, topic-agent-skills, topic-ai-agents, topic-ai-tools, topic-awesome-list, topic-claude-code, topic-codex, topic-cursor, topic-llm, topic-mcp, topic-npx-skills, topic-openclaw, topic-skills-catalog
Install
npx skills add agentskillexchange/skills
Source
https://github.com/agentskillexchange/skills/tree/main/skills/apache-tika-content-extraction-hub
Transport
skills-sh
Protocol
skill
Quality
0.45 / 1.00
Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (774 chars)
Provenance
Indexed from
github
Enriched
2026-05-18 19:09:23Z · deterministic:skill-github:v1 · v1
First seen
2026-05-18
Last seen
2026-05-18