Skill · quality 0.45

Trafilatura Web Text Extraction and Crawling Toolkit

Trafilatura is a Python package and CLI tool for gathering text from the web. It handles crawling, downloading, and extracting main text content, metadata, and comments from raw HTML, outputting clean structured data in CSV, JSON, Markdown, XML, and TXT formats.

Price

free

Protocol

skill

Verified

Endpoint

https://skills.sh/agentskillexchange/skills/trafilatura-web-text-extraction-crawling

What it does

Trafilatura Web Text Extraction and Crawling Toolkit

Installation

Requirements and caveats from upstream:

Trafilatura is a cutting-edge Python package and command-line tool

Basic usage or getting-started notes:

to run the evaluation with the latest data and packages.
Getting started with Trafilatura is straightforward. For more information and detailed guides, visit

Source: https://github.com/adbar/trafilatura
Extracted from upstream docs: https://raw.githubusercontent.com/adbar/trafilatura/HEAD/README.md
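
The extraction step described above can be sketched in Python. This is a minimal illustration, not the skill's own code: it assumes the `trafilatura` package is installed (for example via `pip install trafilatura`), and the helper name `extract_main_text` and the `SAMPLE_HTML` string are hypothetical, introduced only for this example. It uses `trafilatura.extract` on an in-memory HTML string so it runs without network access; for live pages, `trafilatura.fetch_url(url)` is the library's download helper and its result can be passed to the same function.

```python
from typing import Optional

# Hypothetical sample page, used here so the sketch needs no network access.
SAMPLE_HTML = """
<html>
  <head><title>Sample article</title></head>
  <body>
    <nav>Home | About | Contact</nav>
    <article>
      <h1>Sample article</h1>
      <p>Trafilatura focuses on the main text of a page and tries to
      leave out navigation bars, footers, and other boilerplate.</p>
      <p>This second paragraph gives the extractor a little more body
      text to work with, which helps on very short documents.</p>
    </article>
    <footer>Copyright notice</footer>
  </body>
</html>
"""


def extract_main_text(html: str, output_format: str = "txt") -> Optional[str]:
    """Return the main text of an HTML document, or None.

    None is returned when trafilatura is not installed or when the
    library finds no extractable main content in the document.
    """
    try:
        import trafilatura  # imported lazily so the sketch degrades gracefully
    except ImportError:
        return None
    # output_format selects the serialization, e.g. "txt", "json", or "xml".
    return trafilatura.extract(
        html,
        output_format=output_format,
        include_comments=False,
    )


if __name__ == "__main__":
    text = extract_main_text(SAMPLE_HTML)
    print(text if text is not None else "No main text extracted.")
```

Passing `output_format="json"` or `output_format="xml"` to the same helper requests a structured serialization instead of plain text. Very short pages may yield `None`, so callers should handle that case.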

Source

Agent Skill Exchange

Capabilities

skill · source-agentskillexchange · skill-trafilatura-web-text-extraction-crawling · topic-agent-skills · topic-ai-agents · topic-ai-tools · topic-awesome-list · topic-claude-code · topic-codex · topic-cursor · topic-llm · topic-mcp · topic-npx-skills · topic-openclaw · topic-skills-catalog

Install

Install: npx skills add agentskillexchange/skills

Source: https://github.com/agentskillexchange/skills/tree/main/skills/trafilatura-web-text-extraction-crawling

skills.sh: https://skills.sh/agentskillexchange/skills/trafilatura-web-text-extraction-crawling

Transport: skills-sh

Protocol: skill

Quality

0.45 / 1.00

Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,213 chars)

Provenance

Indexed from: GitHub

Enriched: 2026-05-18 19:12:53Z · deterministic:skill-github:v1 · v1

First seen: 2026-05-18

Last seen: 2026-05-18

Agent access

JSON: https://clawmart.sh/api/listings/nT5C4g