Extract clean article Markdown from web pages with Defuddle
Use Defuddle when an agent needs clean, metadata-rich article text or Markdown from noisy web pages before summarizing, indexing, or archiving them.
What it does
Use Defuddle when an agent needs clean, metadata-rich article text or Markdown from noisy web pages before summarizing, indexing, or archiving them.
Prerequisites
Node.js, npx or npm, defuddle CLI
Installation
The CLI runs through npx, so no separate install step is required. Pick the invocation that matches your input (local file or URL) and the output you want:
- npx defuddle parse page.html
- npx defuddle parse https://example.com/article
- npx defuddle parse page.html --markdown
- npx defuddle parse page.html --json
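When an agent needs to call these commands from a script rather than a shell, a thin Node.js wrapper may help. The sketch below is not part of Defuddle: `buildDefuddleArgs`, `runDefuddle`, and the `format` parameter are hypothetical names, and only the `--markdown` and `--json` flags listed above are used. It assumes `npx` is on the PATH (on Windows the executable is `npx.cmd`, so the call would need adjusting).

```javascript
const { execFileSync } = require('node:child_process');

// Build the argument list for `npx defuddle parse <source> [--markdown|--json]`.
// `format` is a hypothetical option name used only in this sketch.
function buildDefuddleArgs(source, format = 'html') {
  const args = ['defuddle', 'parse', source];
  if (format === 'markdown') {
    args.push('--markdown');
  } else if (format === 'json') {
    args.push('--json');
  } else if (format !== 'html') {
    throw new Error(`Unsupported format: ${format}`);
  }
  return args;
}

// Run the CLI synchronously and return its stdout as a string.
// `source` may be a local HTML file or a URL, as in the commands above.
function runDefuddle(source, format = 'html') {
  return execFileSync('npx', buildDefuddleArgs(source, format), {
    encoding: 'utf8',
    maxBuffer: 32 * 1024 * 1024, // long articles can exceed the default buffer
  });
}
```

A call such as `runDefuddle('page.html', 'markdown')` would return the Markdown as a string. Passing arguments as an array to `execFileSync` avoids shell quoting problems when the source is a URL with query parameters.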
Requirements and caveats from upstream:
- Node.js: defuddle/node accepts a DOM Document from any implementation (JSDOM, linkedom, happy-dom, etc.).
- Import: import { Defuddle } from 'defuddle/node';
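The Node.js notes above can be sketched as follows, using jsdom as the DOM implementation. Several details here are assumptions rather than confirmed API: that `Defuddle(document, url, options)` returns a promise, that a `markdown: true` option requests Markdown output, and that the result carries fields such as `title`, `author`, `published`, and `content`. Check the upstream README before relying on any of them. `extractArticle` and `toRecord` are hypothetical helper names.

```javascript
// Parse an HTML string into a DOM Document with jsdom, then hand it to Defuddle.
// Assumption: Defuddle(document, url, options) resolves to a result object;
// the call signature and the `markdown` option should be checked upstream.
async function extractArticle(html, url) {
  const { JSDOM } = require('jsdom'); // loaded lazily; requires `npm install jsdom`
  const { Defuddle } = await import('defuddle/node');
  const dom = new JSDOM(html, { url });
  return Defuddle(dom.window.document, url, { markdown: true });
}

// Reduce a result to the fields an indexer or archive typically needs.
// Field names are assumed; any that are missing fall back to null or ''.
function toRecord(result, url) {
  return {
    url,
    title: result.title ?? null,
    author: result.author ?? null,
    published: result.published ?? null,
    markdown: result.content ?? '',
  };
}
```

Keeping `toRecord` separate from the extraction call means the mapping can be adjusted if the actual result shape differs from what is assumed here.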
Basic usage or getting-started notes:
- Defuddle takes a URL or HTML, finds the main content, and returns cleaned HTML or Markdown. Defuddle was created for the browser extension Obsidian Web Clipper, but it...
Extracted from upstream docs: https://raw.githubusercontent.com/kepano/defuddle/HEAD/README.md
Quality
Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills, 8 GitHub stars, SKILL.md body (1,320 chars).