Extract invoice fields from vendor PDFs into structured records
Uses invoice2data to turn invoice PDFs into structured JSON, CSV, or XML using supplier-specific templates. This is for repeatable invoice field extraction and renaming workflows, not for full accounting system automation or generic OCR catalog listings.
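As a rough illustration of the renaming side of such a workflow, the sketch below builds a file name from an already extracted record. The helper name `target_filename`, the naming pattern, and the sample values are hypothetical; the keys `issuer`, `date`, and `invoice_number` are assumed to be present in the record, which depends on the template used.

```python
import re
from datetime import date


def target_filename(record: dict) -> str:
    """Build '<date>_<issuer>_<invoice_number>.pdf' from an extracted record.

    Hypothetical helper: the key names are assumptions about the record layout.
    """
    when = record["date"]
    if isinstance(when, date):  # also true for datetime, a subclass of date
        when = when.strftime("%Y-%m-%d")
    parts = [str(when), str(record["issuer"]), str(record["invoice_number"])]
    # Replace runs of characters other than letters, digits, dots, and hyphens,
    # so the result is safe to use as a file name on common filesystems.
    safe = [re.sub(r"[^A-Za-z0-9.-]+", "-", part).strip("-") for part in parts]
    return "_".join(safe) + ".pdf"


# Made-up sample record, for illustration only.
record = {"issuer": "ACME Corp.", "date": date(2024, 3, 5), "invoice_number": "INV/0042"}
print(target_filename(record))
```

Putting the date first keeps renamed files sorted chronologically in a directory listing; any other ordering would work equally well.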
Prerequisites
One of the following text-extraction backends: pdftotext, pdfminer, pdfplumber, OCRmyPDF, Tesseract, or Google Cloud Vision.
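A quick way to see which of the command-line backends are present is to look them up on `PATH`. This is a minimal sketch: the executable names in `BACKEND_EXECUTABLES` are assumptions about a typical install, and the Python-library and cloud backends (pdfminer, pdfplumber, Google Cloud Vision) are not covered by this check.

```python
import shutil

# Display name -> executable name. The executable names are assumptions
# about a typical install and may differ on your system.
BACKEND_EXECUTABLES = {
    "pdftotext": "pdftotext",
    "OCRmyPDF": "ocrmypdf",
    "Tesseract": "tesseract",
}


def available_backends(which=shutil.which) -> dict:
    """Return {backend name: True/False} depending on whether it is on PATH."""
    return {name: which(exe) is not None for name, exe in BACKEND_EXECUTABLES.items()}


for name, found in available_backends().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

The `which` parameter exists only so the lookup can be swapped out in tests.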
Installation
Requirements and caveats from upstream:
- invoice2data is a command-line tool and Python library that automates the extraction of key information from invoices to support your accounting.
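Used as a Python library, extraction might look like the sketch below. It assumes the package is installed and relies on `extract_data` and `read_templates` as described in the upstream README; the wrapper names `extract_record` and `to_json` are hypothetical, and the assumption that a failed extraction yields a falsy value should be checked against the installed version.

```python
import json


def extract_record(pdf_path, template_dir=None):
    """Return the extracted fields as a dict, or None if nothing was extracted."""
    # Imported lazily so to_json() below works even without invoice2data installed.
    from invoice2data import extract_data
    from invoice2data.extract.loader import read_templates

    # With templates=None the library is assumed to fall back to its built-in templates.
    templates = read_templates(template_dir) if template_dir else None
    result = extract_data(pdf_path, templates=templates)
    return result or None


def to_json(record: dict) -> str:
    """Serialize a record; values json cannot encode (such as dates) become strings."""
    return json.dumps(record, default=str, sort_keys=True)


# Example with a made-up record, so no PDF is needed to try the serializer.
print(to_json({"issuer": "ACME", "amount": 10.5}))
```

Calling `extract_record("invoice.pdf", template_dir="templates/")` would then restrict matching to supplier templates in a folder of your choosing (both paths are placeholders).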
Basic usage or getting-started notes:
- Basic usage. Process PDF files and write result to CSV.
- Please see the [Command-line Reference] for details.
- invoice2data invoice.pdf
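For batch runs it can help to assemble the command line in a script. The sketch below builds an argument list for the `invoice2data` command; the flags `--output-format`, `--output-name`, and `--template-folder` are taken from the upstream README as best recalled and should be verified against the command-line reference, and `build_command` and `run` are hypothetical helpers.

```python
import subprocess


def build_command(pdf_paths, output_format="csv", output_name=None, template_folder=None):
    """Assemble an argv list for the invoice2data CLI (flags assumed from upstream docs)."""
    cmd = ["invoice2data", "--output-format", output_format]
    if output_name:
        cmd += ["--output-name", output_name]
    if template_folder:
        cmd += ["--template-folder", template_folder]
    return cmd + list(pdf_paths)


def run(cmd):
    """Execute the command; raises CalledProcessError on a non-zero exit code."""
    return subprocess.run(cmd, check=True)


cmd = build_command(["invoice.pdf"], output_format="json", output_name="invoices.json")
print(" ".join(cmd))
# run(cmd)  # uncomment once invoice2data is installed and invoice.pdf exists
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with file names that contain spaces.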
Extracted from upstream docs: https://raw.githubusercontent.com/invoice-x/invoice2data/HEAD/README.md
Quality
Deterministic score 0.45 from registry signals: indexed on GitHub topic:agent-skills · 8 GitHub stars · SKILL.md body (1,385 chars)