LangExtract: LLM-Powered Structured Text Extraction
What it does
LangExtract by Google is a Python library for extracting structured information from unstructured text using LLMs with precise source grounding. With 35,000+ GitHub stars, it handles everything from clinical notes to literary analysis, producing verified extraction results with exact source text mappings and interactive visualizations.
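To make "exact source text mappings" concrete, here is a minimal standard-library sketch of the idea: an extracted string is accepted only if it occurs verbatim in the source, and its character offsets are recorded. This is an illustration of the concept, not LangExtract's own alignment logic; `GroundedSpan` and `ground` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GroundedSpan:
    """An extracted string tied to its character offsets in the source (hypothetical type)."""
    text: str
    start: int  # inclusive offset into the source
    end: int    # exclusive offset into the source


def ground(source: str, extracted: str, search_from: int = 0) -> Optional[GroundedSpan]:
    """Return the first verbatim occurrence of `extracted` in `source`, or None."""
    if not extracted:
        return None  # an empty string cannot be grounded meaningfully
    start = source.find(extracted, search_from)
    if start == -1:
        return None  # not found verbatim: treat the extraction as unverified
    return GroundedSpan(extracted, start, start + len(extracted))


source = "Patient was given 250 mg of amoxicillin twice daily."
span = ground(source, "250 mg")
paraphrase = ground(source, "250 milligrams")  # paraphrased, so it does not ground
```

Slicing the source with the recorded offsets (`source[span.start:span.end]`) reproduces the extracted text, which is what makes a result checkable against the original document.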
Installation
Use the upstream install or setup path that matches your environment:
- From PyPI: pip install langextract
- From source: git clone https://github.com/google/langextract.git, then cd langextract and pip install -e .
- From source with development dependencies: pip install -e ".[dev]"
Requirements and caveats from upstream:
- LangExtract is a Python library that uses LLMs to extract structured information from unstructured text documents based on user-defined instructions. It processes materials such as clinical notes or reports, identifyi...
- Note: Using cloud-hosted models like Gemini requires an API key. See the API Key Setup section for instructions on how to get and configure your key.
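Because cloud-hosted models need an API key, it can help to fail early when the key is missing. The sketch below reads the key from an environment variable; the variable name `LANGEXTRACT_API_KEY` is an assumption here and should be checked against the upstream API Key Setup section, and `resolve_api_key` is a hypothetical helper, not part of the library.

```python
import os
from typing import Mapping, Optional

# Assumed variable name; confirm against the upstream API Key Setup section.
API_KEY_ENV_VAR = "LANGEXTRACT_API_KEY"


def resolve_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured API key, or raise if it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get(API_KEY_ENV_VAR, "").strip()
    if not key:
        raise RuntimeError(
            f"Set {API_KEY_ENV_VAR} before calling a cloud-hosted model."
        )
    return key
```

Passing a plain dict as `env` keeps the helper easy to test without touching the real environment.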
Basic usage or getting-started notes:
Extract structured information with just a few lines of code.
1. Define Your Extraction Task
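Defining the extraction task might look roughly like the sketch below, modeled on the quick start in the upstream README: a prompt description plus at least one worked example, passed to `lx.extract`. The prompt wording, the sample text, the `run_extraction` wrapper, and the `gemini-2.5-flash` model id are illustrative assumptions; the import is deferred so the task definition can be loaded without the library installed.

```python
import textwrap

# Task description: what to extract and how strictly to stick to the source.
PROMPT = textwrap.dedent("""\
    Extract characters and emotions in order of appearance.
    Use the exact text from the input for each extraction; do not paraphrase.""")

# Illustrative sample text for the worked example.
EXAMPLE_TEXT = "ROMEO. But soft! What light through yonder window breaks?"


def run_extraction(input_text: str, model_id: str = "gemini-2.5-flash"):
    """Run one extraction; needs langextract installed and an API key configured."""
    import langextract as lx  # deferred so this module imports without the library

    # One worked example showing the model the expected output shape.
    examples = [
        lx.data.ExampleData(
            text=EXAMPLE_TEXT,
            extractions=[
                lx.data.Extraction(
                    extraction_class="character",
                    extraction_text="ROMEO",
                    attributes={"emotional_state": "wonder"},
                ),
            ],
        )
    ]
    return lx.extract(
        text_or_documents=input_text,
        prompt_description=PROMPT,
        examples=examples,
        model_id=model_id,
    )
```

Calling `run_extraction("...")` on your own text should return a result whose `extractions` can be iterated to read each `extraction_class` and `extraction_text`; the example's `extraction_text` is copied verbatim from the sample text, in line with the prompt's instruction.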
Extracted from upstream docs: https://raw.githubusercontent.com/google/langextract/HEAD/README.md
Quality
deterministic score 0.45 from registry signals: indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,507 chars)