Generate LLM fine-tuning, RAG, and eval datasets from source material with easy-dataset
Turn raw documents into structured fine-tuning, RAG, and evaluation datasets when the real job is dataset preparation, not generic document parsing.
What it does
Turn raw documents into structured fine-tuning, RAG, and evaluation datasets when the real job is dataset preparation, not generic document parsing.
Prerequisites
An easy-dataset installation, source documents in a supported format such as PDF, Markdown, DOCX, TXT, or EPUB, and an operator or agent to prepare the datasets.
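Before importing material, it can help to sort candidate files by format. The following is a minimal sketch, assuming the five formats named above map to the file extensions shown; the extension set and the `split_by_support` helper are hypothetical and not part of easy-dataset.

```python
from pathlib import Path

# Assumption: these extensions stand in for the formats named in the
# prerequisites (PDF/Markdown/DOCX/TXT/EPUB). The set is illustrative and
# is not read from easy-dataset itself.
SUPPORTED_EXTENSIONS = {".pdf", ".md", ".docx", ".txt", ".epub"}


def split_by_support(paths):
    """Split paths into (supported, unsupported) file names by extension.

    The comparison is case-insensitive, so "guide.PDF" counts as a PDF.
    """
    supported, unsupported = [], []
    for path in map(Path, paths):
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path.name)
        else:
            unsupported.append(path.name)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_support(["guide.PDF", "notes.md", "slides.pptx"])
    print("ready to import:", ok)
    print("convert or skip:", skipped)
```

Files in the second list would need converting to one of the listed formats, or checking against the upstream documentation, before use.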
Installation
Use the upstream install or setup path that matches your environment:
- git clone https://github.com/ConardLi/easy-dataset.git
- cd easy-dataset
- npm install
- npm run build
- npm run start
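The commands above call git and npm directly, so a quick preflight check can save a failed run. This is a minimal sketch under that assumption only; `missing_tools` is a hypothetical helper, and it does not check tool versions because no version requirement is stated here.

```python
import shutil

# The two executables the install commands invoke directly.
REQUIRED_TOOLS = ("git", "npm")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names of tools that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    depending on the local machine's setup.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("install these before continuing:", ", ".join(missing))
    else:
        print("git and npm found; the install steps can proceed")
```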
Requirements and caveats from upstream:
- Using the Official Docker Image: modify the docker-compose.yml file.
Basic usage or getting-started notes:
- 🎉🎉 Easy Dataset Version 1.7.0 launches brand-new evaluation capabilities! You can effortlessly convert domain-specific documents into evaluation datasets (test sets) and automatically run multi-dimensional evaluation...
- Local Run
Extracted from upstream docs: https://raw.githubusercontent.com/ConardLi/easy-dataset/HEAD/README.md
Quality
Deterministic score 0.45 from registry signals: indexed on github topic:agent-skills · 8 github stars · SKILL.md body (1,525 chars)