{"id":"188198c0-803b-4ff3-b5bc-eea3ea953833","shortId":"WmFqxa","kind":"skill","title":"mcp-local-rag","tagline":"Search, ingest, expand chunk context, or manage local documents via a local RAG MCP server (tools: query_documents, read_chunk_neighbors, ingest_file, ingest_data, delete_file, list_files). Use when user says \"search my docs\", \"save this page\", \"read around that chunk\", \"what did","description":"# MCP Local RAG Skills\n\n## Tools\n\n| MCP Tool | CLI Equivalent | Use When |\n|----------|---------------|----------|\n| `ingest_file` | `npx mcp-local-rag ingest <path>` | Local files (PDF, DOCX, TXT, MD). CLI for bulk/directory. |\n| `ingest_data` | — | Raw content (HTML, text) with source URL |\n| `query_documents` | `npx mcp-local-rag query <text>` | Semantic + keyword hybrid search |\n| `delete_file` | `npx mcp-local-rag delete <path>` | Remove ingested content |\n| `list_files` | `npx mcp-local-rag list` | File ingestion status |\n| `status` | `npx mcp-local-rag status` | Database stats |\n| `read_chunk_neighbors` | `npx mcp-local-rag read-neighbors` | Read N chunks adjacent to a known chunkIndex (context expansion; call after `query_documents` or grep) |\n\n## Search: Core Rules\n\nHybrid search combines vector (semantic) and keyword (BM25).\n\n### Score Interpretation\n\nLower = better match. Use this to filter noise.\n\n| Score | Action |\n|-------|--------|\n| < 0.3 | Use directly |\n| 0.3-0.5 | Include if mentions same concept/entity |\n| 0.5-0.7 | Include only if directly relevant to the question |\n| > 0.7 | Skip unless no better results |\n\n### Limit Selection\n\n| Intent | Limit |\n|--------|-------|\n| Specific answer (function, error) | 5 |\n| General understanding | 10 |\n| Comprehensive survey | 20 |\n\n### Query Formulation\n\n| Situation | Why Transform | Action |\n|-----------|---------------|--------|\n| Specific term mentioned | Keyword search needs exact match | KEEP term |\n| Vague query | Vector search needs semantic signal | ADD context |\n| Error stack or code block | Long text dilutes relevance | EXTRACT core keywords |\n| Multiple distinct topics | Single query conflates results | SPLIT queries |\n| Few/poor results | Term mismatch | EXPAND (see below) |\n\n### Query Expansion\n\nWhen results are few or all score > 0.5, expand query terms:\n\n- Keep original term first, add 2-4 variants\n- Types: synonyms, abbreviations, related terms, word forms\n- Example: `\"config\"` → `\"config configuration settings configure\"`\n\nAvoid over-expansion (causes topic drift).\n\n### Result Selection\n\nWhen to include vs skip—based on answer quality, not just score.\n\n**INCLUDE** if:\n- Directly answers the question\n- Provides necessary context\n- Score < 0.5\n\n**SKIP** if:\n- Same keyword, unrelated context\n- Score > 0.7\n- Mentions term without explanation\n\n### fileTitle\n\nEach result includes `fileTitle` (document title extracted from content). Null when extraction fails.\n\n| Use | How |\n|-----|-----|\n| Disambiguate chunks | Use fileTitle to identify which document the chunk belongs to |\n| Group related chunks | Same fileTitle = same document context |\n| Deprioritize mismatches | fileTitle unrelated to query AND score > 0.5 → rank lower |\n\n## Context Expansion (read_chunk_neighbors)\n\n`read_chunk_neighbors` (CLI: `read-neighbors`) is an **on-demand context expansion utility**, not a routine follow-up to every `query_documents` call. Chunks in this index are **semantic units** — sentences or paragraphs grouped by topic via Max-Min semantic chunking, not fixed-size text slices. Reading the chunks immediately before and after a target chunk yields coherent surrounding context, not arbitrary fragments.\n\nEach `query_documents` result item already includes `filePath` and `chunkIndex`. Pass those to `read_chunk_neighbors` to expand a specific hit in place.\n\nTrigger this tool only when one of these signals is present:\n- **Insufficient context for your answer**: during response generation, the target chunk alone is not enough to reach a grounded conclusion (e.g., it references \"this approach\" or \"as shown above\" without the referent).\n- **Explicit user request for more context**: the user asks for surrounding detail (\"what comes before that?\", \"read more around that section\", \"show me the full explanation\").\n\nIf neither signal is present, stop at the `query_documents` results.\n\nTypical workflow when triggered:\n1. Identify the specific chunk to expand (from a prior `query_documents` hit or `grep`).\n2. Take that chunk's `filePath` and `chunkIndex`.\n3. Call `read_chunk_neighbors` with those values; the response contains the target chunk plus its semantic neighbors, sorted by `chunkIndex`.\n\nSee [cli-reference.md](references/cli-reference.md#read-neighbors) for output fields and an example.\n\n## Ingestion\n\n### ingest_file\n```\ningest_file({ filePath: \"/absolute/path/to/document.pdf\" })\n```\n\n### ingest_data\n```\ningest_data({\n  content: \"<html>...</html>\",\n  metadata: { source: \"https://example.com/page\", format: \"html\" }\n})\n```\n\n**Format selection** — match the data you have:\n- HTML string → `format: \"html\"`\n- Markdown string → `format: \"markdown\"`\n- Other → `format: \"text\"`\n\n**Source format:**\n- Web page → Use URL: `https://example.com/page`\n- Other content → Use scheme: `{type}://{date}` or `{type}://{date}/{detail}` where `{type}` is a short identifier for the content origin (e.g., clipboard, chat, note, meeting)\n\n**HTML source options:**\n- Static page → HTTP fetch\n- SPA/JS-rendered → Browser/web tool with DOM rendering\n- Auth required → Manual paste\n\nIf HTTP fetch returns empty or minimal content, retry with a browser/web tool.\n\nSource URLs are normalized: query strings and fragments are stripped. See [html-ingestion.md](references/html-ingestion.md) for cases where this matters.\n\nRe-ingest same source to update. Use same source in `delete_file` to remove.\n\n### CLI commands\n\nCLI subcommands mirror MCP tools. Useful for bulk operations, scripting, and environments without MCP.\n\n- `query`, `list`, `status`, `delete` output JSON to stdout\n- `ingest` outputs progress to stderr\n- Use `--help` on any command for options\n- See [cli-reference.md](references/cli-reference.md) for options and config matching\n\n## References\n\nFor edge cases and examples:\n- [html-ingestion.md](references/html-ingestion.md) - URL normalization, SPA handling\n- [query-optimization.md](references/query-optimization.md) - Query patterns by intent\n- [result-refinement.md](references/result-refinement.md) - Synthesis vs filter strategy, contradiction resolution, chunking\n- [cli-reference.md](references/cli-reference.md) - CLI command options, config matching, output conventions","tags":["mcp","local","rag","shinpr","agent-skills","developer-tools","hybrid-search","local-first","local-rag","mcp-server","privacy-first","semantic-search"],"capabilities":["skill","source-shinpr","skill-mcp-local-rag","topic-agent-skills","topic-developer-tools","topic-hybrid-search","topic-local-first","topic-local-rag","topic-mcp","topic-mcp-server","topic-privacy-first","topic-rag","topic-semantic-search","topic-skills","topic-vector-search"],"categories":["mcp-local-rag"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/shinpr/mcp-local-rag/mcp-local-rag","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add shinpr/mcp-local-rag","source_repo":"https://github.com/shinpr/mcp-local-rag","install_from":"skills.sh"}},"qualityScore":"0.571","qualityRationale":"deterministic score 0.57 from registry signals: · indexed on github topic:agent-skills · 242 github stars · SKILL.md body (6,198 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-02T18:54:12.431Z","embedding":null,"createdAt":"2026-04-18T22:04:53.399Z","updatedAt":"2026-05-02T18:54:12.431Z","lastSeenAt":"2026-05-02T18:54:12.431Z","tsv":"'-0.5':184 '-0.7':191 '-4':293 '/absolute/path/to/document.pdf':641 '/page':651,680 '0.3':180,183 '0.5':190,283,339,396 '0.7':200,347 '1':579 '10':217 '2':292,594 '20':220 '3':602 '5':214 'abbrevi':297 'action':179,226 'add':244,291 'adjac':144 'alon':517 'alreadi':477 'answer':211,324,332,510 'approach':530 'arbitrari':470 'around':45,556 'ask':546 'auth':719 'avoid':308 'base':322 'belong':378 'better':171,204 'block':250 'bm25':167 'browser/web':714,734 'bulk':778 'bulk/directory':77 'call':151,429,603 'case':750,816 'caus':312 'chat':703 'chunk':8,24,47,131,143,369,377,382,402,405,430,448,457,464,486,516,583,597,605,615,839 'chunkindex':148,481,601,622 'cli':57,75,407,769,771,842 'cli-reference.md':624,806,840 'clipboard':702 'code':249 'coher':466 'combin':162 'come':551 'command':770,802,843 'comprehens':218 'concept/entity':189 'conclus':525 'config':303,304,811,845 'configur':305,307 'conflat':263 'contain':612 'content':81,109,361,646,682,699,730 'context':9,149,245,337,345,387,399,416,468,507,543 'contradict':837 'convent':848 'core':158,256 'data':29,79,643,645,658 'databas':128 'date':686,689 'delet':30,99,106,765,788 'demand':415 'depriorit':388 'detail':549,690 'dilut':253 'direct':182,195,331 'disambigu':368 'distinct':259 'doc':40 'document':13,22,88,154,357,375,386,428,474,573,590 'docx':72 'dom':717 'drift':314 'e.g':526,701 'edg':815 'empti':727 'enough':520 'environ':782 'equival':58 'error':213,246 'everi':426 'exact':233 'exampl':302,634,818 'example.com':650,679 'example.com/page':649,678 'expand':7,271,284,489,585 'expans':150,275,311,400,417 'explan':351,563 'explicit':538 'extract':255,359,364 'fail':365 'fetch':712,725 'few/poor':267 'field':631 'file':27,31,33,62,70,100,111,118,637,639,766 'filepath':479,599,640 'filetitl':352,356,371,384,390 'filter':176,835 'first':290 'fix':451 'fixed-s':450 'follow':423 'follow-up':422 'form':301 'format':652,654,663,667,670,673 'formul':222 'fragment':471,743 'full':562 'function':212 'general':215 'generat':513 'grep':156,593 'ground':524 'group':380,440 'handl':824 'help':799 'hit':492,591 'html':82,653,661,664,706 'html-ingestion.md':747,819 'http':711,724 'hybrid':97,160 'identifi':373,580,696 'immedi':458 'includ':185,192,319,329,355,478 'index':433 'ingest':6,26,28,61,68,78,108,119,635,636,638,642,644,756,793 'insuffici':506 'intent':208,830 'interpret':169 'item':476 'json':790 'keep':235,287 'keyword':96,166,230,257,343 'known':147 'limit':206,209 'list':32,110,117,786 'local':3,12,16,51,66,69,92,104,115,125,136 'long':251 'lower':170,398 'manag':11 'manual':721 'markdown':665,668 'match':172,234,656,812,846 'matter':753 'max':445 'max-min':444 'mcp':2,18,50,55,65,91,103,114,124,135,774,784 'mcp-local-rag':1,64,90,102,113,123,134 'md':74 'meet':705 'mention':187,229,348 'metadata':647 'min':446 'minim':729 'mirror':773 'mismatch':270,389 'multipl':258 'n':142 'necessari':336 'need':232,241 'neighbor':25,132,140,403,406,410,487,606,619,628 'neither':565 'nois':177 'normal':739,822 'note':704 'npx':63,89,101,112,122,133 'null':362 'on-demand':413 'one':500 'oper':779 'option':708,804,809,844 'origin':288,700 'output':630,789,794,847 'over-expans':309 'page':43,675,710 'paragraph':439 'pass':482 'past':722 'pattern':828 'pdf':71 'place':494 'plus':616 'present':505,568 'prior':588 'progress':795 'provid':335 'qualiti':325 'queri':21,87,94,153,221,238,262,266,274,285,393,427,473,572,589,740,785,827 'query-optimization.md':825 'question':199,334 'rag':4,17,52,67,93,105,116,126,137 'rank':397 'raw':80 're':755 're-ingest':754 'reach':522 'read':23,44,130,139,141,401,404,409,455,485,554,604,627 'read-neighbor':138,408,626 'refer':528,537,813 'references/cli-reference.md':625,807,841 'references/html-ingestion.md':748,820 'references/query-optimization.md':826 'references/result-refinement.md':832 'relat':298,381 'relev':196,254 'remov':107,768 'render':718 'request':540 'requir':720 'resolut':838 'respons':512,611 'result':205,264,268,277,315,354,475,574 'result-refinement.md':831 'retri':731 'return':726 'routin':421 'rule':159 'save':41 'say':37 'scheme':684 'score':168,178,282,328,338,346,395 'script':780 'search':5,38,98,157,161,231,240 'section':558 'see':272,623,746,805 'select':207,316,655 'semant':95,164,242,435,447,618 'sentenc':437 'server':19 'set':306 'short':695 'show':559 'shown':533 'signal':243,503,566 'singl':261 'situat':223 'size':452 'skill':53 'skill-mcp-local-rag' 'skip':201,321,340 'slice':454 'sort':620 'sourc':85,648,672,707,736,758,763 'source-shinpr' 'spa':823 'spa/js-rendered':713 'specif':210,227,491,582 'split':265 'stack':247 'stat':129 'static':709 'status':120,121,127,787 'stderr':797 'stdout':792 'stop':569 'strategi':836 'string':662,666,741 'strip':745 'subcommand':772 'surround':467,548 'survey':219 'synonym':296 'synthesi':833 'take':595 'target':463,515,614 'term':228,236,269,286,289,299,349 'text':83,252,453,671 'titl':358 'tool':20,54,56,497,715,735,775 'topic':260,313,442 'topic-agent-skills' 'topic-developer-tools' 'topic-hybrid-search' 'topic-local-first' 'topic-local-rag' 'topic-mcp' 'topic-mcp-server' 'topic-privacy-first' 'topic-rag' 'topic-semantic-search' 'topic-skills' 'topic-vector-search' 'transform':225 'trigger':495,578 'txt':73 'type':295,685,688,692 'typic':575 'understand':216 'unit':436 'unless':202 'unrel':344,391 'updat':760 'url':86,677,737,821 'use':34,59,173,181,366,370,676,683,761,776,798 'user':36,539,545 'util':418 'vagu':237 'valu':609 'variant':294 'vector':163,239 'via':14,443 'vs':320,834 'web':674 'without':350,535,783 'word':300 'workflow':576 'yield':465","prices":[{"id":"8364b669-b378-4778-bdcb-0c0bab7b3bbf","listingId":"188198c0-803b-4ff3-b5bc-eea3ea953833","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"shinpr","category":"mcp-local-rag","install_from":"skills.sh"},"createdAt":"2026-04-18T22:04:53.399Z"}],"sources":[{"listingId":"188198c0-803b-4ff3-b5bc-eea3ea953833","source":"github","sourceId":"shinpr/mcp-local-rag/mcp-local-rag","sourceUrl":"https://github.com/shinpr/mcp-local-rag/tree/main/skills/mcp-local-rag","isPrimary":false,"firstSeenAt":"2026-04-18T22:04:53.399Z","lastSeenAt":"2026-05-02T18:54:12.431Z"}],"details":{"listingId":"188198c0-803b-4ff3-b5bc-eea3ea953833","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"shinpr","slug":"mcp-local-rag","github":{"repo":"shinpr/mcp-local-rag","stars":242,"topics":["agent-skills","developer-tools","hybrid-search","local-first","local-rag","mcp","mcp-server","privacy-first","rag","semantic-search","skills","vector-search"],"license":"mit","html_url":"https://github.com/shinpr/mcp-local-rag","pushed_at":"2026-04-23T09:49:30Z","description":"Local-first RAG server for developers. Semantic + keyword search for code and technical docs. Works with MCP or CLI. Fully private, zero setup.","skill_md_sha":"80e3e81153c424729f26ec4be07ebe00c7816b3c","skill_md_path":"skills/mcp-local-rag/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/shinpr/mcp-local-rag/tree/main/skills/mcp-local-rag"},"layout":"multi","source":"github","category":"mcp-local-rag","frontmatter":{"name":"mcp-local-rag","description":"Search, ingest, expand chunk context, or manage local documents via a local RAG MCP server (tools: query_documents, read_chunk_neighbors, ingest_file, ingest_data, delete_file, list_files). Use when user says \"search my docs\", \"save this page\", \"read around that chunk\", \"what did I save about X\", or invokes `npx mcp-local-rag`."},"skills_sh_url":"https://skills.sh/shinpr/mcp-local-rag/mcp-local-rag"},"updatedAt":"2026-05-02T18:54:12.431Z"}}