{"id":"cd55b6a6-879a-440e-b31e-8a526b5f13bd","shortId":"cVC8eT","kind":"skill","title":"rag-engineer","tagline":"Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications.","description":"# RAG Engineer\n\nExpert in building Retrieval-Augmented Generation systems. Masters embedding models,\nvector databases, chunking strategies, and retrieval optimization for LLM applications.\n\n**Role**: RAG Systems Architect\n\nI bridge the gap between raw documents and LLM understanding. I know that\nretrieval quality determines generation quality - garbage in, garbage out.\nI obsess over chunking boundaries, embedding dimensions, and similarity\nmetrics because they make the difference between helpful and hallucinating.\n\n### Expertise\n\n- Embedding model selection and fine-tuning\n- Vector database architecture and scaling\n- Chunking strategies for different content types\n- Retrieval quality optimization\n- Hybrid search implementation\n- Re-ranking and filtering strategies\n- Context window management\n- Evaluation metrics for retrieval\n\n### Principles\n\n- Retrieval quality > Generation quality - fix retrieval first\n- Chunk size depends on content type and query patterns\n- Embeddings are not magic - they have blind spots\n- Always evaluate retrieval separately from generation\n- Hybrid search beats pure semantic search in most cases\n\n## Capabilities\n\n- Vector embeddings and similarity search\n- Document chunking and preprocessing\n- Retrieval pipeline design\n- Semantic search implementation\n- Context window optimization\n- Hybrid search (keyword + semantic)\n\n## Prerequisites\n\n- Required skills: LLM fundamentals, Understanding of embeddings, Basic NLP concepts\n\n## Patterns\n\n### Semantic Chunking\n\nChunk by meaning, not arbitrary token counts\n\n**When to use**: Processing documents with natural sections\n\n- Use sentence boundaries, not token limits\n- Detect topic shifts with embedding similarity\n- Preserve document structure (headers, 
paragraphs)\n- Include overlap for context continuity\n- Add metadata for filtering\n\n### Hierarchical Retrieval\n\nMulti-level retrieval for better precision\n\n**When to use**: Large document collections with varied granularity\n\n- Index at multiple chunk sizes (paragraph, section, document)\n- First pass: coarse retrieval for candidates\n- Second pass: fine-grained retrieval for precision\n- Use parent-child relationships for context\n\n### Hybrid Search\n\nCombine semantic and keyword search\n\n**When to use**: Queries may be keyword-heavy or semantic\n\n- BM25/TF-IDF for keyword matching\n- Vector similarity for semantic matching\n- Reciprocal Rank Fusion for combining scores\n- Weight tuning based on query type\n\n### Query Expansion\n\nExpand queries to improve recall\n\n**When to use**: User queries are short or ambiguous\n\n- Use LLM to generate query variations\n- Add synonyms and related terms\n- Hypothetical Document Embedding (HyDE)\n- Multi-query retrieval with deduplication\n\n### Contextual Compression\n\nCompress retrieved context to fit window\n\n**When to use**: Retrieved chunks exceed context limits\n\n- Extract relevant sentences only\n- Use LLM to summarize chunks\n- Remove redundant information\n- Prioritize by relevance score\n\n### Metadata Filtering\n\nPre-filter by metadata before semantic search\n\n**When to use**: Documents have structured metadata\n\n- Filter by date, source, category first\n- Reduce search space before vector similarity\n- Combine metadata filters with semantic scores\n- Index metadata for fast filtering\n\n## Sharp Edges\n\n### Fixed-size chunking breaks sentences and context\n\nSeverity: HIGH\n\nSituation: Using fixed token/character limits for chunking\n\nSymptoms:\n- Retrieved chunks feel incomplete or cut off\n- Answer quality varies wildly\n- High recall but low precision\n\nWhy this breaks:\nFixed-size chunks split mid-sentence, mid-paragraph, or mid-idea.\nThe resulting embeddings represent incomplete 
thoughts, leading to\npoor retrieval quality. Users search for concepts but get fragments.\n\nRecommended fix:\n\nUse semantic chunking that respects document structure:\n- Split on sentence/paragraph boundaries\n- Use embedding similarity to detect topic shifts\n- Include overlap for context continuity\n- Preserve headers and document structure as metadata\n\n### Pure semantic search without metadata pre-filtering\n\nSeverity: MEDIUM\n\nSituation: Only using vector similarity, ignoring metadata\n\nSymptoms:\n- Returns outdated information\n- Mixes content from wrong sources\n- Users can't scope their searches\n\nWhy this breaks:\nSemantic search finds semantically similar content, but not necessarily\nrelevant content. Without metadata filtering, you return old docs when the\nuser wants recent ones, content from the wrong category, or otherwise inapplicable material.\n\nRecommended fix:\n\nImplement hybrid filtering:\n- Pre-filter by metadata (date, source, category) before vector search\n- Post-filter results by relevance criteria\n- Include metadata in the retrieval API\n- Allow users to specify filters\n\n### Using the same embedding model for different content types\n\nSeverity: MEDIUM\n\nSituation: One embedding model for code, docs, and structured data\n\nSymptoms:\n- Code search returns irrelevant results\n- Domain terms not matched properly\n- Similar concepts not clustered\n\nWhy this breaks:\nEmbedding models are trained on specific content types. 
Using a text\nembedding model for code, or a general model for domain-specific\ncontent, produces poor similarity matches.\n\nRecommended fix:\n\nEvaluate embeddings per content type:\n- Use code-specific embeddings for code (e.g., CodeBERT)\n- Consider domain-specific or fine-tuned embeddings\n- Benchmark retrieval quality before choosing\n- Separate indices for different content types if needed\n\n### Using first-stage retrieval results directly\n\nSeverity: MEDIUM\n\nSituation: Taking top-K from vector search without reranking\n\nSymptoms:\n- Clearly relevant docs not in top results\n- Results order seems arbitrary\n- Adding more results helps quality\n\nWhy this breaks:\nFirst-stage retrieval (vector search) optimizes for recall, not precision.\nThe top results by embedding similarity may not be the most relevant\nfor the specific query. Cross-encoder reranking dramatically improves\nprecision for the final results.\n\nRecommended fix:\n\nAdd reranking step:\n- Retrieve larger candidate set (e.g., top 20-50)\n- Rerank with cross-encoder (query-document pairs)\n- Return reranked top-K (e.g., top 5)\n- Cache reranker for performance\n\n### Cramming maximum context into LLM prompt\n\nSeverity: MEDIUM\n\nSituation: Using all retrieved context regardless of relevance\n\nSymptoms:\n- Answers drift with more context\n- LLM ignores key information\n- High token costs\n\nWhy this breaks:\nMore context isn't always better. Irrelevant context confuses the LLM,\nincreases latency and cost, and can cause the model to ignore the\nmost relevant information. 
Models have attention limits.\n\nRecommended fix:\n\nUse relevance thresholds:\n- Set minimum similarity score cutoff\n- Limit context to truly relevant chunks\n- Summarize or compress if needed\n- Order context by relevance\n\n### Not measuring retrieval quality separately from generation\n\nSeverity: HIGH\n\nSituation: Only evaluating end-to-end RAG quality\n\nSymptoms:\n- Can't diagnose poor RAG performance\n- Prompt changes don't help\n- Random quality variations\n\nWhy this breaks:\nIf answers are wrong, you can't tell if retrieval failed or generation\nfailed. This makes debugging impossible and leads to wrong fixes\n(tuning prompts when retrieval is the problem).\n\nRecommended fix:\n\nSeparate retrieval evaluation:\n- Create retrieval test set with relevant docs labeled\n- Measure MRR, NDCG, Recall@K for retrieval\n- Evaluate generation only on correct retrievals\n- Track metrics over time\n\n### Not updating embeddings when source documents change\n\nSeverity: MEDIUM\n\nSituation: Embeddings generated once, never refreshed\n\nSymptoms:\n- Returns outdated information\n- References deleted content\n- Inconsistent with source\n\nWhy this breaks:\nDocuments change but embeddings don't. Users retrieve outdated content\nor, worse, content that no longer exists. This erodes trust in the\nsystem.\n\nRecommended fix:\n\nImplement embedding refresh:\n- Track document versions/hashes\n- Re-embed on document change\n- Handle deleted documents\n- Consider TTL for embeddings\n\n### Same retrieval strategy for all query types\n\nSeverity: MEDIUM\n\nSituation: Using pure semantic search for keyword-heavy queries\n\nSymptoms:\n- Exact term searches miss results\n- Concept searches too literal\n- Users frustrated with both\n\nWhy this breaks:\nSome queries are keyword-oriented (looking for specific terms) while\nothers are semantic (looking for concepts). 
Pure semantic search fails\non exact matches; pure keyword search fails on paraphrases.\n\nRecommended fix:\n\nImplement hybrid search:\n- BM25/TF-IDF for keyword matching\n- Vector similarity for semantic matching\n- Reciprocal Rank Fusion to combine\n- Tune weights based on query patterns\n\n## Related Skills\n\nWorks well with: `ai-agents-architect`, `prompt-engineer`, `database-architect`, `backend`\n\n## When to Use\n- User mentions or implies: building RAG\n- User mentions or implies: vector search\n- User mentions or implies: embeddings\n- User mentions or implies: semantic search\n- User mentions or implies: document retrieval\n- User mentions or implies: context retrieval\n- User mentions or implies: knowledge base\n- User mentions or implies: LLM with documents\n- User mentions or implies: chunking strategy\n- User mentions or implies: pinecone\n- User mentions or implies: weaviate\n- User mentions or implies: chromadb\n- User mentions or implies: pgvector\n\n## Limitations\n- Use this skill only when the task clearly matches the scope described above.\n- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.\n- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are 
missing.","tags":["rag","engineer","antigravity","awesome","skills","sickn33","agent-skills","agentic-skills","ai-agent-skills","ai-agents","ai-coding","ai-workflows"],"capabilities":["skill","source-sickn33","skill-rag-engineer","topic-agent-skills","topic-agentic-skills","topic-ai-agent-skills","topic-ai-agents","topic-ai-coding","topic-ai-workflows","topic-antigravity","topic-antigravity-skills","topic-claude-code","topic-claude-code-skills","topic-codex-cli","topic-codex-skills"],"categories":["antigravity-awesome-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/sickn33/antigravity-awesome-skills/rag-engineer","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add sickn33/antigravity-awesome-skills","source_repo":"https://github.com/sickn33/antigravity-awesome-skills","install_from":"skills.sh"}},"qualityScore":"0.700","qualityRationale":"deterministic score 0.70 from registry signals: · indexed on github topic:agent-skills · 34997 github stars · SKILL.md body (9,654 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-04-25T06:51:48.827Z","embedding":null,"createdAt":"2026-04-18T20:38:45.476Z","updatedAt":"2026-04-25T06:51:48.827Z","lastSeenAt":"2026-04-25T06:51:48.827Z","tsv":null,"prices":[{"id":"73b4c947-bbec-4b9d-a566-21074a0c9758","listingId":"cd55b6a6-879a-440e-b31e-8a526b5f13bd","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"sickn33","category":"antigravity-awesome-skills","install_from":"skills.sh"},"createdAt":"2026-04-18T20:38:45.476Z"}],"sources":[{"listingId":"cd55b6a6-879a-440e-b31e-8a526b5f13bd","source":"github","sourceId":"sickn33/antigravity-awesome-skills/rag-engineer","sourceUrl":"https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/rag-engineer","isPrimary":false,"firstSeenAt":"2026-04-18T21:43:13.561Z","lastSeenAt":"2026-04-25T06:51:48.827Z"},{"listingId":"cd55b6a6-879a-440e-b31e-8a526b5f13bd","source":"skills_sh","sourceId":"sickn33/antigravity-awesome-skills/rag-engineer","sourceUrl":"https://skills.sh/sickn33/antigravity-awesome-skills/rag-engineer","isPrimary":true,"firstSeenAt":"2026-04-18T20:38:45.476Z","lastSeenAt":"2026-04-23T16:40:49.615Z"}],"details":{"listingId":"cd55b6a6-879a-440e-b31e-8a526b5f13bd","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"sickn33","slug":"rag-engineer","github":{"repo":"sickn33/antigravity-awesome-skills","stars":34997,"topics":["agent-skills","agentic-skills","ai-agent-skills","ai-agents","ai-coding","ai-workflows","antigravity","antigravity-skills","claude-code","claude-code-skills","codex-cli","codex-skills","cursor","cursor-skills","developer-tools","gemini-cli","gemini-skills","kiro","mcp","skill-library"],"license":"mit","html_url":"https://github.com/sickn33/antigravity-awesome-skills","pushed_at":"2026-04-25T06:33:17Z","description":"Installable GitHub library of 1,400+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. 
Includes installer CLI, bundles, workflows, and official/community skill collections.","skill_md_sha":"d38be863e2762d49667bcfae221a6dea1b714c8d","skill_md_path":"skills/rag-engineer/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/rag-engineer"},"layout":"multi","source":"github","category":"antigravity-awesome-skills","frontmatter":{"name":"rag-engineer","description":"Expert in building Retrieval-Augmented Generation systems. Masters embedding models, vector databases, chunking strategies, and retrieval optimization for LLM applications."},"skills_sh_url":"https://skills.sh/sickn33/antigravity-awesome-skills/rag-engineer"},"updatedAt":"2026-04-25T06:51:48.827Z"}}