{"id":"e98898ca-97f2-4e0d-85c3-d8cea81b94f5","shortId":"u72X3w","kind":"skill","title":"dspy-gepa-reflective","tagline":"This skill should be used when the user asks to \"optimize an agent with GEPA\", \"use reflective optimization\", \"optimize ReAct agents\", \"provide feedback metrics\", mentions \"GEPA optimizer\", \"LLM reflection\", \"execution trajectories\", \"agentic systems optimization\", or needs to op","description":"# DSPy GEPA Optimizer\n\n## Goal\n\nOptimize complex agentic systems using LLM reflection on full execution traces with Pareto-based evolutionary search.\n\n## When to Use\n\n- **Agentic systems** with tool use\n- When you have **rich textual feedback** on failures\n- Complex multi-step workflows\n- Instruction-only optimization needed\n\n## Related Skills\n\n- For non-agentic programs: [dspy-miprov2-optimizer](../dspy-miprov2-optimizer/SKILL.md), [dspy-bootstrap-fewshot](../dspy-bootstrap-fewshot/SKILL.md)\n- Measure improvements: [dspy-evaluation-suite](../dspy-evaluation-suite/SKILL.md)\n\n## Inputs\n\n| Input | Type | Description |\n|-------|------|-------------|\n| `program` | `dspy.Module` | Agent or complex program |\n| `trainset` | `list[dspy.Example]` | Training examples |\n| `metric` | `callable` | Must return `(score, feedback)` tuple |\n| `reflection_lm` | `dspy.LM` | Strong LM for reflection (GPT-4) |\n| `auto` | `str` | \"light\", \"medium\", \"heavy\" |\n\n## Outputs\n\n| Output | Type | Description |\n|--------|------|-------------|\n| `compiled_program` | `dspy.Module` | Reflectively optimized program |\n\n## Workflow\n\n### Phase 1: Define Feedback Metric\n\nGEPA requires metrics that return *textual feedback*:\n\n```python\ndef gepa_metric(example, pred, trace=None):\n    \"\"\"Must return (score, feedback) tuple.\"\"\"\n    is_correct = example.answer.lower() in pred.answer.lower()\n    \n    if is_correct:\n        feedback = \"Correct. 
The answer accurately addresses the question.\"\n    else:\n        feedback = f\"Incorrect. Expected '{example.answer}' but got '{pred.answer}'. The model may have misunderstood the question or retrieved irrelevant information.\"\n    \n    # Return a numeric score (not a bool) alongside the textual feedback\n    return (1.0 if is_correct else 0.0), feedback\n```\n\n### Phase 2: Setup Agent\n\n```python\nimport dspy\n\ndef search(query: str) -> list[str]:\n    \"\"\"Search knowledge base for relevant information.\"\"\"\n    rm = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')\n    results = rm(query, k=3)\n    return results if isinstance(results, list) else [results]\n\ndef calculate(expression: str) -> float:\n    \"\"\"Safely evaluate mathematical expressions.\"\"\"\n    with dspy.PythonInterpreter() as interp:\n        return interp(expression)\n\nagent = dspy.ReAct(\"question -> answer\", tools=[search, calculate])\n```\n\n### Phase 3: Optimize with GEPA\n\n```python\ndspy.configure(lm=dspy.LM(\"openai/gpt-4o-mini\"))\n\noptimizer = dspy.GEPA(\n    metric=gepa_metric,\n    reflection_lm=dspy.LM(\"openai/gpt-4o\"),  # Strong model for reflection\n    auto=\"medium\"\n)\n\ncompiled_agent = optimizer.compile(agent, trainset=trainset)\n```\n\n## Production Example\n\n```python\nimport dspy\nfrom dspy.evaluate import Evaluate\nimport logging\n\nlogger = logging.getLogger(__name__)\n\nclass ResearchAgent(dspy.Module):\n    def __init__(self):\n        super().__init__()  # required for dspy.Module subclasses\n        self.react = dspy.ReAct(\n            \"question -> answer\",\n            tools=[self.search, self.summarize]\n        )\n    \n    def search(self, query: str) -> list[str]:\n        \"\"\"Search for relevant documents.\"\"\"\n        rm = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')\n        results = rm(query, k=5)\n        return results if isinstance(results, list) else [results]\n    \n    def summarize(self, text: str) -> str:\n        \"\"\"Summarize long text into key points.\"\"\"\n        summarizer = dspy.Predict(\"text -> summary\")\n        return 
summarizer(text=text).summary\n    \n    def forward(self, question):\n        return self.react(question=question)\n\ndef detailed_feedback_metric(example, pred, trace=None):\n    \"\"\"Rich feedback for GEPA reflection.\"\"\"\n    expected = example.answer.lower().strip()\n    actual = pred.answer.lower().strip() if pred.answer else \"\"\n    \n    # Exact match\n    if expected == actual:\n        return 1.0, \"Perfect match. Answer is correct and concise.\"\n    \n    # Partial match\n    if expected in actual or actual in expected:\n        return 0.7, f\"Partial match. Expected '{example.answer}', got '{pred.answer}'. Answer contains correct info but may be verbose or incomplete.\"\n    \n    # Check for key terms\n    expected_terms = set(expected.split())\n    actual_terms = set(actual.split())\n    overlap = len(expected_terms & actual_terms) / max(len(expected_terms), 1)\n    \n    if overlap > 0.5:\n        return 0.5, f\"Some overlap. Expected '{example.answer}', got '{pred.answer}'. Key terms present but answer structure differs.\"\n    \n    return 0.0, f\"Incorrect. Expected '{example.answer}', got '{pred.answer}'. 
The agent may need better search queries or reasoning.\"\n\ndef optimize_research_agent(trainset, devset):\n    \"\"\"Full GEPA optimization pipeline.\"\"\"\n    \n    dspy.configure(lm=dspy.LM(\"openai/gpt-4o-mini\"))\n    \n    agent = ResearchAgent()\n    \n    # Convert metric for evaluation (just score)\n    def eval_metric(example, pred, trace=None):\n        score, _ = detailed_feedback_metric(example, pred, trace)\n        return score\n    \n    evaluator = Evaluate(devset=devset, num_threads=8, metric=eval_metric)\n    baseline = evaluator(agent)\n    logger.info(f\"Baseline: {baseline}\")  # Evaluate already reports a percentage score\n    \n    # GEPA optimization\n    optimizer = dspy.GEPA(\n        metric=detailed_feedback_metric,\n        reflection_lm=dspy.LM(\"openai/gpt-4o\"),\n        auto=\"medium\",\n        enable_tool_optimization=True  # Also optimize tool descriptions\n    )\n    \n    compiled = optimizer.compile(agent, trainset=trainset)\n    optimized = evaluator(compiled)\n    logger.info(f\"Optimized: {optimized}\")\n    \n    compiled.save(\"research_agent_gepa.json\")\n    return compiled\n```\n\n## Tool Optimization\n\nGEPA can jointly optimize predictor instructions AND tool descriptions:\n\n```python\noptimizer = dspy.GEPA(\n    metric=gepa_metric,\n    reflection_lm=dspy.LM(\"openai/gpt-4o\"),\n    auto=\"medium\",\n    enable_tool_optimization=True  # Optimize tool docstrings too\n)\n```\n\n## Best Practices\n\n1. **Rich feedback** - More detailed feedback = better reflection\n2. **Strong reflection LM** - Use GPT-4 or Claude for reflection\n3. **Agentic focus** - Best for ReAct and multi-tool systems\n4. 
**Trace analysis** - GEPA analyzes full execution trajectories\n\n## Limitations\n\n- Requires custom feedback metrics (not just scores)\n- Expensive: uses strong LM for reflection\n- Newer optimizer, less battle-tested than MIPROv2\n- Best for instruction optimization, less for demos\n\n## Official Documentation\n\n- **DSPy Documentation**: [https://dspy.ai/](https://dspy.ai/)\n- **DSPy GitHub**: [https://github.com/stanfordnlp/dspy](https://github.com/stanfordnlp/dspy)\n- **GEPA Optimizer**: [https://dspy.ai/api/optimizers/GEPA/](https://dspy.ai/api/optimizers/GEPA/)\n- **Agents Guide**: [https://dspy.ai/tutorials/agents/](https://dspy.ai/tutorials/agents/)","tags":["dspy","gepa","reflective","skills","omidzamani","agent-skills","claude-code","claude-skills","llm","prompt-optimization","rag"],"capabilities":["skill","source-omidzamani","skill-dspy-gepa-reflective","topic-agent-skills","topic-claude-code","topic-claude-skills","topic-dspy","topic-llm","topic-prompt-optimization","topic-rag"],"categories":["dspy-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/OmidZamani/dspy-skills/dspy-gepa-reflective","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add OmidZamani/dspy-skills","source_repo":"https://github.com/OmidZamani/dspy-skills","install_from":"skills.sh"}},"qualityScore":"0.487","qualityRationale":"deterministic score 0.49 from registry signals: · indexed on github topic:agent-skills · 74 github stars · SKILL.md body (6,623 
chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-02T06:55:44.349Z","embedding":null,"createdAt":"2026-04-18T22:14:12.601Z","updatedAt":"2026-05-02T06:55:44.349Z","lastSeenAt":"2026-05-02T06:55:44.349Z","tsv":"'-4':144,668 '/](https://dspy.ai/)':727 '/api/optimizers/gepa/](https://dspy.ai/api/optimizers/gepa/)':737 '/dspy-bootstrap-fewshot/skill.md':106 '/dspy-evaluation-suite/skill.md':113 '/dspy-miprov2-optimizer/skill.md':101 '/stanfordnlp/dspy](https://github.com/stanfordnlp/dspy)':732 '/tutorials/agents/](https://dspy.ai/tutorials/agents/)':742 '0.0':510 '0.5':492,494 '0.7':449 '1':162,489,654 '1.0':430 '2':227,581,616,662 '20.102.90.50':248,358 '2017/wiki17_abstracts':249,359 '3':254,287,673 '4':684 '5':364 '8':570 'accur':198 'actual':418,428,443,445,475,483 'actual.split':478 'address':199 'agent':17,25,36,49,67,95,120,229,279,312,314,518,529,540,576,606,674,738 'also':600 'analysi':686 'analyz':688 'answer':197,282,340,433,457,506 'ask':13 'auto':145,309,594,642 'base':61,241 'baselin':574,579,580 'battl':710 'battle-test':709 'best':652,676,714 'better':521,660 'bootstrap':104 'calcul':264,285 'callabl':130 'check':467 'class':331 'claud':670 'compil':154,311,604,611,620 'compiled.save':617 'complex':48,80,122 'concis':437 'contain':458 'convert':542 'correct':187,193,195,224,435,459 'custom':694 'def':174,233,263,334,344,373,394,402,526,548 'defin':163 'demo':720 'descript':117,153,603,631 'detail':403,556,587,658 'devset':531,566,567 'differ':508 'docstr':650 'document':354,722,724 'dspi':2,43,98,103,110,232,321,723,728 'dspy-bootstrap-fewshot':102 'dspy-evaluation-suit':109 'dspy-gepa-reflect':1 'dspy-miprov2-optimizer':97 'dspy.ai':726,736,741 'dspy.ai/](https://dspy.ai/)':725 
'dspy.ai/api/optimizers/gepa/](https://dspy.ai/api/optimizers/gepa/)':735 'dspy.ai/tutorials/agents/](https://dspy.ai/tutorials/agents/)':740 'dspy.colbertv2':246,356 'dspy.configure':292,536 'dspy.evaluate':323 'dspy.example':126 'dspy.gepa':297,585,634 'dspy.lm':138,294,303,538,592,640 'dspy.module':119,156,333 'dspy.predict':386 'dspy.pythoninterpreter':273 'dspy.react':280,338 'els':202,261,371,423 'enabl':596,644 'eval':549,572 'evalu':111,269,325,545,564,565,575,610 'evolutionari':62 'exact':424 'exampl':128,177,318,406,551,559 'example.answer':207,454,499,514 'example.answer.lower':188,416 'execut':34,56,690 'expect':206,415,427,441,447,453,471,481,487,498,513 'expected.split':474 'expens':700 'express':265,271,278 'f':204,450,495,511,578,613 'failur':79 'feedback':27,77,134,164,172,184,194,203,225,404,411,557,588,656,659,695 'fewshot':105 'float':267 'focus':675 'forward':395 'full':55,532,689 'gepa':3,19,30,44,166,175,290,299,413,533,582,623,636,687,733 'github':729 'github.com':731 'github.com/stanfordnlp/dspy](https://github.com/stanfordnlp/dspy)':730 'goal':46 'got':209,455,500,515 'gpt':143,667 'guid':739 'heavi':149 'import':231,320,324,326 'improv':108 'incomplet':466 'incorrect':205,512 'info':460 'inform':221,244 'init':335 'input':114,115 'instruct':86,628,716 'instruction-on':85 'interp':275,277 'irrelev':220 'isinst':258,368 'joint':625 'k':253,363 'key':383,469,502 'knowledg':240 'len':480,486 'less':708,718 'light':147 'limit':692 'list':125,237,260,349,370 'llm':32,52 'lm':137,140,293,302,537,591,639,665,703 'log':327 'logger':328 'logger.info':577,612 'logging.getlogger':329 'long':380 'match':425,432,439,452 'mathemat':270 'max':485 'may':213,462,519 'measur':107 'medium':148,310,595,643 'mention':29 'metric':28,129,165,168,176,298,300,405,543,550,558,571,573,586,589,635,637,696 'miprov2':99,713 'misunderstood':215 'model':212,306 'multi':82,681 'multi-step':81 'multi-tool':680 'must':131,181 'name':330 'need':40,89,520 'newer':706 'non':94 
'non-agent':93 'none':180,409,554 'num':568 'offici':721 'op':42 'openai/gpt-4o':304,593,641 'openai/gpt-4o-mini':295,539 'optim':15,22,23,31,38,45,47,88,100,158,288,296,527,534,583,584,598,601,609,614,615,622,626,633,646,648,707,717,734 'optimizer.compile':313,605 'output':150,151 'overlap':479,491,497 'pareto':60 'pareto-bas':59 'partial':438,451 'perfect':431 'phase':161,226,286 'pipelin':535 'point':384 'practic':653 'pred':178,407,552,560 'pred.answer':210,422,456,501,516 'pred.answer.lower':190,419 'predictor':627 'present':504 'product':317 'program':96,118,123,155,159 'provid':26 'python':173,230,291,319,632 'queri':235,252,347,362,523 'question':201,217,281,339,397,400,401 'react':24,678 'reason':525 'reflect':4,21,33,53,136,142,157,301,308,414,590,638,661,664,672,705 'relat':90 'relev':243,353 'requir':167,693 'research':528 'research_agent_gepa.json':618 'researchag':332,541 'result':250,256,259,262,360,366,369,372 'retriev':219 'return':132,170,182,222,255,276,365,389,398,429,448,493,509,562,619 'rich':75,410,655 'rm':245,251,355,361 'safe':268 'score':133,183,547,555,563,699 'search':63,234,239,284,345,351,522 'self':336,346,375,396 'self.react':337,399 'self.search':342 'self.summarize':343 'set':473,477 'setup':228 'skill':6,91 'skill-dspy-gepa-reflective' 'source-omidzamani' 'step':83 'str':146,236,238,266,348,350,377,378 'strip':417,420 'strong':139,305,663,702 'structur':507 'suit':112 'summar':374,379,385,390 'summari':388,393 'system':37,50,68,683 'term':470,472,476,482,484,488,503 'test':711 'text':376,381,387,391,392 'textual':76,171 'thread':569 'tool':70,283,341,597,602,621,630,645,649,682 'topic-agent-skills' 'topic-claude-code' 'topic-claude-skills' 'topic-dspy' 'topic-llm' 'topic-prompt-optimization' 'topic-rag' 'trace':57,179,408,553,561,685 'train':127 'trainset':124,315,316,530,607,608 'trajectori':35,691 'true':599,647 'tupl':135,185 'type':116,152 'url':247,357 'use':9,20,51,66,71,666,701 'user':12 'verbos':464 
'workflow':84,160","prices":[{"id":"98d9c09d-2be2-4ec6-b247-f776b65883a8","listingId":"e98898ca-97f2-4e0d-85c3-d8cea81b94f5","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"OmidZamani","category":"dspy-skills","install_from":"skills.sh"},"createdAt":"2026-04-18T22:14:12.601Z"}],"sources":[{"listingId":"e98898ca-97f2-4e0d-85c3-d8cea81b94f5","source":"github","sourceId":"OmidZamani/dspy-skills/dspy-gepa-reflective","sourceUrl":"https://github.com/OmidZamani/dspy-skills/tree/master/skills/dspy-gepa-reflective","isPrimary":false,"firstSeenAt":"2026-04-18T22:14:12.601Z","lastSeenAt":"2026-05-02T06:55:44.349Z"}],"details":{"listingId":"e98898ca-97f2-4e0d-85c3-d8cea81b94f5","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"OmidZamani","slug":"dspy-gepa-reflective","github":{"repo":"OmidZamani/dspy-skills","stars":74,"topics":["agent-skills","claude-code","claude-skills","dspy","llm","prompt-optimization","rag"],"license":"mit","html_url":"https://github.com/OmidZamani/dspy-skills","pushed_at":"2026-02-21T12:49:43Z","description":"Collection of Claude Skills for DSPy framework - program language models, optimize prompts, and build RAG pipelines systematically","skill_md_sha":"6d3c23fdab1ae883bda2d07f19ab9196180292b7","skill_md_path":"skills/dspy-gepa-reflective/SKILL.md","default_branch":"master","skill_tree_url":"https://github.com/OmidZamani/dspy-skills/tree/master/skills/dspy-gepa-reflective"},"layout":"multi","source":"github","category":"dspy-skills","frontmatter":{"name":"dspy-gepa-reflective","description":"This skill should be used when the user asks to \"optimize an agent with GEPA\", \"use reflective optimization\", \"optimize ReAct agents\", \"provide feedback metrics\", mentions \"GEPA optimizer\", \"LLM 
reflection\", \"execution trajectories\", \"agentic systems optimization\", or needs to optimize complex multi-step agents using textual feedback on execution traces."},"skills_sh_url":"https://skills.sh/OmidZamani/dspy-skills/dspy-gepa-reflective"},"updatedAt":"2026-05-02T06:55:44.349Z"}}