{"id":"1a68afaf-3cf8-4c9d-a82c-7c631cf20bb2","shortId":"YvQftV","kind":"skill","title":"dspy-simba-optimizer","tagline":"This skill should be used when the user asks to \"optimize with SIMBA\", \"use Bayesian optimization\", \"optimize agents with custom feedback\", mentions \"SIMBA optimizer\", \"mini-batch optimization\", \"statistical optimization\", \"lightweight optimizer\", or needs an alternative to MIPRO","description":"# DSPy SIMBA Optimizer\n\n## Goal\n\nOptimize DSPy programs with SIMBA (Stochastic Introspective Mini-Batch Ascent), which samples mini-batches, finds examples with high output variability, and adds self-reflective rules or successful demonstrations.\n\n## When to Use\n\n- Need a lighter-weight alternative to GEPA\n- Have custom feedback metrics (not just accuracy)\n- Agentic tasks with rich failure signals\n- Budget-conscious optimization (fewer eval calls)\n- Programs where few-shot examples aren't critical\n\n## Related Skills\n\n- Alternative optimizers: [dspy-miprov2-optimizer](../dspy-miprov2-optimizer/SKILL.md), [dspy-gepa-reflective](../dspy-gepa-reflective/SKILL.md)\n- Agent optimization: [dspy-react-agent-builder](../dspy-react-agent-builder/SKILL.md)\n- Evaluation: [dspy-evaluation-suite](../dspy-evaluation-suite/SKILL.md)\n\n## Inputs\n\n| Input | Type | Description |\n|-------|------|-------------|\n| `program` | `dspy.Module` | Program to optimize |\n| `trainset` | `list[dspy.Example]` | Training examples |\n| `metric` | `callable` | Returns float or `dspy.Prediction(score=..., feedback=...)` |\n| `max_steps` | `int` | Number of optimization steps |\n| `bsize` | `int` | Mini-batch size |\n\n## Outputs\n\n| Output | Type | Description |\n|--------|------|-------------|\n| `optimized_program` | `dspy.Module` | SIMBA-optimized program |\n\n## Workflow\n\n### Phase 1: Understand SIMBA\n\n**SIMBA** (Stochastic Introspective Mini-Batch Ascent):\n- Iterative prompt optimization with mini-batch sampling\n- Identifies challenging examples with high output variability\n- Generates self-reflective rules or adds successful demonstrations\n- 
Lighter than GEPA (no reflection LM)\n- More flexible than Bootstrap (uses feedback)\n\n**Comparison:**\n- **MIPROv2**: Best accuracy, lots of data\n- **GEPA**: Agentic systems, expensive\n- **SIMBA**: Custom feedback, budget-friendly\n- **Bootstrap**: Simplest, demo-based\n\n### Phase 2: Basic SIMBA Optimization\n\n```python\nimport dspy\n\ndspy.configure(lm=dspy.LM(\"openai/gpt-4o-mini\"))\n\n# Program to optimize\nclass QAPipeline(dspy.Module):\n    def __init__(self):\n        self.generate = dspy.ChainOfThought(\"question -> answer\")\n\n    def forward(self, question):\n        return self.generate(question=question)\n\n# Metric (can return a float score or dspy.Prediction(score=..., feedback=...))\ndef qa_metric(example, pred, trace=None):\n    correct = example.answer.lower() in pred.answer.lower()\n    return 1.0 if correct else 0.0\n\n# SIMBA optimizer\noptimizer = dspy.SIMBA(\n    metric=qa_metric,\n    max_steps=10,  # Optimization iterations\n    bsize=5  # Mini-batch size\n)\n\n# trainset is assumed to be a list[dspy.Example] defined elsewhere\nprogram = QAPipeline()\ncompiled = optimizer.compile(program, trainset=trainset)\ncompiled.save(\"qa_simba.json\")\n```\n\n### Phase 3: SIMBA with Feedback Signals\n\nSIMBA works best with rich feedback:\n\n```python\nimport dspy\n\ndef detailed_metric(example, pred, trace=None):\n    \"\"\"Metric with feedback signal.\"\"\"\n    expected = example.answer.lower()\n    actual = pred.answer.lower()\n\n    if expected == actual:\n        return dspy.Prediction(score=1.0, feedback=\"Perfect match\")\n    elif expected in actual:\n        return dspy.Prediction(score=0.7, feedback=f\"Contains answer but verbose: '{actual}'\")\n    else:\n        overlap = len(set(expected.split()) & set(actual.split()))\n        if overlap > 0:\n            return dspy.Prediction(score=0.3, feedback=f\"Partial overlap: {overlap} words\")\n        return dspy.Prediction(score=0.0, feedback=f\"No match. 
Expected '{expected}'\")\n\noptimizer = dspy.SIMBA(\n    metric=detailed_metric,\n    max_steps=20,  # Optimization iterations\n    bsize=8  # Mini-batch size\n)\n\ncompiled = optimizer.compile(program, trainset=trainset)\n```\n\n### Phase 4: Production Agent Optimization\n\n```python\nimport dspy\nfrom dspy.evaluate import Evaluate\nimport logging\n\nlogger = logging.getLogger(__name__)\n\n# Define tools as functions\ndef search(query: str) -> str:\n    \"\"\"Search knowledge base for relevant information.\"\"\"\n    retriever = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')\n    results = retriever(query, k=3)\n    return \"\\n\".join([r['text'] for r in results])\n\ndef calculate(expr: str) -> str:\n    \"\"\"Evaluate Python expressions safely.\"\"\"\n    try:\n        with dspy.PythonInterpreter() as interp:\n            return str(interp.execute(expr))\n    except Exception as e:\n        return f\"Error: {e}\"\n\nclass ResearchAgent(dspy.Module):\n    def __init__(self):\n        self.agent = dspy.ReAct(\n            \"question -> answer\",\n            tools=[search, calculate]\n        )\n\n    def forward(self, question):\n        return self.agent(question=question)\n\ndef agent_metric(example, pred, trace=None):\n    \"\"\"Rich metric for agent optimization.\"\"\"\n    expected = example.answer.lower().strip()\n    actual = pred.answer.lower().strip() if pred.answer else \"\"\n\n    # Exact match\n    if expected == actual:\n        return dspy.Prediction(score=1.0, feedback=\"Correct answer\")\n\n    # Partial match\n    if expected in actual:\n        return dspy.Prediction(score=0.7, feedback=\"Answer contains expected result\")\n\n    # Check key terms\n    expected_terms = set(expected.split())\n    actual_terms = set(actual.split())\n    overlap = len(expected_terms & actual_terms)\n\n    if overlap >= len(expected_terms) * 0.5:\n        return dspy.Prediction(score=0.5, feedback=f\"50%+ term overlap\")\n\n    return 
dspy.Prediction(score=0.0, feedback=f\"Incorrect: expected '{example.answer}'\")\n\ndef optimize_agent(trainset, devset):\n    \"\"\"Full SIMBA optimization pipeline.\"\"\"\n    dspy.configure(lm=dspy.LM(\"openai/gpt-4o-mini\"))\n\n    agent = ResearchAgent()\n\n    # Baseline evaluation\n    # dspy.Evaluate calls metric(example, pred), so trace needs a default\n    eval_metric = lambda ex, pred, trace=None: agent_metric(ex, pred, trace).score\n    evaluator = dspy.Evaluate(devset=devset, metric=eval_metric, num_threads=4)\n    baseline = evaluator(agent)\n    # The result's type and scale depend on the DSPy version, so log it unformatted\n    logger.info(f\"Baseline: {baseline}\")\n\n    # SIMBA optimization\n    optimizer = dspy.SIMBA(\n        metric=agent_metric,\n        max_steps=25,  # Optimization iterations\n        bsize=6  # Mini-batch size\n    )\n\n    compiled = optimizer.compile(agent, trainset=trainset)\n\n    # Evaluate optimized\n    optimized = evaluator(compiled)\n    logger.info(f\"SIMBA optimized: {optimized}\")\n\n    compiled.save(\"research_agent_simba.json\")\n    return compiled\n```\n\n## Configuration\n\n```python\noptimizer = dspy.SIMBA(\n    metric=metric_fn,\n    max_steps=20,                          # Optimization iterations (default: 8)\n    bsize=32,                              # Mini-batch size (default: 32)\n    num_candidates=6,                      # Candidates per iteration (default: 6)\n    max_demos=4,                           # Max demos per predictor (default: 4)\n    temperature_for_sampling=0.2,          # Sampling temperature (default: 0.2)\n    temperature_for_candidates=0.2         # Candidate selection temperature (default: 0.2)\n)\n```\n\n## Best Practices\n\n1. **Use feedback signals** - SIMBA benefits from `dspy.Prediction(score=..., feedback=...)` objects\n2. **Balance parameters** - Adjust `bsize` (default 32, no larger than the training set) and `max_steps` (default 8) based on dataset size\n3. **Patience** - SIMBA is slower than Bootstrap, faster than GEPA\n4. **Custom metrics** - Best for scenarios with nuanced scoring (not binary)\n5. 
**Tune temperatures** - Lower temperatures (0.1-0.3) for exploitation, higher (0.5-1.0) for exploration\n\n## Limitations\n\n- Newer optimizer, less battle-tested than MIPROv2\n- Requires thoughtful metric design (garbage in, garbage out)\n- Not as thorough as GEPA for agent optimization\n- Mini-batch sampling adds variance to results\n- No automatic prompt reflection like GEPA\n\n## Official Documentation\n\n- **DSPy Documentation**: https://dspy.ai/\n- **DSPy GitHub**: https://github.com/stanfordnlp/dspy\n- **SIMBA Optimizer**: https://dspy.ai/api/optimizers/SIMBA/\n- **Optimizers Guide**: https://dspy.ai/learn/optimization/optimizers/","tags":["dspy","simba","optimizer","skills","omidzamani","agent-skills","claude-code","claude-skills","llm","prompt-optimization","rag"],"capabilities":["skill","source-omidzamani","skill-dspy-simba-optimizer","topic-agent-skills","topic-claude-code","topic-claude-skills","topic-dspy","topic-llm","topic-prompt-optimization","topic-rag"],"categories":["dspy-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/OmidZamani/dspy-skills/dspy-simba-optimizer","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add OmidZamani/dspy-skills","source_repo":"https://github.com/OmidZamani/dspy-skills","install_from":"skills.sh"}},"qualityScore":"0.487","qualityRationale":"deterministic score 0.49 from registry signals: · indexed on github topic:agent-skills · 74 github stars · SKILL.md body (7,692 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-02T06:55:44.918Z","embedding":null,"createdAt":"2026-04-18T22:14:18.829Z","updatedAt":"2026-05-02T06:55:44.918Z","lastSeenAt":"2026-05-02T06:55:44.918Z","tsv":"'-0.3':818 
'-1.0':823 '/api/optimizers/simba/':879 '/dspy-evaluation-suite/skill.md':128 '/dspy-gepa-reflective/skill.md':114 '/dspy-miprov2-optimizer/skill.md':109 '/dspy-react-agent-builder/skill.md':122 '/learn/optimization/optimizers/':884 '/stanfordnlp/dspy':874 '0':394 '0.0':302,408,617 '0.1':817 '0.2':748,752,756,761 '0.3':398 '0.5':604,608,822 '0.7':377,576 '1':177,764 '1.0':298,366,563 '10':312 '2':246,669,703,775 '20':422,717 '20.102.90.50':471 '2017/wiki17_abstracts':472 '25':679 '3':331,477,791 '32':721,727,781 '4':437,661,738,744,801 '5':316,812 '50':611 '6':683,730,735 '8':426,786 'accuraci':78,226 'actual':358,362,373,384,549,559,572,589,597 'actual.split':391,592 'add':208,855 'adjust':778 'agent':22,79,115,120,231,439,535,544,625,636,646,664,675,690,849 'altern':40,69,103 'analysi':58 'answer':269,381,522,566,578 'aren':98 'ascent':186 'ask':13 'automat':860 'balanc':776 'base':244,464,787 'baselin':638,662,667,668 'basic':247 'batch':31,53,162,185,193,319,429,686,724,853 'battl':831 'battle-test':830 'bayesian':19,54 'benefit':769 'best':225,338,762,804 'binari':811 'bootstrap':220,240,797 'bsize':158,315,425,682,720,779 'budget':86,238 'budget-consci':85 'budget-friend':237 'builder':121 'calcul':488,525 'call':91 'callabl':144 'candid':729,731,755,757 'challeng':196 'check':582 'class':260,513 'comparison':223 'compil':323,431,688,697,707 'compiled.save':328,704 'configur':708 'conscious':87 'contain':380,579 'correct':293,300,565 'critic':100 'custom':24,73,235,802 'data':229 'dataset':789 'def':263,270,286,345,457,487,516,526,534,623 'default':726,734,743,751,760,780,785 'defin':453 'demo':243,737,740 'demo-bas':242 'demonstr':210 'descript':132,167 'design':838 'detail':346,418 'devset':627,654,655 'document':866,868 'dspi':2,43,48,106,111,118,125,252,344,443,867,870 'dspy-evaluation-suit':124 'dspy-gepa-reflect':110 'dspy-miprov2-optimizer':105 'dspy-react-agent-build':117 'dspy-simba-optim':1 'dspy.ai':869,878,883 'dspy.ai/api/optimizers/simba/':877 
'dspy.ai/learn/optimization/optimizers/':882 'dspy.chainofthought':267 'dspy.colbertv2':469 'dspy.configure':253,632 'dspy.evaluate':445,653 'dspy.example':140 'dspy.lm':255,634 'dspy.module':134,170,262,515 'dspy.prediction':148,364,375,396,406,561,574,606,615,771 'dspy.pythoninterpreter':498 'dspy.react':520 'dspy.simba':306,416,673,711 'e':508,512 'elif':370 'els':301,385,554 'error':511 'eval':90,640,657 'evalu':123,126,447,492,639,652,663,693,696 'ex':643,648 'exact':555 'exampl':97,142,197,289,348,537 'example.answer':622 'example.answer.lower':294,357,547 'except':505,506 'expect':356,361,371,413,414,546,558,570,580,585,595,602,621 'expected.split':389,588 'expens':233 'exploit':820 'explor':825 'expr':489,504 'express':494 'f':379,400,410,510,610,619,666,699 'failur':83 'faster':798 'feedback':25,60,74,150,222,236,285,334,341,354,367,378,399,409,564,577,609,618,766,773 'few-shot':94 'fewer':89 'flexibl':218 'float':146 'fn':714 'forward':271,527 'friend':239 'full':628 'function':456 'garbag':839,841 'generat':202 'gepa':71,112,213,230,800,847,864 'github':871 'github.com':873 'github.com/stanfordnlp/dspy':872 'goal':46 'guid':881 'high':199 'higher':821 'identifi':195 'import':251,343,442,446,448 'incorrect':620 'inform':467 'init':264,517 'input':129,130 'int':153,159 'interp':500 'interp.execute':503 'introspect':182 'iter':187,314,424,681,719,733 'join':480 'k':476 'key':583 'knowledg':463 'lambda':642 'len':387,594,601 'less':829 'lighter':67,211 'lighter-weight':66 'lightweight':35 'like':863 'limit':826 'list':139 'lm':216,254,633 'log':449 'logger':450 'logger.info':665,698 'logging.getlogger':451 'lot':227 'lower':815 'match':369,412,556,568 'max':151,310,420,677,715,736,739,783 'mention':26 'metric':75,143,278,288,307,309,347,352,417,419,536,542,641,647,656,658,674,676,712,713,803,837 'mini':30,52,161,184,192,318,428,685,723,852 'mini-batch':29,51,160,183,191,317,427,684,722,851 'mipro':42 'miprov2':107,224,834 'n':479 'name':452 'need':38,65 
'newer':827 'none':292,351,540 'nuanc':808 'num':659,728 'number':154 'object':774 'offici':865 'openai/gpt-4o-mini':256,635 'optim':4,15,20,21,28,32,34,36,45,47,55,88,104,108,116,137,156,168,173,189,249,259,304,305,313,415,423,440,545,624,630,671,672,680,694,695,701,702,710,718,828,850,876,880 'optimizer.compile':324,432,689 'output':164,165,200 'overlap':386,393,402,403,593,600,613 'paramet':777 'partial':401,567 'patienc':792 'per':732,741 'perfect':368 'phase':176,245,330,436 'pipelin':631 'practic':763 'pred':290,349,538,644,649 'pred.answer':553 'pred.answer.lower':296,359,550 'predictor':742 'product':438 'program':49,92,133,135,169,174,257,321,325,433 'prompt':188,861 'python':250,342,441,493,709 'qa':287,308 'qa_simba.json':329 'qapipelin':261,322 'queri':459,475 'question':268,273,276,277,521,529,532,533 'r':481,484 'react':119 'reflect':113,205,215,862 'relat':101 'relev':466 'requir':835 'research_agent_simba.json':705 'researchag':514,637 'result':473,486,581,858 'retriev':468,474 'return':145,274,280,297,363,374,395,405,478,501,509,530,560,573,605,614,706 'rich':82,340,541 'rule':206 'safe':495 'sampl':194,747,749,854 'scenario':806 'score':149,282,284,365,376,397,407,562,575,607,616,651,772,809 'search':458,462,524 'select':758 'self':204,265,272,518,528 'self-reflect':203 'self.agent':519,531 'self.generate':266,275 'set':388,390,587,591 'shot':96 'signal':61,84,335,355,767 'simba':3,17,27,44,172,179,180,234,248,303,332,336,629,670,700,768,793,875 'simba-optim':171 'simplest':241 'size':163,320,430,687,725,790 'skill':6,102 'skill-dspy-simba-optimizer' 'slower':795 'source-omidzamani' 'statist':33,57 'step':152,157,311,421,678,716,784 'stochast':181 'str':460,461,490,491,502 'strip':548,551 'success':209 'suit':127 'system':232 'task':80 'temperatur':745,750,753,759,814,816 'term':584,586,590,596,598,603,612 'test':832 'text':482 'thorough':845 'thought':836 'thread':660 'tool':454,523 'topic-agent-skills' 'topic-claude-code' 'topic-claude-skills' 
'topic-dspy' 'topic-llm' 'topic-prompt-optimization' 'topic-rag' 'trace':291,350,539,645,650 'train':141 'trainset':138,326,327,434,435,626,691,692 'tri':496 'tune':813 'type':131,166 'understand':178 'url':470 'use':9,18,50,64,221,765 'user':12 'variabl':201 'varianc':856 'verbos':383 'weight':68 'word':404 'work':337 'workflow':175","prices":[{"id":"4556dbb3-9b8a-4499-a685-faec6cfd905c","listingId":"1a68afaf-3cf8-4c9d-a82c-7c631cf20bb2","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"OmidZamani","category":"dspy-skills","install_from":"skills.sh"},"createdAt":"2026-04-18T22:14:18.829Z"}],"sources":[{"listingId":"1a68afaf-3cf8-4c9d-a82c-7c631cf20bb2","source":"github","sourceId":"OmidZamani/dspy-skills/dspy-simba-optimizer","sourceUrl":"https://github.com/OmidZamani/dspy-skills/tree/master/skills/dspy-simba-optimizer","isPrimary":false,"firstSeenAt":"2026-04-18T22:14:18.829Z","lastSeenAt":"2026-05-02T06:55:44.918Z"}],"details":{"listingId":"1a68afaf-3cf8-4c9d-a82c-7c631cf20bb2","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"OmidZamani","slug":"dspy-simba-optimizer","github":{"repo":"OmidZamani/dspy-skills","stars":74,"topics":["agent-skills","claude-code","claude-skills","dspy","llm","prompt-optimization","rag"],"license":"mit","html_url":"https://github.com/OmidZamani/dspy-skills","pushed_at":"2026-02-21T12:49:43Z","description":"Collection of Claude Skills for DSPy framework - program language models, optimize prompts, and build RAG pipelines 
systematically","skill_md_sha":"45b4bd9455f82fa8f0a2fccdfd8f6f30cfc3a024","skill_md_path":"skills/dspy-simba-optimizer/SKILL.md","default_branch":"master","skill_tree_url":"https://github.com/OmidZamani/dspy-skills/tree/master/skills/dspy-simba-optimizer"},"layout":"multi","source":"github","category":"dspy-skills","frontmatter":{"name":"dspy-simba-optimizer","description":"This skill should be used when the user asks to \"optimize with SIMBA\", \"use Bayesian optimization\", \"optimize agents with custom feedback\", mentions \"SIMBA optimizer\", \"mini-batch optimization\", \"statistical optimization\", \"lightweight optimizer\", or needs an alternative to MIPROv2/GEPA for programs with rich feedback signals."},"skills_sh_url":"https://skills.sh/OmidZamani/dspy-skills/dspy-simba-optimizer"},"updatedAt":"2026-05-02T06:55:44.918Z"}}