{"id":"15e3ad90-fed3-4568-9fd5-d82f589f7da6","shortId":"sWBvJf","kind":"skill","title":"proteinmpnn","tagline":"Design protein sequences using ProteinMPNN inverse folding. Use this skill when: (1) Designing sequences for RFdiffusion backbones, (2) Redesigning existing protein sequences, (3) Fixing specific residues while designing others, (4) Optimizing sequences for expression or stabilit","description":"# ProteinMPNN Sequence Design\n\n## Prerequisites\n\n| Requirement | Minimum | Recommended |\n|-------------|---------|-------------|\n| Python | 3.8+ | 3.10 |\n| CUDA | 11.0+ | 11.7+ |\n| GPU VRAM | 8GB | 16GB (T4) |\n| RAM | 8GB | 16GB |\n\n## How to run\n\n> **First time?** See [Installation Guide](../../docs/installation.md) to set up Modal and biomodals.\n\n### Option 1: Local installation (recommended)\n```bash\ngit clone https://github.com/dauparas/ProteinMPNN.git\ncd ProteinMPNN\n\npython protein_mpnn_run.py \\\n  --pdb_path backbone.pdb \\\n  --out_folder output/ \\\n  --num_seq_per_target 16 \\\n  --sampling_temp \"0.1\"\n```\n\n**GPU**: T4 (16GB) sufficient | **Time**: ~50-100 sequences/minute\n\n### Option 2: Modal (via LigandMPNN wrapper)\n```bash\ncd biomodals\nmodal run modal_ligandmpnn.py \\\n  --pdb-path backbone.pdb \\\n  --num-seq-per-target 16\n```\n\nNote: LigandMPNN includes ProteinMPNN functionality.\n\n## Config Schema\n\n### Core Parameters\n\n| Parameter | Default | Range | Description |\n|-----------|---------|-------|-------------|\n| `--pdb_path` | required | path | Single PDB input |\n| `--pdb_path_chains` | all | A,B | Chains to design (comma-sep) |\n| `--out_folder` | required | path | Output directory |\n| `--num_seq_per_target` | 1 | 1-1000 | Sequences per structure |\n| `--sampling_temp` | \"0.1\" | \"0.0001-1.0\" | Temperature (string!) |\n| `--seed` | 0 | int | Random seed |\n| `--batch_size` | 1 | 1-32 | Batch size |\n\n### Temperature Guide\n```\n0.1  -> Low diversity, high recovery (production)\n0.2  -> Moderate diversity (default)\n0.3  -> Higher diversity (exploration)\n0.5+ -> Very diverse, lower quality\n```\n\n**IMPORTANT**: Temperature must be passed as a string, not float.\n\n## Common mistakes\n\n### Temperature Parameter\n✅ **Correct**:\n```bash\n--sampling_temp \"0.1\"    # String with quotes\n```\n\n❌ **Wrong**:\n```bash\n--sampling_temp 0.1      # Float without quotes - may cause errors\n--sampling_temp 0.1,0.2  # Multiple temps need proper format\n```\n\n### Fixed Positions JSONL\n✅ **Correct**:\n```json\n{\"A\": [1, 2, 3, 10, 11], \"B\": [5, 6]}\n```\n\n❌ **Wrong**:\n```json\n{\"A\": \"1,2,3,10,11\"}     # String instead of list\n{A: [1, 2, 3]}           # Missing quotes on key\n{\"A\": [1,2,3,]}          # Trailing comma\n```\n\n### Chain Selection\n✅ **Correct**:\n```bash\n--pdb_path_chains A,B    # No spaces\n```\n\n❌ **Wrong**:\n```bash\n--pdb_path_chains A, B   # Space after comma\n--pdb_path_chains \"A,B\"  # Quotes may cause issues\n```\n\n### Amino Acid Biases\n```bash\n# Bias toward certain AAs (positive = favor)\n--bias_AA_jsonl '{\"A\": {\"A\": 1.5, \"W\": -2.0}}'\n\n# Omit specific AAs globally\n--omit_AAs \"CM\"  # No cysteine or methionine\n\n# Per-position omission\n--omit_AA_jsonl '{\"A\": {\"1\": \"C\", \"2\": \"CM\"}}'\n```\n\n### Multi-Chain Design\n```bash\n# Design chains A and B together\n--pdb_path_chains A,B\n\n# Tie chains (same sequence)\n--tied_positions_jsonl tied.jsonl\n```\n\n## Variants Comparison\n\n| Variant | Use Case | Key Difference |\n|---------|----------|----------------|\n| ProteinMPNN | General | Original model |\n| SolubleMPNN | Expression | Trained on soluble proteins |\n| LigandMPNN | Small molecules | Ligand-aware context |\n\n## Output format\n\n```\noutput/\n├── seqs/\n│   └── backbone.fa          # FASTA sequences\n└── backbone_pdb/\n    └── backbone_0001.pdb    # PDBs with designed sequence\n```\n\n### FASTA Header Format\n```\n>backbone_0001, score=1.234, global_score=1.234, seq_recovery=0.85\nMKTAYIAKQRQISFVKSHFSRQLE...\n```\n\n## Common workflows\n\n### Binder Sequence Design\n```bash\npython protein_mpnn_run.py \\\n  --pdb_path binder_backbone.pdb \\\n  --out_folder output/ \\\n  --num_seq_per_target 16 \\\n  --sampling_temp \"0.1\" \\\n  --pdb_path_chains B  # Design binder chain only\n```\n\n### Interface Redesign\n```bash\n# Fix core, design interface\npython protein_mpnn_run.py \\\n  --pdb_path complex.pdb \\\n  --fixed_positions_jsonl core_positions.jsonl \\\n  --num_seq_per_target 32\n```\n\n### Multi-State Design\n```bash\n# Design for multiple conformations\npython protein_mpnn_run.py \\\n  --pdb_path_multi state1.pdb,state2.pdb \\\n  --num_seq_per_target 16\n```\n\n## Sample output\n\n### Successful run\n```\n$ python protein_mpnn_run.py --pdb_path backbone.pdb --out_folder output/ --num_seq_per_target 8\nLoading model weights...\nDesigning sequences for backbone.pdb\nGenerated 8 sequences in 2.3 seconds\n\noutput/seqs/backbone.fa:\n>backbone_0001, score=1.234, global_score=1.189, seq_recovery=0.82\nMKTAYIAKQRQISFVKSHFSRQLEERGLTKE...\n>backbone_0002, score=1.198, global_score=1.156, seq_recovery=0.79\nMKTAYIAKQRQISFVKSQFSRQLDERGLTKE...\n```\n\n**What good output looks like:**\n- Score: 1.0-2.0 (lower = more confident)\n- Seq recovery: 0.3-0.6 for de novo, 0.7-0.9 for redesign\n- Diverse sequences (not all identical) when temp > 0.1\n\n## Decision tree\n\n```\nShould I use ProteinMPNN?\n│\n├─ Have a backbone structure?\n│  ├─ Yes → Continue below\n│  └─ No → Use RFdiffusion first\n│\n├─ What's in the binding site?\n│  ├─ Nothing / protein only → ProteinMPNN ✓\n│  ├─ Small molecule / ligand → Use LigandMPNN\n│  └─ Metal / cofactor → Use LigandMPNN\n│\n├─ Priority?\n│  ├─ Solubility/expression → Consider SolubleMPNN\n│  ├─ Speed → ProteinMPNN ✓\n│  └─ AF2 optimization → Consider ColabDesign\n│\n└─ Need fixed positions?\n   ├─ Yes → Use --fixed_positions_jsonl\n   └─ No → ProteinMPNN ✓ (design all)\n```\n\n## Typical performance\n\n| Campaign Size | Time (T4) | Cost (Modal) | Notes |\n|---------------|-----------|--------------|-------|\n| 100 backbones × 8 seq | 15-20 min | ~$2 | Standard |\n| 500 backbones × 8 seq | 1-1.5h | ~$8 | Large campaign |\n| 1000 backbones × 16 seq | 3-4h | ~$18 | Comprehensive |\n\n**Throughput**: ~50-100 sequences/minute on T4 GPU.\n\n---\n\n## Verify\n\n```bash\ngrep -c \"^>\" output/seqs/*.fa  # Should match backbone_count × num_seq_per_target\n```\n\n---\n\n## Troubleshooting\n\n**Low sequence diversity**: Increase sampling_temp to 0.2-0.3\n**Poor recovery**: Decrease sampling_temp to 0.1\n**OOM errors**: Reduce batch_size\n**Unwanted cysteines**: Use --omit_AAs \"C\"\n\n### Error interpretation\n\n| Error | Cause | Fix |\n|-------|-------|-----|\n| `RuntimeError: CUDA out of memory` | Long protein or large batch | Reduce batch_size or use larger GPU |\n| `KeyError: 'A'` | Chain not in PDB | Check chain IDs in your PDB file |\n| `JSONDecodeError` | Invalid JSONL format | Validate JSON syntax (see Common Mistakes) |\n| `IndexError: list index` | Empty chain or residue list | Check PDB has atoms, not just HEADER |\n\n---\n\n**Next**: Structure prediction for validation → `protein-qc` for filtering.","tags":["proteinmpnn","protein","design","skills","adaptyvbio","agent-skills","claude-code","protein-design","protein-engineering"],"capabilities":["skill","source-adaptyvbio","skill-proteinmpnn","topic-agent-skills","topic-claude-code","topic-protein-design","topic-protein-engineering"],"categories":["protein-design-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/adaptyvbio/protein-design-skills/proteinmpnn","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add adaptyvbio/protein-design-skills","source_repo":"https://github.com/adaptyvbio/protein-design-skills","install_from":"skills.sh"}},"qualityScore":"0.513","qualityRationale":"deterministic score 0.51 from registry signals: · indexed on github topic:agent-skills · 126 github stars · SKILL.md body (6,447 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-02T12:54:49.044Z","embedding":null,"createdAt":"2026-04-18T22:10:13.043Z","updatedAt":"2026-05-02T12:54:49.044Z","lastSeenAt":"2026-05-02T12:54:49.044Z","tsv":"'-0.3':730 '-0.6':589 '-0.9':594 '-1.0':185 '-1.5':686 '-100':109,702 '-1000':177 '-2.0':350,582 '-20':677 '-32':197 '-4':696 '/../docs/installation.md':67 '/dauparas/proteinmpnn.git':84 '0':189 '0.0001':184 '0.1':102,183,202,239,247,256,471,604,737 '0.2':208,257,729 '0.3':212,588 '0.5':216 '0.7':593 '0.79':573 '0.82':562 '0.85':448 '0001':440,554 '0002':565 '1':13,75,175,176,195,196,269,280,290,298,370,685 '1.0':581 '1.156':570 '1.189':559 '1.198':567 '1.234':442,445,556 '1.5':348 '10':272,283 '100':672 '1000':691 '11':273,284 '11.0':49 '11.7':50 '15':676 '16':99,132,468,521,693 '16gb':54,58,105 '18':698 '2':19,112,270,281,291,299,372,679 '2.3':550 '3':24,271,282,292,300,695 '3.10':47 '3.8':46 '32':500 '4':31 '5':275 '50':108,701 '500':681 '6':276 '8':538,547,674,683,688 '8gb':53,57 'aa':340,344,353,356,367,747 'acid':334 'af2':647 'amino':333 'atom':805 'awar':420 'b':158,274,311,320,328,383,389,475 'backbon':18,429,439,553,564,613,673,682,692,715 'backbone.fa':426 'backbone.pdb':91,126,530,545 'backbone_0001.pdb':431 'bash':79,117,236,244,306,315,336,378,455,482,505,708 'batch':193,198,741,763,765 'bias':335,337,343 'bind':626 'binder':452,477 'binder_backbone.pdb':460 'biomod':73,119 'c':371,710,748 'campaign':665,690 'case':402 'caus':252,331,752 'cd':85,118 'certain':339 'chain':155,159,303,309,318,326,376,380,387,391,474,478,773,778,798 'check':777,802 'clone':81 'cm':357,373 'cofactor':638 'colabdesign':650 'comma':163,302,323 'comma-sep':162 'common':231,450,792 'comparison':399 'complex.pdb':491 'comprehens':699 'confid':585 'config':138 'conform':509 'consid':643,649 'context':421 'continu':616 'core':140,484 'core_positions.jsonl':495 'correct':235,266,305 'cost':669 'count':716 'cuda':48,755 'cystein':359,744 'de':591 'decis':605 'decreas':733 'default':143,211 'descript':145 'design':2,14,29,40,161,377,379,434,454,476,485,504,506,542,661 'differ':404 'directori':170 'divers':204,210,214,218,597,724 'empti':797 'error':253,739,749,751 'exist':21 'explor':215 'express':35,410 'fa':712 'fasta':427,436 'favor':342 'file':783 'filter':818 'first':62,621 'fix':25,263,483,492,652,656,753 'float':230,248 'fold':8 'folder':93,166,462,532 'format':262,423,438,787 'function':137 'general':406 'generat':546 'git':80 'github.com':83 'github.com/dauparas/proteinmpnn.git':82 'global':354,443,557,568 'good':576 'gpu':51,103,706,770 'grep':709 'guid':66,201 'h':687,697 'header':437,808 'high':205 'higher':213 'id':779 'ident':601 'import':221 'includ':135 'increas':725 'index':796 'indexerror':794 'input':152 'instal':65,77 'instead':286 'int':190 'interfac':480,486 'interpret':750 'invalid':785 'invers':7 'issu':332 'json':267,278,789 'jsondecodeerror':784 'jsonl':265,345,368,396,494,658,786 'key':296,403 'keyerror':771 'larg':689,762 'larger':769 'ligand':419,634 'ligand-awar':418 'ligandmpnn':115,134,415,636,640 'like':579 'list':288,795,801 'load':539 'local':76 'long':759 'look':578 'low':203,722 'lower':219,583 'match':714 'may':251,330 'memori':758 'metal':637 'methionin':361 'min':678 'minimum':43 'miss':293 'mistak':232,793 'mktayiakqrqisfvkshfsrql':449 'mktayiakqrqisfvkshfsrqleergltk':563 'mktayiakqrqisfvksqfsrqldergltk':574 'modal':71,113,120,670 'modal_ligandmpnn.py':122 'model':408,540 'moder':209 'molecul':417,633 'multi':375,502,514 'multi-chain':374 'multi-st':501 'multipl':258,508 'must':223 'need':260,651 'next':809 'note':133,671 'noth':628 'novo':592 'num':95,128,171,464,496,517,534,717 'num-seq-per-target':127 'omiss':365 'omit':351,355,366,746 'oom':738 'optim':32,648 'option':74,111 'origin':407 'other':30 'output':94,169,422,424,463,523,533,577 'output/seqs':711 'output/seqs/backbone.fa':552 'paramet':141,142,234 'pass':225 'path':90,125,147,149,154,168,308,317,325,386,459,473,490,513,529 'pdb':89,124,146,151,153,307,316,324,385,430,458,472,489,512,528,776,782,803 'pdb-path':123 'pdbs':432 'per':97,130,173,179,363,466,498,519,536,719 'per-posit':362 'perform':664 'poor':731 'posit':264,341,364,395,493,653,657 'predict':811 'prerequisit':41 'prioriti':641 'product':207 'proper':261 'protein':3,22,414,629,760,815 'protein-qc':814 'protein_mpnn_run.py':88,457,488,511,527 'proteinmpnn':1,6,38,86,136,405,610,631,646,660 'python':45,87,456,487,510,526 'qc':816 'qualiti':220 'quot':242,250,294,329 'ram':56 'random':191 'rang':144 'recommend':44,78 'recoveri':206,447,561,572,587,732 'redesign':20,481,596 'reduc':740,764 'requir':42,148,167 'residu':27,800 'rfdiffus':17,620 'run':61,121,525 'runtimeerror':754 'sampl':100,181,237,245,254,469,522,726,734 'schema':139 'score':441,444,555,558,566,569,580 'second':551 'see':64,791 'seed':188,192 'select':304 'sep':164 'seq':96,129,172,425,446,465,497,518,535,560,571,586,675,684,694,718 'sequenc':4,15,23,33,39,178,393,428,435,453,543,548,598,723 'sequences/minute':110,703 'set':69 'singl':150 'site':627 'size':194,199,666,742,766 'skill':11 'skill-proteinmpnn' 'small':416,632 'solubility/expression':642 'solubl':413 'solublempnn':409,644 'source-adaptyvbio' 'space':313,321 'specif':26,352 'speed':645 'stabilit':37 'standard':680 'state':503 'state1.pdb':515 'state2.pdb':516 'string':187,228,240,285 'structur':180,614,810 'success':524 'suffici':106 'syntax':790 't4':55,104,668,705 'target':98,131,174,467,499,520,537,720 'temp':101,182,238,246,255,259,470,603,727,735 'temperatur':186,200,222,233 'throughput':700 'tie':390,394 'tied.jsonl':397 'time':63,107,667 'togeth':384 'topic-agent-skills' 'topic-claude-code' 'topic-protein-design' 'topic-protein-engineering' 'toward':338 'trail':301 'train':411 'tree':606 'troubleshoot':721 'typic':663 'unwant':743 'use':5,9,401,609,619,635,639,655,745,768 'valid':788,813 'variant':398,400 'verifi':707 'via':114 'vram':52 'w':349 'weight':541 'without':249 'workflow':451 'wrapper':116 'wrong':243,277,314 'yes':615,654","prices":[{"id":"6eb88376-482c-4571-bfb5-3811dd6333d3","listingId":"15e3ad90-fed3-4568-9fd5-d82f589f7da6","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"adaptyvbio","category":"protein-design-skills","install_from":"skills.sh"},"createdAt":"2026-04-18T22:10:13.043Z"}],"sources":[{"listingId":"15e3ad90-fed3-4568-9fd5-d82f589f7da6","source":"github","sourceId":"adaptyvbio/protein-design-skills/proteinmpnn","sourceUrl":"https://github.com/adaptyvbio/protein-design-skills/tree/main/skills/proteinmpnn","isPrimary":false,"firstSeenAt":"2026-04-18T22:10:13.043Z","lastSeenAt":"2026-05-02T12:54:49.044Z"}],"details":{"listingId":"15e3ad90-fed3-4568-9fd5-d82f589f7da6","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"adaptyvbio","slug":"proteinmpnn","github":{"repo":"adaptyvbio/protein-design-skills","stars":126,"topics":["agent-skills","claude-code","protein-design","protein-engineering"],"license":"mit","html_url":"https://github.com/adaptyvbio/protein-design-skills","pushed_at":"2026-01-19T13:06:29Z","description":"Claude Code skills for protein design","skill_md_sha":"4e2fed68258e45b444212238ea5a8c77446ed8b3","skill_md_path":"skills/proteinmpnn/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/adaptyvbio/protein-design-skills/tree/main/skills/proteinmpnn"},"layout":"multi","source":"github","category":"protein-design-skills","frontmatter":{"name":"proteinmpnn","license":"MIT","description":"Design protein sequences using ProteinMPNN inverse folding. Use this skill when: (1) Designing sequences for RFdiffusion backbones, (2) Redesigning existing protein sequences, (3) Fixing specific residues while designing others, (4) Optimizing sequences for expression or stability, (5) Multi-state or negative design.  For backbone generation, use rfdiffusion or bindcraft. For ligand-aware design, use ligandmpnn. For solubility optimization, use solublempnn."},"skills_sh_url":"https://skills.sh/adaptyvbio/protein-design-skills/proteinmpnn"},"updatedAt":"2026-05-02T12:54:49.044Z"}}