{"id":"bd43ad76-ea85-4ed1-90c7-47f943824bb3","shortId":"gd3Kv8","kind":"skill","title":"RunPod Serverless GPU Inference","tagline":"Deploy and manage GPU inference endpoints on RunPod Serverless using their REST API. Handles endpoint creation, cold start optimization, request queuing, and auto-scaling configuration for image generation models.","description":"# RunPod Serverless GPU Inference\n\nDeploy and manage GPU inference endpoints on RunPod Serverless using their REST API. Handles endpoint creation, cold start optimization, request queuing, and auto-scaling configuration for image generation models.\n\n## Installation\n\nUse the upstream install or setup path that matches your environment:\n- Make API requests\n\nRequirements and caveats from upstream:\n- The container instances that execute your code when requests arrive at your endpoint. Each worker runs your custom Docker container with your application code and dependencies. Runpod automatically manages worker life...\n- Build and push the worker image to Docker Hub (or another container registry).\n\nBasic usage or getting-started notes:\n- Concepts\n- Manage API keys\n- Agent skills NEW\n\n- Source: https://docs.runpod.io/serverless/overview\n\n## Source\n\n- [Agent Skill Exchange](https://agentskillexchange.com/skills/runpod-serverless-gpu-inference/)","tags":["runpod","serverless","gpu","inference","skills","agentskillexchange","agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex"],"capabilities":["skill","source-agentskillexchange","skill-runpod-serverless-gpu-inference","topic-agent-skills","topic-ai-agents","topic-ai-tools","topic-awesome-list","topic-claude-code","topic-codex","topic-cursor","topic-llm","topic-mcp","topic-npx-skills","topic-openclaw","topic-skills-catalog"],"categories":["skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/agentskillexchange/skills/runpod-serverless-gpu-inference","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add agentskillexchange/skills","source_repo":"https://github.com/agentskillexchange/skills","install_from":"skills.sh"}},"qualityScore":"0.454","qualityRationale":"deterministic score 0.45 from registry signals: · indexed on github topic:agent-skills · 8 github stars · SKILL.md body (951 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-05-18T19:12:15.515Z","embedding":null,"createdAt":"2026-05-18T13:19:04.336Z","updatedAt":"2026-05-18T19:12:15.515Z","lastSeenAt":"2026-05-18T19:12:15.515Z","tsv":"'/serverless/overview':150 '/skills/runpod-serverless-gpu-inference/)':157 'agent':144,152 'agentskillexchange.com':156 'agentskillexchange.com/skills/runpod-serverless-gpu-inference/)':155 'anoth':130 'api':17,51,82,142 'applic':111 'arriv':98 'auto':28,62 'auto-sc':27,61 'automat':116 'basic':133 'build':120 'caveat':86 'code':95,112 'cold':21,55 'concept':140 'configur':30,64 'contain':90,108,131 'creation':20,54 'custom':106 'depend':114 'deploy':5,39 'docker':107,127 'docs.runpod.io':149 'docs.runpod.io/serverless/overview':148 'endpoint':10,19,44,53,101 'environ':80 'exchang':154 'execut':93 'generat':33,67 'get':137 'getting-start':136 'gpu':3,8,37,42 'handl':18,52 'hub':128 'imag':32,66,125 'infer':4,9,38,43 'instal':69,73 'instanc':91 'key':143 'life':119 'make':81 'manag':7,41,117,141 'match':78 'model':34,68 'new':146 'note':139 'optim':23,57 'path':76 'push':122 'queu':25,59 'registri':132 'request':24,58,83,97 'requir':84 'rest':16,50 'run':104 'runpod':1,12,35,46,115 'scale':29,63 'serverless':2,13,36,47 'setup':75 'skill':145,153 'skill-runpod-serverless-gpu-inference' 'sourc':147,151 'source-agentskillexchange' 'start':22,56,138 'topic-agent-skills' 'topic-ai-agents' 'topic-ai-tools' 'topic-awesome-list' 'topic-claude-code' 'topic-codex' 'topic-cursor' 'topic-llm' 'topic-mcp' 'topic-npx-skills' 'topic-openclaw' 'topic-skills-catalog' 'upstream':72,88 'usag':134 'use':14,48,70 'worker':103,118,124","prices":[{"id":"630334f9-811b-4794-8c85-80ec9b34efce","listingId":"bd43ad76-ea85-4ed1-90c7-47f943824bb3","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"agentskillexchange","category":"skills","install_from":"skills.sh"},"createdAt":"2026-05-18T13:19:04.336Z"}],"sources":[{"listingId":"bd43ad76-ea85-4ed1-90c7-47f943824bb3","source":"github","sourceId":"agentskillexchange/skills/runpod-serverless-gpu-inference","sourceUrl":"https://github.com/agentskillexchange/skills/tree/main/skills/runpod-serverless-gpu-inference","isPrimary":false,"firstSeenAt":"2026-05-18T13:19:04.336Z","lastSeenAt":"2026-05-18T19:12:15.515Z"}],"details":{"listingId":"bd43ad76-ea85-4ed1-90c7-47f943824bb3","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"agentskillexchange","slug":"runpod-serverless-gpu-inference","github":{"repo":"agentskillexchange/skills","stars":8,"topics":["agent-skills","ai-agents","ai-tools","awesome-list","claude-code","codex","cursor","llm","mcp","npx-skills","openclaw","skills-catalog"],"license":"mit","html_url":"https://github.com/agentskillexchange/skills","pushed_at":"2026-05-18T19:02:17Z","description":"The open catalog of AI agent skills — 2,000+ security-scanned skills for Claude Code, Cursor, Codex, and more.","skill_md_sha":"9136fa57b32692dcdf4dde357678d562ba67b62b","skill_md_path":"skills/runpod-serverless-gpu-inference/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/agentskillexchange/skills/tree/main/skills/runpod-serverless-gpu-inference"},"layout":"multi","source":"github","category":"skills","frontmatter":{"name":"RunPod Serverless GPU Inference","description":"Deploy and manage GPU inference endpoints on RunPod Serverless using their REST API. Handles endpoint creation, cold start optimization, request queuing, and auto-scaling configuration for image generation models."},"skills_sh_url":"https://skills.sh/agentskillexchange/skills/runpod-serverless-gpu-inference"},"updatedAt":"2026-05-18T19:12:15.515Z"}}