{"id":"9c8ad9ed-b669-4c5d-a0ad-e848f0e6a72d","shortId":"Gs4Bsk","kind":"skill","title":"podcast-generation","tagline":"Generate real audio narratives from text content using Azure OpenAI's Realtime API.","description":"# Podcast Generation with GPT Realtime Mini\n\nGenerate real audio narratives from text content using Azure OpenAI's Realtime API.\n\n## Quick Start\n\n1. Configure environment variables for Realtime API\n2. Connect via WebSocket to Azure OpenAI Realtime endpoint\n3. Send text prompt, collect PCM audio chunks + transcript\n4. Convert PCM to WAV format\n5. Return base64-encoded audio to frontend for playback\n\n## Environment Configuration\n\n```env\nAZURE_OPENAI_AUDIO_API_KEY=your_realtime_api_key\nAZURE_OPENAI_AUDIO_ENDPOINT=https://your-resource.cognitiveservices.azure.com\nAZURE_OPENAI_AUDIO_DEPLOYMENT=gpt-realtime-mini\n```\n\n**Note**: Endpoint should NOT include `/openai/v1/` - just the base URL.\n\n## Core Workflow\n\n### Backend Audio Generation\n\n```python\nfrom openai import AsyncOpenAI\nimport base64\n\n# Convert HTTPS endpoint to WebSocket URL\nws_url = endpoint.replace(\"https://\", \"wss://\") + \"/openai/v1\"\n\nclient = AsyncOpenAI(\n    websocket_base_url=ws_url,\n    api_key=api_key\n)\n\naudio_chunks = []\ntranscript_parts = []\n\nasync with client.realtime.connect(model=\"gpt-realtime-mini\") as conn:\n    # Configure for audio-only output\n    await conn.session.update(session={\n        \"output_modalities\": [\"audio\"],\n        \"instructions\": \"You are a narrator. Speak naturally.\"\n    })\n    \n    # Send text to narrate\n    await conn.conversation.item.create(item={\n        \"type\": \"message\",\n        \"role\": \"user\",\n        \"content\": [{\"type\": \"input_text\", \"text\": prompt}]\n    })\n    \n    await conn.response.create()\n    \n    # Collect streaming events\n    async for event in conn:\n        if event.type == \"response.output_audio.delta\":\n            audio_chunks.append(base64.b64decode(event.delta))\n        elif event.type == \"response.output_audio_transcript.delta\":\n            transcript_parts.append(event.delta)\n        elif event.type == \"response.done\":\n            break\n\n# Convert PCM to WAV (see scripts/pcm_to_wav.py)\npcm_audio = b''.join(audio_chunks)\nwav_audio = pcm_to_wav(pcm_audio, sample_rate=24000)\n```\n\n### Frontend Audio Playback\n\n```javascript\n// Convert base64 WAV to playable blob\nconst base64ToBlob = (base64, mimeType) => {\n  const bytes = atob(base64);\n  const arr = new Uint8Array(bytes.length);\n  for (let i = 0; i < bytes.length; i++) arr[i] = bytes.charCodeAt(i);\n  return new Blob([arr], { type: mimeType });\n};\n\nconst audioBlob = base64ToBlob(response.audio_data, 'audio/wav');\nconst audioUrl = URL.createObjectURL(audioBlob);\nnew Audio(audioUrl).play();\n```\n\n## Voice Options\n\n| Voice | Character |\n|-------|-----------|\n| alloy | Neutral |\n| echo | Warm |\n| fable | Expressive |\n| onyx | Deep |\n| nova | Friendly |\n| shimmer | Clear |\n\n## Realtime API Events\n\n- `response.output_audio.delta` - Base64 audio chunk\n- `response.output_audio_transcript.delta` - Transcript text\n- `response.done` - Generation complete\n- `error` - Handle with `event.error.message`\n\n## Audio Format\n\n- **Input**: Text prompt\n- **Output**: PCM audio (24kHz, 16-bit, mono)\n- **Storage**: Base64-encoded WAV\n\n## References\n\n- **Full architecture**: See references/architecture.md for complete stack design\n- **Code examples**: See references/code-examples.md for production patterns\n- **PCM conversion**: Use scripts/pcm_to_wav.py for audio format conversion\n\n## When to Use\nThis skill is applicable to execute the workflow or actions described in the overview.\n\n## Limitations\n- Use this skill only when the task clearly matches the scope described above.\n- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.\n- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.","tags":["podcast","generation","antigravity","awesome","skills","sickn33","agent-skills","agentic-skills","ai-agent-skills","ai-agents","ai-coding","ai-workflows"],"capabilities":["skill","source-sickn33","skill-podcast-generation","topic-agent-skills","topic-agentic-skills","topic-ai-agent-skills","topic-ai-agents","topic-ai-coding","topic-ai-workflows","topic-antigravity","topic-antigravity-skills","topic-claude-code","topic-claude-code-skills","topic-codex-cli","topic-codex-skills"],"categories":["antigravity-awesome-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/sickn33/antigravity-awesome-skills/podcast-generation","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add sickn33/antigravity-awesome-skills","source_repo":"https://github.com/sickn33/antigravity-awesome-skills","install_from":"skills.sh"}},"qualityScore":"0.700","qualityRationale":"deterministic score 0.70 from registry signals: · indexed on github topic:agent-skills · 34616 github stars · SKILL.md body (3,688 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-04-23T00:51:24.145Z","embedding":null,"createdAt":"2026-04-18T21:42:30.345Z","updatedAt":"2026-04-23T00:51:24.145Z","lastSeenAt":"2026-04-23T00:51:24.145Z","tsv":"'/openai/v1':109,135 '0':270 '1':38 '16':340 '2':45 '24000':243 '24khz':339 '3':54 '4':63 '5':69 'action':384 'alloy':302 'api':16,35,44,85,89,143,145,315 'applic':378 'architectur':350 'arr':263,274,281 'ask':422 'async':151,202 'asyncopenai':123,137 'atob':260 'audio':6,25,60,74,84,93,98,117,147,164,172,229,232,235,240,245,295,319,331,338,369 'audio-on':163 'audio/wav':289 'audio_chunks.append':210 'audioblob':285,293 'audiourl':291,296 'await':167,184,197 'azur':12,31,50,82,91,96 'b':230 'backend':116 'base':112,139 'base64':72,125,249,256,261,318,345 'base64-encoded':71,344 'base64.b64decode':211 'base64toblob':255,286 'bit':341 'blob':253,280 'boundari':430 'break':221 'byte':259 'bytes.charcodeat':276 'bytes.length':266,272 'charact':301 'chunk':61,148,233,320 'clarif':424 'clear':313,397 'client':136 'client.realtime.connect':153 'code':357 'collect':58,199 'complet':326,354 'configur':39,80,161 'conn':160,206 'conn.conversation.item.create':185 'conn.response.create':198 'conn.session.update':168 'connect':46 'const':254,258,262,284,290 'content':10,29,191 'convers':365,371 'convert':64,126,222,248 'core':114 'criteria':433 'data':288 'deep':309 'deploy':99 'describ':385,401 'design':356 'echo':304 'elif':213,218 'encod':73,346 'endpoint':53,94,105,128 'endpoint.replace':134 'env':81 'environ':40,79,413 'environment-specif':412 'error':327 'event':201,204,316 'event.delta':212,217 'event.error.message':330 'event.type':208,214,219 'exampl':358 'execut':380 'expert':418 'express':307 'fabl':306 'format':68,332,370 'friend':311 'frontend':76,244 'full':349 'generat':3,4,18,23,118,325 'gpt':20,101,156 'gpt-realtime-mini':100,155 'handl':328 'https':127 'import':122,124 'includ':108 'input':193,333,427 'instruct':173 'item':186 'javascript':247 'join':231 'key':86,90,144,146 'let':268 'limit':389 'match':398 'messag':188 'mimetyp':257,283 'mini':22,103,158 'miss':435 'modal':171 'model':154 'mono':342 'narrat':7,26,177,183 'natur':179 'neutral':303 'new':264,279,294 'note':104 'nova':310 'onyx':308 'openai':13,32,51,83,92,97,121 'option':299 'output':166,170,336,407 'overview':388 'part':150 'pattern':363 'pcm':59,65,223,228,236,239,337,364 'permiss':428 'play':297 'playabl':252 'playback':78,246 'podcast':2,17 'podcast-gener':1 'product':362 'prompt':57,196,335 'python':119 'quick':36 'rate':242 'real':5,24 'realtim':15,21,34,43,52,88,102,157,314 'refer':348 'references/architecture.md':352 'references/code-examples.md':360 'requir':426 'response.audio':287 'response.done':220,324 'response.output_audio.delta':209,317 'response.output_audio_transcript.delta':215,321 'return':70,278 'review':419 'role':189 'safeti':429 'sampl':241 'scope':400 'scripts/pcm_to_wav.py':227,367 'see':226,351,359 'send':55,180 'session':169 'shimmer':312 'skill':376,392 'skill-podcast-generation' 'source-sickn33' 'speak':178 'specif':414 'stack':355 'start':37 'stop':420 'storag':343 'stream':200 'substitut':410 'success':432 'task':396 'test':416 'text':9,28,56,181,194,195,323,334 'topic-agent-skills' 'topic-agentic-skills' 'topic-ai-agent-skills' 'topic-ai-agents' 'topic-ai-coding' 'topic-ai-workflows' 'topic-antigravity' 'topic-antigravity-skills' 'topic-claude-code' 'topic-claude-code-skills' 'topic-codex-cli' 'topic-codex-skills' 'transcript':62,149,322 'transcript_parts.append':216 'treat':405 'type':187,192,282 'uint8array':265 'url':113,131,133,140,142 'url.createobjecturl':292 'use':11,30,366,374,390 'user':190 'valid':415 'variabl':41 'via':47 'voic':298,300 'warm':305 'wav':67,225,234,238,250,347 'websocket':48,130,138 'workflow':115,382 'ws':132,141 'your-resource.cognitiveservices.azure.com':95","prices":[{"id":"85d2fa23-2a40-4dca-927f-a73938795c38","listingId":"9c8ad9ed-b669-4c5d-a0ad-e848f0e6a72d","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"sickn33","category":"antigravity-awesome-skills","install_from":"skills.sh"},"createdAt":"2026-04-18T21:42:30.345Z"}],"sources":[{"listingId":"9c8ad9ed-b669-4c5d-a0ad-e848f0e6a72d","source":"github","sourceId":"sickn33/antigravity-awesome-skills/podcast-generation","sourceUrl":"https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/podcast-generation","isPrimary":false,"firstSeenAt":"2026-04-18T21:42:30.345Z","lastSeenAt":"2026-04-23T00:51:24.145Z"}],"details":{"listingId":"9c8ad9ed-b669-4c5d-a0ad-e848f0e6a72d","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"sickn33","slug":"podcast-generation","github":{"repo":"sickn33/antigravity-awesome-skills","stars":34616,"topics":["agent-skills","agentic-skills","ai-agent-skills","ai-agents","ai-coding","ai-workflows","antigravity","antigravity-skills","claude-code","claude-code-skills","codex-cli","codex-skills","cursor","cursor-skills","developer-tools","gemini-cli","gemini-skills","kiro","mcp","skill-library"],"license":"mit","html_url":"https://github.com/sickn33/antigravity-awesome-skills","pushed_at":"2026-04-22T06:40:00Z","description":"Installable GitHub library of 1,400+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. Includes installer CLI, bundles, workflows, and official/community skill collections.","skill_md_sha":"da3f22c04d9c1b414ba5265424729fdde0e57858","skill_md_path":"skills/podcast-generation/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/podcast-generation"},"layout":"multi","source":"github","category":"antigravity-awesome-skills","frontmatter":{"name":"podcast-generation","description":"Generate real audio narratives from text content using Azure OpenAI's Realtime API."},"skills_sh_url":"https://skills.sh/sickn33/antigravity-awesome-skills/podcast-generation"},"updatedAt":"2026-04-23T00:51:24.145Z"}}