{"id":"5a817f7b-7aac-4c91-b850-f524d10e007c","shortId":"7nGkbD","kind":"skill","title":"azure-ai-voicelive-py","tagline":"Build real-time voice AI applications with bidirectional WebSocket communication.","description":"# Azure AI Voice Live SDK\n\nBuild real-time voice AI applications with bidirectional WebSocket communication.\n\n## Installation\n\n```bash\npip install azure-ai-voicelive aiohttp azure-identity\n```\n\n## Environment Variables\n\n```bash\nAZURE_COGNITIVE_SERVICES_ENDPOINT=https://<region>.api.cognitive.microsoft.com\n# For API key auth (not recommended for production)\nAZURE_COGNITIVE_SERVICES_KEY=<api-key>\n```\n\n## Authentication\n\n**DefaultAzureCredential (preferred)**:\n```python\nfrom azure.ai.voicelive.aio import connect\nfrom azure.identity.aio import DefaultAzureCredential\n\nasync with connect(\n    endpoint=os.environ[\"AZURE_COGNITIVE_SERVICES_ENDPOINT\"],\n    credential=DefaultAzureCredential(),\n    model=\"gpt-4o-realtime-preview\",\n    credential_scopes=[\"https://cognitiveservices.azure.com/.default\"]\n) as conn:\n    ...\n```\n\n**API Key**:\n```python\nfrom azure.ai.voicelive.aio import connect\nfrom azure.core.credentials import AzureKeyCredential\n\nasync with connect(\n    endpoint=os.environ[\"AZURE_COGNITIVE_SERVICES_ENDPOINT\"],\n    credential=AzureKeyCredential(os.environ[\"AZURE_COGNITIVE_SERVICES_KEY\"]),\n    model=\"gpt-4o-realtime-preview\"\n) as conn:\n    ...\n```\n\n## Quick Start\n\n```python\nimport asyncio\nimport os\nfrom azure.ai.voicelive.aio import connect\nfrom azure.identity.aio import DefaultAzureCredential\n\nasync def main():\n    async with connect(\n        endpoint=os.environ[\"AZURE_COGNITIVE_SERVICES_ENDPOINT\"],\n        credential=DefaultAzureCredential(),\n        model=\"gpt-4o-realtime-preview\",\n        credential_scopes=[\"https://cognitiveservices.azure.com/.default\"]\n    ) as conn:\n        # Update session with instructions\n        await conn.session.update(session={\n           
 \"instructions\": \"You are a helpful assistant.\",\n            \"modalities\": [\"text\", \"audio\"],\n            \"voice\": \"alloy\"\n        })\n        \n        # Listen for events\n        async for event in conn:\n            print(f\"Event: {event.type}\")\n            if event.type == \"response.audio_transcript.done\":\n                print(f\"Transcript: {event.transcript}\")\n            elif event.type == \"response.done\":\n                break\n\nasyncio.run(main())\n```\n\n## Core Architecture\n\n### Connection Resources\n\nThe `VoiceLiveConnection` exposes these resources:\n\n| Resource | Purpose | Key Methods |\n|----------|---------|-------------|\n| `conn.session` | Session configuration | `update(session=...)` |\n| `conn.response` | Model responses | `create()`, `cancel()` |\n| `conn.input_audio_buffer` | Audio input | `append()`, `commit()`, `clear()` |\n| `conn.output_audio_buffer` | Audio output | `clear()` |\n| `conn.conversation` | Conversation state | `item.create()`, `item.delete()`, `item.truncate()` |\n| `conn.transcription_session` | Transcription config | `update(session=...)` |\n\n## Session Configuration\n\n```python\nfrom azure.ai.voicelive.models import RequestSession, FunctionTool\n\nawait conn.session.update(session=RequestSession(\n    instructions=\"You are a helpful voice assistant.\",\n    modalities=[\"text\", \"audio\"],\n    voice=\"alloy\",  # or \"echo\", \"shimmer\", \"sage\", etc.\n    input_audio_format=\"pcm16\",\n    output_audio_format=\"pcm16\",\n    turn_detection={\n        \"type\": \"server_vad\",\n        \"threshold\": 0.5,\n        \"prefix_padding_ms\": 300,\n        \"silence_duration_ms\": 500\n    },\n    tools=[\n        FunctionTool(\n            type=\"function\",\n            name=\"get_weather\",\n            description=\"Get current weather\",\n            parameters={\n                \"type\": \"object\",\n                \"properties\": {\n                    \"location\": {\"type\": 
\"string\"}\n                },\n                \"required\": [\"location\"]\n            }\n        )\n    ]\n))\n```\n\n## Audio Streaming\n\n### Send Audio (Base64 PCM16)\n\n```python\nimport base64\n\n# Read audio chunk (16-bit PCM, 24kHz mono)\naudio_chunk = await read_audio_from_microphone()\nb64_audio = base64.b64encode(audio_chunk).decode()\n\nawait conn.input_audio_buffer.append(audio=b64_audio)\n```\n\n### Receive Audio\n\n```python\nasync for event in conn:\n    if event.type == \"response.audio.delta\":\n        audio_bytes = base64.b64decode(event.delta)\n        await play_audio(audio_bytes)\n    elif event.type == \"response.audio.done\":\n        print(\"Audio complete\")\n```\n\n## Event Handling\n\n```python\nasync for event in conn:\n    match event.type:\n        # Session events\n        case \"session.created\":\n            print(f\"Session: {event.session}\")\n        case \"session.updated\":\n            print(\"Session updated\")\n        \n        # Audio input events\n        case \"input_audio_buffer.speech_started\":\n            print(f\"Speech started at {event.audio_start_ms}ms\")\n        case \"input_audio_buffer.speech_stopped\":\n            print(f\"Speech stopped at {event.audio_end_ms}ms\")\n        \n        # Transcription events\n        case \"conversation.item.input_audio_transcription.completed\":\n            print(f\"User said: {event.transcript}\")\n        case \"conversation.item.input_audio_transcription.delta\":\n            print(f\"Partial: {event.delta}\")\n        \n        # Response events\n        case \"response.created\":\n            print(f\"Response started: {event.response.id}\")\n        case \"response.audio_transcript.delta\":\n            print(event.delta, end=\"\", flush=True)\n        case \"response.audio.delta\":\n            audio = base64.b64decode(event.delta)\n        case \"response.done\":\n            print(f\"Response complete: {event.response.status}\")\n        \n        # 
Function calls\n        case \"response.function_call_arguments.done\":\n            result = handle_function(event.name, event.arguments)\n            await conn.conversation.item.create(item={\n                \"type\": \"function_call_output\",\n                \"call_id\": event.call_id,\n                \"output\": json.dumps(result)\n            })\n            await conn.response.create()\n        \n        # Errors\n        case \"error\":\n            print(f\"Error: {event.error.message}\")\n```\n\n## Common Patterns\n\n### Manual Turn Mode (No VAD)\n\n```python\nawait conn.session.update(session={\"turn_detection\": None})\n\n# Manually control turns\nawait conn.input_audio_buffer.append(audio=b64_audio)\nawait conn.input_audio_buffer.commit()  # End of user turn\nawait conn.response.create()  # Trigger response\n```\n\n### Interrupt Handling\n\n```python\nasync for event in conn:\n    if event.type == \"input_audio_buffer.speech_started\":\n        # User interrupted - cancel current response\n        await conn.response.cancel()\n        await conn.output_audio_buffer.clear()\n```\n\n### Conversation History\n\n```python\n# Add system message\nawait conn.conversation.item.create(item={\n    \"type\": \"message\",\n    \"role\": \"system\",\n    \"content\": [{\"type\": \"input_text\", \"text\": \"Be concise.\"}]\n})\n\n# Add user message\nawait conn.conversation.item.create(item={\n    \"type\": \"message\",\n    \"role\": \"user\", \n    \"content\": [{\"type\": \"input_text\", \"text\": \"Hello!\"}]\n})\n\nawait conn.response.create()\n```\n\n## Voice Options\n\n| Voice | Description |\n|-------|-------------|\n| `alloy` | Neutral, balanced |\n| `echo` | Warm, conversational |\n| `shimmer` | Clear, professional |\n| `sage` | Calm, authoritative |\n| `coral` | Friendly, upbeat |\n| `ash` | Deep, measured |\n| `ballad` | Expressive |\n| `verse` | Storytelling |\n\nAzure voices: Use `AzureStandardVoice`, `AzureCustomVoice`, or `AzurePersonalVoice` 
models.\n\n## Audio Formats\n\n| Format | Sample Rate | Use Case |\n|--------|-------------|----------|\n| `pcm16` | 24kHz | Default, high quality |\n| `pcm16-8000hz` | 8kHz | Telephony |\n| `pcm16-16000hz` | 16kHz | Voice assistants |\n| `g711_ulaw` | 8kHz | Telephony (US) |\n| `g711_alaw` | 8kHz | Telephony (EU) |\n\n## Turn Detection Options\n\n```python\n# Server VAD (default)\n{\"type\": \"server_vad\", \"threshold\": 0.5, \"silence_duration_ms\": 500}\n\n# Azure Semantic VAD (smarter detection)\n{\"type\": \"azure_semantic_vad\"}\n{\"type\": \"azure_semantic_vad_en\"}  # English optimized\n{\"type\": \"azure_semantic_vad_multilingual\"}\n```\n\n## Error Handling\n\n```python\nfrom azure.ai.voicelive.aio import ConnectionError, ConnectionClosed\n\ntry:\n    async with connect(...) as conn:\n        async for event in conn:\n            if event.type == \"error\":\n                print(f\"API Error: {event.error.code} - {event.error.message}\")\nexcept ConnectionClosed as e:\n    print(f\"Connection closed: {e.code} - {e.reason}\")\nexcept ConnectionError as e:\n    print(f\"Connection error: {e}\")\n```\n\n## References\n\n- **Detailed API Reference**: See references/api-reference.md\n- **Complete Examples**: See references/examples.md\n- **All Models & Types**: See references/models.md\n\n## When to Use\nUse this skill to carry out the workflows and actions described in the overview above.\n\n## Limitations\n- Use this skill only when the task clearly matches the scope described above.\n- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.\n- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are 
missing.","tags":["azure","voicelive","antigravity","awesome","skills","sickn33","agent-skills","agentic-skills","ai-agent-skills","ai-agents","ai-coding","ai-workflows"],"capabilities":["skill","source-sickn33","skill-azure-ai-voicelive-py","topic-agent-skills","topic-agentic-skills","topic-ai-agent-skills","topic-ai-agents","topic-ai-coding","topic-ai-workflows","topic-antigravity","topic-antigravity-skills","topic-claude-code","topic-claude-code-skills","topic-codex-cli","topic-codex-skills"],"categories":["antigravity-awesome-skills"],"synonyms":[],"warnings":[],"endpointUrl":"https://skills.sh/sickn33/antigravity-awesome-skills/azure-ai-voicelive-py","protocol":"skill","transport":"skills-sh","auth":{"type":"none","details":{"cli":"npx skills add sickn33/antigravity-awesome-skills","source_repo":"https://github.com/sickn33/antigravity-awesome-skills","install_from":"skills.sh"}},"qualityScore":"0.700","qualityRationale":"deterministic score 0.70 from registry signals: · indexed on github topic:agent-skills · 34928 github stars · SKILL.md body (8,982 chars)","verified":false,"liveness":"unknown","lastLivenessCheck":null,"agentReviews":{"count":0,"score_avg":null,"cost_usd_avg":null,"success_rate":null,"latency_p50_ms":null,"narrative_summary":null,"summary_updated_at":null},"enrichmentModel":"deterministic:skill-github:v1","enrichmentVersion":1,"enrichedAt":"2026-04-24T18:50:29.013Z","embedding":null,"createdAt":"2026-04-18T21:32:07.916Z","updatedAt":"2026-04-24T18:50:29.013Z","lastSeenAt":"2026-04-24T18:50:29.013Z","prices":[{"id":"f08bf506-d49a-4d00-9fc0-0cb4b85eae7d","listingId":"5a817f7b-7aac-4c91-b850-f524d10e007c","amountUsd":"0","unit":"free","nativeCurrency":null,"nativeAmount":null,"chain":null,"payTo":null,"paymentMethod":"skill-free","isPrimary":true,"details":{"org":"sickn33","category":"antigravity-awesome-skills","install_from":"skills.sh"},"createdAt":"2026-04-18T21:32:07.916Z"}],"sources":[{"listingId":"5a817f7b-7aac-4c91-b850-f524d10e007c","source":"github","sourceId":"sickn33/antigravity-awesome-skills/azure-ai-voicelive-py","sourceUrl":"https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/azure-ai-voicelive-py","isPrimary":false,"firstSeenAt":"2026-04-18T21:32:07.916Z","lastSeenAt":"2026-04-24T18:50:29.013Z"}],"details":{"listingId":"5a817f7b-7aac-4c91-b850-f524d10e007c","quickStartSnippet":null,"exampleRequest":null,"exampleResponse":null,"schema":null,"openapiUrl":null,"agentsTxtUrl":null,"citations":[],"useCases":[],"bestFor":[],"notFor":[],"kindDetails":{"org":"sickn33","slug":"azure-ai-voicelive-py","github":{"repo":"sickn33/antigravity-awesome-skills","stars":34928,"topics":["agent-skills","agentic-skills","ai-agent-skills","ai-agents","ai-coding","ai-workflows","antigravity","antigravity-skills","claude-code","claude-code-skills","codex-cli","codex-skills","cursor","cursor-skills","developer-tools","gemini-cli","gemini-skills","kiro","mcp","skill-library"],"license":"mit","html_url":"https://github.com/sickn33/antigravity-awesome-skills","pushed_at":"2026-04-24T06:41:17Z","description":"Installable 
GitHub library of 1,400+ agentic skills for Claude Code, Cursor, Codex CLI, Gemini CLI, Antigravity, and more. Includes installer CLI, bundles, workflows, and official/community skill collections.","skill_md_sha":"7cb9f9c421015b6056e33bf9e7e6e2dc07b95dab","skill_md_path":"skills/azure-ai-voicelive-py/SKILL.md","default_branch":"main","skill_tree_url":"https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/azure-ai-voicelive-py"},"layout":"multi","source":"github","category":"antigravity-awesome-skills","frontmatter":{"name":"azure-ai-voicelive-py","description":"Build real-time voice AI applications with bidirectional WebSocket communication."},"skills_sh_url":"https://skills.sh/sickn33/antigravity-awesome-skills/azure-ai-voicelive-py"},"updatedAt":"2026-04-24T18:50:29.013Z"}}