Skill quality 0.70

azure-speech

Expert knowledge for Azure AI Speech development including troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when using Speech/Voice SDKs, STT/TTS, custom

Price
free
Protocol
skill
Verified
no

What it does

Azure AI Speech Skill

This skill provides expert guidance for Azure AI Speech. It covers troubleshooting, best practices, decision making, architecture & design patterns, limits & quotas, security, configuration, integrations & coding patterns, and deployment, combining local quick-reference content with remote documentation fetching.

How to Use This Skill

IMPORTANT for Agent: Use the Category Index below to locate relevant sections. For categories with line ranges (e.g., L35-L120), use read_file with the specified lines. For categories with file links (e.g., [security.md](security.md)), use read_file on the linked reference file.
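As a rough illustration of the line-range lookup described above, the sketch below parses a Category Index range such as L37-L47 and returns the matching lines from a text. The helper names parse_line_range and select_lines are hypothetical, and the sketch assumes ranges are 1-indexed and inclusive; it is not the read_file tool itself.

```python
import re


def parse_line_range(spec: str) -> tuple:
    """Parse a range such as 'L37-L47' into (37, 47).

    Assumption: ranges are 1-indexed and inclusive on both ends.
    """
    match = re.fullmatch(r"L(\d+)-L(\d+)", spec.strip())
    if match is None:
        raise ValueError(f"not a line range: {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid line range: {spec!r}")
    return start, end


def select_lines(text: str, spec: str) -> list:
    """Return the lines of `text` covered by the range `spec` (hypothetical helper)."""
    start, end = parse_line_range(spec)
    lines = text.splitlines()
    # Convert the 1-indexed inclusive range to a Python slice.
    return lines[start - 1:end]
```

An agent-side wrapper could call select_lines on the SKILL.md body with the range listed for a category, then follow the URLs found in that slice.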

IMPORTANT for Agent: If metadata.generated_at is more than 3 months old, suggest the user pull the latest version from the repository. If mcp_microsoftdocs tools are not available, suggest the user install it: Installation Guide
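The freshness check above could be sketched as follows. This is a minimal illustration that assumes metadata.generated_at is an ISO 8601 string beginning with YYYY-MM-DD, and that "3 months" can be approximated as 90 days; the function name is_stale and both assumptions are not part of the skill.

```python
from datetime import date, timedelta


def is_stale(generated_at: str, today: date, max_age_days: int = 90) -> bool:
    """Return True when the skill metadata is older than the allowed age.

    Assumptions: `generated_at` starts with an ISO 8601 date (YYYY-MM-DD),
    and three months is approximated as 90 days.
    """
    # Only the date part matters here, so any time or zone suffix is ignored.
    generated = date.fromisoformat(generated_at[:10])
    return (today - generated) > timedelta(days=max_age_days)
```

When is_stale returns True, the agent would suggest pulling the latest version from the repository, as the instruction describes.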

This skill requires network access to fetch documentation content:

  • Preferred: Use mcp_microsoftdocs:microsoft_docs_fetch with query string from=learn-agent-skill. Returns Markdown.
  • Fallback: Use fetch_webpage with query string from=learn-agent-skill&accept=text/markdown. Returns Markdown.
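The query strings above can be attached to a documentation URL along these lines. The sketch only builds the URL; the helper name with_skill_params is hypothetical, and how each fetch tool actually consumes the query string is an assumption rather than something this listing specifies.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_skill_params(url: str, markdown: bool = False) -> str:
    """Append from=learn-agent-skill (and optionally accept=text/markdown).

    Hypothetical helper: set markdown=True for the fallback fetch path.
    """
    parts = urlsplit(url)
    # Preserve any query parameters the URL already carries.
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["from"] = "learn-agent-skill"
    if markdown:
        query["accept"] = "text/markdown"
    # safe="/" keeps "text/markdown" readable instead of "text%2Fmarkdown".
    return urlunsplit(parts._replace(query=urlencode(query, safe="/")))
```

For example, calling with_skill_params on a Troubleshooting URL with markdown=True yields the same URL followed by ?from=learn-agent-skill&accept=text/markdown.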

Category Index

| Category | Lines | Description |
| --- | --- | --- |
| Troubleshooting | L37-L47 | Diagnosing and fixing common Azure Speech/Text-to-Speech/Voice Live API and SDK errors, container and Foundry issues, CRL/compatibility problems, and retrieving session/transcription IDs for support. |
| Best Practices | L48-L64 | Best practices for audio/video prep, custom voice/avatars, latency and memory tuning, phrase/keyword optimization, and handling real-time Voice Live interactions and interruptions |
| Decision Making | L65-L83 | Guidance on choosing speech features (batch STT, custom/embedded/personal/Whisper), evaluating models/devices, and step‑by‑step migration between Speech API versions and services |
| Architecture & Design Patterns | L84-L88 | Architectural guidance for building call center voice agents using Azure AI Speech with Voice Live and Azure Communication Services, including integration patterns and design best practices. |
| Limits & Quotas | L89-L97 | Quotas, limits, and usage patterns for Azure Speech: batch TTS, custom/pro voice training & deployment, and short audio STT, plus throttling and capacity planning guidance. |
| Security | L98-L109 | Configuring security for Azure AI Speech: auth (Entra, RBAC), network isolation (VNet, Private Link, sovereign clouds), BYOS storage, encryption/keys, and voice talent consent management. |
| Configuration | L110-L144 | Configuring Azure AI Speech/Voice: audio inputs, logging, storage, SSML, languages/voices, custom speech & voice training, batch/real-time settings, and Voice Live/avatars options. |
| Integrations & Coding Patterns | L145-L168 | Integrating Azure AI Speech into apps and voice agents: SDK/REST usage, telephony, TTS/avatars, translation, LLM/Foundry/Voice Live flows, consent, and automation patterns. |
| Deployment | L169-L180 | Deploying and scaling Azure AI Speech: Docker/Kubernetes containers, on-prem STT/TTS, custom speech models/endpoints, language ID, and batch/long-form synthesis workflows. |

Troubleshooting

| Topic | URL |
| --- | --- |
| Troubleshoot common Azure text to speech issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/faq-tts |
| Retrieve Speech to text session and transcription IDs for support | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-get-speech-session-id |
| Resolve common Azure Speech in Foundry issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/known-issues |
| Resolve Azure AI Speech SDK CRL compatibility issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-sdk-1-48-2 |
| Troubleshoot Speech service container deployments | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-faq |
| Troubleshoot common Azure Speech SDK issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/troubleshooting |
| Resolve common Voice Live API issues in Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-faq |

Best Practices

| Topic | URL |
| --- | --- |
| Prepare and locate audio data for batch transcription | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription-audio-data |
| Create high-quality human-labeled speech transcriptions | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-human-labeled-transcriptions |
| Prepare training data for professional custom voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-voice-training-data |
| Apply best practices to reduce Speech synthesis latency | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-lower-speech-synthesis-latency |
| Track and manage Azure Speech SDK memory usage | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-track-speech-sdk-memory-usage |
| Handle user interruptions and chat truncation in Voice Live | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-auto-truncation |
| Use interim responses in Voice Live to reduce latency gaps | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-interim-response |
| Improve speech recognition with phrase lists | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/improve-accuracy-phrase-list |
| Apply keyword recognition design and accuracy guidelines | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/keyword-recognition-guidelines |
| Record high-quality samples for custom voice training | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/record-custom-voice-samples |
| Back up and recover custom Speech and Voice resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/resiliency-and-recovery-plan |
| Design microphone arrays optimized for Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-sdk-microphone |
| Prepare high-quality video samples for custom avatars | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/custom-avatar-record-video-samples |

Decision Making

| Topic | URL |
| --- | --- |
| Plan large-scale transcription with batch processing | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription |
| Evaluate custom voice lite before professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/custom-neural-voice-lite |
| Choose Embedded Speech for offline and hybrid scenarios | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/embedded-speech |
| Evaluate device suitability for embedded speech models | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/embedded-speech-performance-evaluations |
| Evaluate and compare custom speech model accuracy | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-inspect-data |
| Train custom speech models and understand cost behavior | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-train-model |
| Migrate Speech to text REST API from v3.2 to 2024-11-15 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-2024-11-15 |
| Migrate Speech to text REST API from 2024-11-15 to 2025-10-15 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-2025-10-15 |
| Migrate from retired Speech intent recognition to Language or OpenAI | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-intent-recognition |
| Migrate from Long Audio API to Batch synthesis | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-batch-synthesis |
| Migrate from v3 text-to-speech to custom voice REST API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-custom-voice-api |
| Migrate Speech-to-text REST from v3.0 to v3.1 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-v3-0-to-v3-1 |
| Migrate Speech-to-text REST from v3.1 to v3.2 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-v3-1-to-v3-2 |
| Assess capabilities and regions for personal voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-overview |
| Decide when to use Whisper for speech tasks | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/whisper-overview |

Architecture & Design Patterns

| Topic | URL |
| --- | --- |
| Design call center voice agents with Voice Live and ACS | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-telephony |

Limits & Quotas

| Topic | URL |
| --- | --- |
| Manage custom speech model and endpoint lifecycle | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-model-and-endpoint-lifecycle |
| Deploy professional voice models to custom endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-deploy-endpoint |
| Train professional voice models and understand duration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-train-voice |
| Use Speech-to-text REST API for short audio | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text-short |
| Apply Azure Speech quotas, limits, and throttling guidance | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-quotas-and-limits |

Security

| Topic | URL |
| --- | --- |
| Configure BYOS storage for Azure Speech resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/bring-your-own-storage-speech-resource |
| Configure Microsoft Entra authentication for Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-configure-azure-ad-auth |
| Manage voice talent consent for professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-create-consent |
| Assign Azure RBAC roles for Speech resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/role-based-access-control |
| Use Azure Speech service in sovereign clouds | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/sovereign-clouds |
| Manage Speech service data-at-rest encryption and keys | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-encryption-of-data-at-rest |
| Secure Speech service with Virtual Network service endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-service-vnet-service-endpoint |
| Secure Azure AI Speech with Private Link endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-private-link |

Configuration

| Topic | URL |
| --- | --- |
| Configure Batch synthesis properties for text-to-speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-synthesis-properties |
| Check status and retrieve batch transcription results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription-get |
| Configure BYOS storage for Speech to text | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/bring-your-own-storage-speech-resource-speech-to-text |
| Define UPS phonetic pronunciations for Speech to text | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/customize-pronunciation |
| Configure OpenSSL on Linux for Azure Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-configure-openssl-linux |
| Control and monitor Speech SDK service connections | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-control-connections |
| Create and manage custom speech fine-tuning projects | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-create-project |
| Prepare and upload datasets for custom speech training | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-upload-data |
| Configure post-processing options for Speech recognition | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-post-processing |
| Configure real-time speech recognition inputs and options | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-recognize-speech |
| Select and configure audio input devices in Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-select-audio-input-devices |
| Use visemes for facial animation with Speech service | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-speech-synthesis-viseme |
| Configure Speech SDK audio input streams | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-audio-input-streams |
| Configure compressed audio input for Speech SDK and CLI | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-codec-compressed-audio-input-streams |
| Enable and configure Speech SDK diagnostic logging | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-logging |
| Check Azure Speech language and voice availability | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support |
| Configure audio and transcription logging for Speech recognition | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/logging-audio-transcription |
| Upload and validate training datasets for professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-create-training-set |
| Use correct regional endpoints for Azure Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions |
| Configure Speech containers storage, logging, and security | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-configuration |
| Control speech output using SSML configuration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup |
| Configure pronunciation with SSML phonemes and lexicons | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-pronunciation |
| Structure SSML documents and events for Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-structure |
| Configure voice and sound using SSML in Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-voice |
| Configure Speech CLI datastore search order and files | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/spx-data-store-configuration |
| Configure output destinations for Speech CLI results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/spx-output-options |
| Configure batch synthesis properties for TTS avatars | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/batch-synthesis-avatar-properties |
| Reference Voice Live API events, models, and settings (2025-10-01) | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-api-reference-2025-10-01 |
| Reference Voice Live API events and settings (2026-01-01-preview) | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-api-reference-2026-01-01-preview |
| Customize Voice Live input and output models | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-how-to-customize |
| Configure language and locale for Voice Live API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-language-support |

Integrations & Coding Patterns

| Topic | URL |
| --- | --- |
| Integrate Speech service with call center telephony | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/call-center-telephony-integration |
| Call Azure Speech fast transcription API in Foundry | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/fast-transcription-create |
| Use Speech SDK APIs to handle recognition results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-speech-recognition-results |
| Integrate custom models with Voice Live BYOM | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-bring-your-own-model |
| Implement text-to-speech synthesis with Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-speech-synthesis |
| Implement speech translation with Azure Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-translate-speech |
| Build real-time voice agents with Voice Live and Foundry Agent Service | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-agent-integration |
| Add proactive greetings to Voice Live agents | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-proactive-messages |
| Call the LLM-speech API for transcription and translation | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/llm-speech |
| Integrate Azure Speech with Azure OpenAI chat | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/openai-speech |
| Add and manage user consent for personal voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-create-consent |
| Create personal voice projects via Custom Voice API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-create-project |
| Use Power Automate connector for Speech batch transcription | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/power-automate-batch-transcription |
| Use Speech to text REST API endpoints and parameters | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text |
| Call Text-to-speech REST API for voice synthesis | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-text-to-speech |
| Use SSML phonetic alphabets with Azure Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-ssml-phonetic-sets |
| Generate Speech service REST clients from Swagger | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/swagger-documentation |
| Control text to speech avatar gestures with SSML | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/avatar-gestures-with-ssml |
| Implement real-time text-to-speech avatar streaming | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/real-time-synthesis-avatar |
| Use Voice Live WebSocket events and properties | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-how-to |

Deployment

| Topic | URL |
| --- | --- |
| Use Batch synthesis API for long-form text-to-speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-synthesis |
| Deploy custom speech models and endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-deploy-model |
| Scale Speech containers with batch processing kit | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-batch-processing |
| Run custom speech to text containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-cstt |
| Deploy and run Speech containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-howto |
| Run Speech containers on Kubernetes with Helm | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-howto-on-premises |
| Deploy language identification containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-lid |
| Deploy neural text to speech containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-ntts |
| Deploy speech to text containers for on-premises use | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-stt |

Capabilities

skill · source-microsoftdocs · skill-azure-speech · topic-agent · topic-agent-skills · topic-agentic-skills · topic-agentskill · topic-ai-agents · topic-ai-coding · topic-azure · topic-azure-functions · topic-azure-kubernetes-service · topic-azure-openai · topic-azure-sql-database · topic-azure-storage

Install

Install: npx skills add MicrosoftDocs/Agent-Skills
Transport: skills-sh
Protocol: skill

Quality

0.70 / 1.00

Deterministic score 0.70 from registry signals: indexed on github topic:agent-skills · 497 github stars · SKILL.md body (20,617 chars)

Provenance

Indexed from: github
Enriched: 2026-04-22 00:53:37Z · deterministic:skill-github:v1 · v1
First seen: 2026-04-18
Last seen: 2026-04-22

Agent access