Skill quality 0.70

azure-speech

Expert knowledge for Azure AI Speech development including troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. Use when building STT/TTS, custom voices/avatars, batch TTS, Voice Live, or conta

Price
free
Protocol
skill
Verified
no

What it does

Azure AI Speech Skill

This skill provides expert guidance for Azure AI Speech, covering troubleshooting, best practices, decision making, limits & quotas, security, configuration, integrations & coding patterns, and deployment. It combines local quick-reference content with remote documentation fetching.

How to Use This Skill

IMPORTANT for Agent: Use the Category Index below to locate relevant sections. For categories with line ranges (e.g., L35-L120), use read_file with the specified lines. For categories with file links (e.g., [security.md](security.md)), use read_file on the linked reference file.
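As a rough illustration of the line-range lookup described above, the sketch below parses a range such as L36-L45 and returns those lines from a local file. The helper names `parse_line_range` and `read_line_range` are hypothetical, and the code assumes ranges are 1-indexed and inclusive; it stands in for whatever `read_file` tool the agent actually has.

```python
import re
from pathlib import Path
from typing import List, Tuple

# Matches range specs in the "L<start>-L<end>" form used by the Category Index.
_RANGE_RE = re.compile(r"^L(\d+)-L(\d+)$")


def parse_line_range(spec: str) -> Tuple[int, int]:
    """Turn a spec like 'L36-L45' into (36, 45). Hypothetical helper."""
    match = _RANGE_RE.match(spec.strip())
    if match is None:
        raise ValueError(f"not a line range: {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start < 1 or end < start:
        raise ValueError(f"invalid line range: {spec!r}")
    return start, end


def read_line_range(path: str, spec: str) -> List[str]:
    """Return the lines named by spec, assuming 1-indexed inclusive ranges."""
    start, end = parse_line_range(spec)
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return lines[start - 1:end]
```

For example, `read_line_range("SKILL.md", "L36-L45")` would return the Troubleshooting rows, provided the local copy still matches the line numbers in the index.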

IMPORTANT for Agent: If metadata.generated_at is more than 3 months old, suggest the user pull the latest version from the repository. If mcp_microsoftdocs tools are not available, suggest the user install it: Installation Guide
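A minimal sketch of the freshness check, under two assumptions the listing does not confirm: that `metadata.generated_at` starts with an ISO 8601 date (`YYYY-MM-DD`), and that "3 months" can be approximated as 90 days. The function name `is_stale` is hypothetical.

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "more than 3 months old" is approximated as more than 90 days.
STALE_AFTER = timedelta(days=90)


def is_stale(generated_at: str, today: Optional[date] = None) -> bool:
    """Return True if generated_at is older than the staleness threshold.

    Assumes generated_at begins with an ISO 8601 date (YYYY-MM-DD);
    any time-of-day suffix is ignored.
    """
    generated = date.fromisoformat(generated_at[:10])
    current = today if today is not None else date.today()
    return current - generated > STALE_AFTER
```

When `is_stale(...)` returns True, the agent would suggest pulling the latest version from the repository.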

This skill requires network access to fetch documentation content:

  • Preferred: Use mcp_microsoftdocs:microsoft_docs_fetch with query string from=learn-agent-skill. Returns Markdown.
  • Fallback: Use fetch_webpage with query string from=learn-agent-skill&accept=text/markdown. Returns Markdown.
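The two options above differ only in the query string appended to the documentation URL. The sketch below shows one way to build that URL with the standard library. The function name `with_skill_params` is hypothetical, and the code only constructs the string; it does not call either tool.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_skill_params(url: str, fallback: bool = False) -> str:
    """Append the skill's query parameters to a documentation URL.

    fallback=False adds from=learn-agent-skill (preferred tool).
    fallback=True also adds accept=text/markdown (fetch_webpage).
    Existing query parameters on the URL are preserved.
    """
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["from"] = "learn-agent-skill"
    if fallback:
        query["accept"] = "text/markdown"
    # safe="/" keeps the slash in text/markdown unescaped, as written above.
    return urlunsplit(parts._replace(query=urlencode(query, safe="/")))
```

Calling `with_skill_params(url, fallback=True)` on any URL from the tables below yields the same address with `?from=learn-agent-skill&accept=text/markdown` appended.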

Category Index

Category | Lines | Description
Troubleshooting | L36-L45 | Diagnosing and fixing common Azure Speech issues (SDK, text-to-speech, Foundry, containers, CRL), plus how to capture session/transcription IDs for support.
Best Practices | L46-L62 | Best practices for audio/video prep, custom voice/avatars, latency and memory tuning, phrase/keyword optimization, and handling real-time Voice Live interactions and interruptions.
Decision Making | L63-L81 | Guides for choosing speech features, planning large-scale/batch use, evaluating models/devices, checking availability, and migrating between Speech API versions and services.
Limits & Quotas | L82-L90 | Quotas, limits, and usage patterns for Azure Speech: batch TTS, custom/pro voice training & deployment, and short audio STT, plus throttling and capacity planning guidance.
Security | L91-L102 | Configuring security for Azure AI Speech: auth (Entra, RBAC), network isolation (VNet, Private Link, sovereign clouds), BYOS storage, encryption/keys, and voice talent consent management.
Configuration | L103-L132 | Configuring Azure AI Speech behavior: SDK/CLI settings, audio I/O, logging, storage, SSML, pronunciation, batch jobs, custom speech/voice, avatars, and Voice Live API options.
Integrations & Coding Patterns | L133-L157 | Patterns and code for integrating Azure Speech with apps and agents: SDK/REST usage, TTS/translation/avatars, call center and Voice Live, OpenAI/Foundry, consent, and automation.
Deployment | L158-L169 | Deploying and scaling Azure AI Speech: Docker/Kubernetes containers, on-prem STT/TTS, custom speech models/endpoints, language ID, and batch/long-form synthesis workflows.

Troubleshooting

Topic | URL
Resolve common Azure text-to-speech service issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/faq-tts
Retrieve Speech to text session and transcription IDs for support | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-get-speech-session-id
Resolve common Azure Speech in Foundry issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/known-issues
Resolve Azure AI Speech SDK CRL compatibility issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-sdk-1-48-2
Troubleshoot Speech service container deployments | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-faq
Troubleshoot common Azure Speech SDK issues | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/troubleshooting

Best Practices

Topic | URL
Prepare and locate audio data for batch transcription | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription-audio-data
Create high-quality human-labeled speech transcriptions | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-human-labeled-transcriptions
Prepare training data for professional custom voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-voice-training-data
Apply best practices to reduce Speech synthesis latency | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-lower-speech-synthesis-latency
Track and manage Azure Speech SDK memory usage | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-track-speech-sdk-memory-usage
Handle user interruptions and chat truncation in Voice Live | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-auto-truncation
Use interim responses in Voice Live to reduce latency gaps | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-interim-response
Improve speech recognition with phrase lists | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/improve-accuracy-phrase-list
Apply keyword recognition design and accuracy guidelines | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/keyword-recognition-guidelines
Record high-quality samples for custom voice training | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/record-custom-voice-samples
Back up and recover custom Speech and Voice resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/resiliency-and-recovery-plan
Design microphone arrays optimized for Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-sdk-microphone
Prepare high-quality video samples for custom avatars | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/custom-avatar-record-video-samples

Decision Making

Topic | URL
Plan large-scale transcription with batch processing | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription
Evaluate custom voice lite before professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/custom-neural-voice-lite
Choose Embedded Speech for offline and hybrid scenarios | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/embedded-speech
Evaluate device suitability for embedded speech models | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/embedded-speech-performance-evaluations
Evaluate and compare custom speech model accuracy | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-inspect-data
Check Azure Speech language and voice availability | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/language-support
Migrate Speech to text REST API from v3.2 to 2024-11-15 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-2024-11-15
Migrate Speech to text REST API from 2024-11-15 to 2025-10-15 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-2025-10-15
Migrate from retired Speech intent recognition to Language or OpenAI | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-intent-recognition
Migrate from Long Audio API to Batch synthesis | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-batch-synthesis
Migrate from v3 text-to-speech to custom voice REST API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-to-custom-voice-api
Migrate Speech-to-text REST from v3.0 to v3.1 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-v3-0-to-v3-1
Migrate Speech-to-text REST from v3.1 to v3.2 | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/migrate-v3-1-to-v3-2
Assess capabilities and regions for personal voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-overview
Decide when to use Whisper for speech tasks | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/whisper-overview

Limits & Quotas

Topic | URL
Manage custom speech model and endpoint lifecycle | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-model-and-endpoint-lifecycle
Deploy professional voice models to custom endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-deploy-endpoint
Train professional voice models and understand duration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-train-voice
Use Speech-to-text REST API for short audio | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text-short
Apply Azure Speech quotas, limits, and throttling guidance | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-quotas-and-limits

Security

Topic | URL
Configure BYOS storage for Azure Speech resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/bring-your-own-storage-speech-resource
Configure Microsoft Entra authentication for Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-configure-azure-ad-auth
Manage voice talent consent for professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-create-consent
Assign Azure RBAC roles for Speech resources | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/role-based-access-control
Use Azure Speech service in sovereign clouds | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/sovereign-clouds
Manage Speech service data-at-rest encryption and keys | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-encryption-of-data-at-rest
Secure Speech service with Virtual Network service endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-service-vnet-service-endpoint
Secure Azure AI Speech with Private Link endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-services-private-link

Configuration

Topic | URL
Configure Batch synthesis properties for text-to-speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-synthesis-properties
Check status and retrieve batch transcription results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription-get
Configure BYOS storage for Speech to text | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/bring-your-own-storage-speech-resource-speech-to-text
Define UPS phonetic pronunciations for Speech to text | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/customize-pronunciation
Configure OpenSSL on Linux for Azure Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-configure-openssl-linux
Control and monitor Speech SDK service connections | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-control-connections
Create and manage custom speech fine-tuning projects | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-create-project
Prepare and upload datasets for custom speech training | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-upload-data
Select and configure audio input devices in Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-select-audio-input-devices
Use visemes for facial animation with Speech service | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-speech-synthesis-viseme
Configure Speech SDK audio input streams | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-audio-input-streams
Configure compressed audio input for Speech SDK and CLI | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-codec-compressed-audio-input-streams
Enable and configure Speech SDK diagnostic logging | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-use-logging
Configure audio and transcription logging for Speech recognition | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/logging-audio-transcription
Upload and validate training datasets for professional voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-create-training-set
Use correct regional endpoints for Azure Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/regions
Configure Speech containers storage, logging, and security | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-configuration
Control speech output using SSML configuration | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup
Configure pronunciation with SSML phonemes and lexicons | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-pronunciation
Structure SSML documents and events for Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-structure
Configure Speech CLI datastore search order and files | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/spx-data-store-configuration
Configure output destinations for Speech CLI results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/spx-output-options
Configure batch synthesis properties for TTS avatars | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/batch-synthesis-avatar-properties
Reference Voice Live API events, models, and settings (2025-10-01) | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-api-reference-2025-10-01
Reference Voice Live API events and settings (2026-01-01-preview) | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-api-reference-2026-01-01-preview
Configure language and locale for Voice Live API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-language-support

Integrations & Coding Patterns

Topic | URL
Integrate Speech service with call center telephony | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/call-center-telephony-integration
Call Azure Speech fast transcription API in Foundry | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/fast-transcription-create
Use Speech SDK APIs to handle recognition results | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/get-speech-recognition-results
Integrate custom models with Voice Live BYOM | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-bring-your-own-model
Implement text-to-speech synthesis with Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-speech-synthesis
Implement speech translation with Azure Speech SDK | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-translate-speech
Connect MCP servers to Azure Voice Live sessions | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-mcp-server
Add proactive greetings to Voice Live agents | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-voice-live-proactive-messages
Integrate with Azure LLM Speech transcription and translation API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/llm-speech
Integrate Azure Speech with Azure OpenAI chat | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/openai-speech
Add and manage user consent for personal voice | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-create-consent
Create personal voice projects via Custom Voice API | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-create-project
Use Power Automate connector for Speech batch transcription | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/power-automate-batch-transcription
Use Speech to text REST API endpoints and parameters | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-speech-to-text
Call Text-to-speech REST API for voice synthesis | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/rest-text-to-speech
Use SSML phonetic alphabets with Azure Speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-ssml-phonetic-sets
Use SSML to customize Azure Speech voices | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-synthesis-markup-voice
Generate Speech service REST clients from Swagger | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/swagger-documentation
Control text to speech avatar gestures with SSML | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/avatar-gestures-with-ssml
Implement real-time text-to-speech avatar streaming | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/real-time-synthesis-avatar
Use Voice Live WebSocket API for real-time agents | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/voice-live-how-to

Deployment

Topic | URL
Use Batch synthesis API for long-form text-to-speech | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-synthesis
Deploy custom speech models and endpoints | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/how-to-custom-speech-deploy-model
Scale Speech containers with batch processing kit | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-batch-processing
Run custom speech to text containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-cstt
Deploy and run Speech containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-howto
Run Speech containers on Kubernetes with Helm | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-howto-on-premises
Deploy language identification containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-lid
Deploy neural text to speech containers with Docker | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-ntts
Deploy speech to text containers for on-premises use | https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-container-stt

Capabilities

skill · source-microsoftdocs · skill-azure-speech · topic-agent · topic-agent-skills · topic-agentic-skills · topic-agentskill · topic-ai-agents · topic-ai-coding · topic-azure · topic-azure-functions · topic-azure-kubernetes-service · topic-azure-openai · topic-azure-sql-database · topic-azure-storage

Install

Install
npx skills add MicrosoftDocs/Agent-Skills
Transport
skills-sh
Protocol
skill

Quality

0.70 / 1.00

Deterministic score 0.70 from registry signals: indexed on github topic:agent-skills · 549 github stars · SKILL.md body (19,317 chars)

Provenance

Indexed from
github
Enriched
2026-05-18 18:53:59Z · deterministic:skill-github:v1 · v1
First seen
2026-04-18
Last seen
2026-05-18

Agent access