Replace or regenerate sections of audio using Stability AI's audio inpainting via MPP micropayments.
What it does
The Audio Inpaint endpoint on Stability AI's Locus MPP gateway lets you selectively replace a time range within an existing audio file using a text prompt. You supply an input audio file, specify the masked region (start and end, in seconds), and describe what the replacement content should sound like. The endpoint uses Stability AI's Stable Audio models (stable-audio-2 or stable-audio-2.5) to regenerate only that segment while preserving the surrounding audio. Output formats include MP3 and WAV.
Payment is handled through MPP (Micropayment Protocol), with Tempo settlement in pathUSD. The listed price is approximately $0.23 per call (at 50 inference steps with stable-audio-2), though the exact amount may vary by model and step count.
The endpoint is a POST-only route: the probe returned 404 on HEAD and GET, which is expected because the OpenAPI spec defines only a POST method. Required parameters are `audio` (the input audio file) and `prompt` (a description of the replacement content). Optional parameters include `mask_start`, `mask_end`, `model`, `steps`, and `output_format`.
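The parameter list above can be checked on the client before any payment is attempted. The following is a minimal sketch, not the gateway's own validation: `build_inpaint_payload` is a hypothetical helper, the base64 encoding of `audio` mirrors the quick-start request on this page, and the lowercase `"wav"` value and the "positive integer" rule for `steps` are assumptions that the spec excerpt here does not confirm.

```python
import base64

# Model names come from the listing. The accepted output_format strings
# are an assumption based on the formats the listing mentions (MP3, WAV).
KNOWN_MODELS = {"stable-audio-2", "stable-audio-2.5"}
ASSUMED_FORMATS = {"mp3", "wav"}


def build_inpaint_payload(audio_bytes, prompt, mask_start=None, mask_end=None,
                          model=None, steps=None, output_format=None):
    """Hypothetical helper: validate inputs and return a JSON-ready dict."""
    # Required parameters: audio and prompt.
    if not audio_bytes:
        raise ValueError("audio is required")
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")

    # The mask is a time range in seconds, so start must precede end.
    if mask_start is not None and mask_start < 0:
        raise ValueError("mask_start must not be negative")
    if mask_start is not None and mask_end is not None and mask_start >= mask_end:
        raise ValueError("mask_start must be less than mask_end")

    if model is not None and model not in KNOWN_MODELS:
        raise ValueError(f"unknown model: {model}")
    if output_format is not None and output_format not in ASSUMED_FORMATS:
        raise ValueError(f"unexpected output_format: {output_format}")
    if steps is not None and steps <= 0:
        raise ValueError("steps must be a positive integer (assumed rule)")

    payload = {
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "prompt": prompt,
    }
    optional = {
        "mask_start": mask_start,
        "mask_end": mask_end,
        "model": model,
        "steps": steps,
        "output_format": output_format,
    }
    # Leave unset optional fields out so the server applies its own defaults.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Rejecting an inverted mask range or an unknown model locally avoids spending a paid call on a request that is likely to fail.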
This endpoint is part of a broader Stability AI service suite available through the Locus MPP gateway, which also includes image generation (Ultra, Core, SD3), image editing (inpaint, outpaint, erase, search-and-replace), upscaling, 3D model generation, and other audio endpoints (text-to-audio, audio-to-audio). API reference documentation is available at platform.stability.ai, and LLM-specific docs are at beta.paywithlocus.com/mpp/stability-ai.md.
Capabilities
Use cases
- Replacing a corrupted or unwanted section of a podcast or music track with AI-generated content
- Filling in gaps in audio recordings with contextually appropriate sound
- Regenerating specific time segments of sound effects or ambient audio to match a text description
- Editing audio for post-production by selectively replacing sections without re-recording
- Automated audio repair pipelines that detect and replace problematic segments
Fit
Best for
- Selectively replacing time-bounded segments within existing audio files
- AI agents that need programmatic audio editing with per-call micropayments
- Workflows requiring text-guided audio regeneration without full re-synthesis
- Developers integrating audio repair into automated media pipelines
Not for
- Full text-to-audio generation from scratch (use the text-to-audio endpoint instead)
- Real-time streaming audio processing
- Non-audio media editing (images, video)
Quick start
curl -X POST https://stability-ai.mpp.paywithlocus.com/stability-ai/audio-inpaint \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <MPP_TOKEN>" \
  -d '{
    "audio": "<base64-encoded-audio>",
    "prompt": "gentle piano melody",
    "mask_start": 10,
    "mask_end": 25,
    "model": "stable-audio-2.5",
    "output_format": "mp3"
  }'
Example
Request
{
  "audio": "<base64-encoded-audio-file>",
  "model": "stable-audio-2",
  "steps": 50,
  "prompt": "upbeat jazz saxophone solo",
  "mask_end": 45,
  "mask_start": 15,
  "output_format": "mp3"
}
Endpoint
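The example request could be issued from Python roughly as follows. This is a sketch using only the standard library. It assumes the JSON body and the `Authorization: Bearer` header shown in the quick start are what the gateway expects, which this page does not confirm with a live call; `make_request` and `send` are illustrative names, not part of any SDK.

```python
import json
import urllib.error
import urllib.request

# URL taken from the quick-start command on this page.
ENDPOINT = "https://stability-ai.mpp.paywithlocus.com/stability-ai/audio-inpaint"


def make_request(payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the inpaint endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        method="POST",  # the route is POST-only; HEAD and GET return 404
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # header form assumed from the quick start
        },
    )


def send(req: urllib.request.Request, timeout: float = 120.0):
    """Send the request and return (status, content_type, raw_body).

    HTTP error statuses, including a 402 payment challenge, are returned
    rather than raised so the caller can inspect them.
    """
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Content-Type", ""), resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Content-Type", ""), err.read()
```

Separating request construction from sending keeps the paid network call in one place and lets the request be inspected or logged beforehand.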
Quality
The OpenAPI spec provides a clear schema with parameter descriptions and pricing info. However, the probe did not capture a live 402 challenge (the endpoint is POST-only and was probed with HEAD/GET), no response schema or example response is documented, and crawled pages returned only 404 error JSON. Pricing is described textually rather than as a fixed base-unit amount.
Warnings
- Probe returned 404 on HEAD and GET because the endpoint is POST-only; liveness not confirmed via 402 challenge
- Price amount is null in x-payment-info; the $0.23 figure comes from a description field and may vary by model/steps
- No response schema documented, so output format and structure must be inferred
- No example response available in the spec or crawl data
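Given these gaps, a client may prefer to branch on the HTTP status and `Content-Type` rather than assume a response shape. The sketch below is one defensive approach under stated assumptions: that a 402 status signals the payment challenge, and that a successful call returns either raw audio bytes or a JSON document. Neither response form is documented here, and `classify_response` is a hypothetical helper.

```python
def classify_response(status: int, content_type: str) -> str:
    """Label a response so the caller can decide how to handle the body.

    The categories are assumptions; the spec documents no response schema.
    """
    # Drop parameters such as "; charset=utf-8" and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()

    if status == 402:
        return "payment_required"  # payment challenge to satisfy before retrying
    if status >= 400:
        return "error"
    if media_type.startswith("audio/"):
        return "audio_bytes"  # body can be written straight to a file
    if media_type == "application/json":
        return "json"  # inspect the keys before assuming any structure
    return "unknown"
```

For instance, `classify_response(200, "audio/mpeg")` yields `"audio_bytes"`, while an unrecognized content type falls through to `"unknown"` so it can be logged instead of misread.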
Citations
- Audio Inpaint endpoint accepts audio, prompt, mask_start, mask_end, model, steps, and output_format parameters (https://stability-ai.mpp.paywithlocus.com)
- Price is approximately $0.23 at 50 steps with stable-audio-2 (https://stability-ai.mpp.paywithlocus.com)
- Payment method is Tempo settlement with pathUSD currency (https://stability-ai.mpp.paywithlocus.com)
- API reference available at platform.stability.ai/docs/api-reference (https://stability-ai.mpp.paywithlocus.com)
- LLM-specific documentation at beta.paywithlocus.com/mpp/stability-ai.md (https://stability-ai.mpp.paywithlocus.com)