Providers
All supported text-to-speech providers, their models, and configuration.
SpeechSDK supports 12 providers out of the box. Use provider/model strings to select a provider and model, or pass just the provider name to use its default model.
Browse the full list of providers and models on the Models page.
Provider Table
| Provider | Prefix | Default Model | Env Var |
|---|---|---|---|
| OpenAI | openai | gpt-4o-mini-tts | OPENAI_API_KEY |
| ElevenLabs | elevenlabs | eleven_multilingual_v2 | ELEVENLABS_API_KEY |
| Deepgram | deepgram | aura-2 | DEEPGRAM_API_KEY |
| Cartesia | cartesia | sonic-3 | CARTESIA_API_KEY |
| Hume | hume | octave-2 | HUME_API_KEY |
| Google | google | gemini-2.5-flash-preview-tts | GOOGLE_API_KEY |
| Fish Audio | fish-audio | s2-pro | FISH_AUDIO_API_KEY |
| Unreal Speech | unreal-speech | default | UNREAL_SPEECH_API_KEY |
| Murf | murf | GEN2 | MURF_API_KEY |
| Resemble | resemble | default | RESEMBLE_API_KEY |
| fal | fal-ai | (user-specified) | FAL_API_KEY |
| Mistral | mistral | voxtral-mini-tts-2603 | MISTRAL_API_KEY |
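The provider/model convention can be illustrated with a small standalone helper. This is only a sketch of the string format, not how SpeechSDK resolves models internally: `parseModelString` and `DEFAULT_MODELS` are hypothetical names, and the defaults are an excerpt copied from the table above.

```typescript
// Hypothetical helper: illustrates the "provider/model" convention only.
// It is not part of the SpeechSDK API.
const DEFAULT_MODELS = new Map<string, string>([
  ["openai", "gpt-4o-mini-tts"],
  ["elevenlabs", "eleven_multilingual_v2"],
  ["deepgram", "aura-2"],
  ["cartesia", "sonic-3"],
  ["hume", "octave-2"],
  ["google", "gemini-2.5-flash-preview-tts"],
  ["mistral", "voxtral-mini-tts-2603"],
])

interface ParsedModel {
  provider: string
  model: string
}

function parseModelString(input: string): ParsedModel {
  // Split on the first slash only, so model names that themselves
  // contain slashes stay intact.
  const slash = input.indexOf("/")
  const provider = slash === -1 ? input : input.slice(0, slash)
  const explicit = slash === -1 ? "" : input.slice(slash + 1)

  // Fall back to the provider's default when no model is given.
  const model = explicit || DEFAULT_MODELS.get(provider)
  if (!model) {
    throw new Error(`No model given and no known default for "${provider}"`)
  }
  return { provider, model }
}
```

Under these assumptions, `parseModelString("cartesia")` yields the provider `cartesia` with model `sonic-3`, while a bare `fal-ai` throws because the table lists its model as user-specified.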
Usage Examples
OpenAI
import { generateSpeech } from "@speech-sdk/core"

const result = await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello from SpeechSDK!",
  voice: "alloy",
})

OpenAI models: gpt-4o-mini-tts, tts-1, tts-1-hd
ElevenLabs
const result = await generateSpeech({
  model: "elevenlabs/eleven_multilingual_v2",
  text: "Hello from SpeechSDK!",
  voice: "EXAVITQu4vr4xnSDxMaL",
})

ElevenLabs models: eleven_v3, eleven_multilingual_v2, eleven_flash_v2_5, eleven_flash_v2
Deepgram
const result = await generateSpeech({
  model: "deepgram/aura-2",
  text: "Hello from SpeechSDK!",
  voice: "thalia-en",
})

Cartesia
const result = await generateSpeech({
  model: "cartesia/sonic-3",
  text: "Hello from SpeechSDK!",
  voice: "a0e99841-438c-4a64-b679-ae501e7d6091",
})

Cartesia models: sonic-3, sonic-2
Google (Gemini TTS)
const result = await generateSpeech({
  model: "google/gemini-2.5-flash-preview-tts",
  text: "Hello from SpeechSDK!",
  voice: "Kore",
})

Hume
const result = await generateSpeech({
  model: "hume/octave-2",
  text: "Hello from SpeechSDK!",
  voice: "Dacher",
})

Mistral
const result = await generateSpeech({
  model: "mistral/voxtral-mini-tts-2603",
  text: "Hello from SpeechSDK!",
  voice: { audio: "base64-encoded-audio..." },
})

Mistral uses voice cloning by default: pass a voice object containing base64-encoded reference audio.
Provider Options
Each provider accepts provider-specific parameters via providerOptions. These are sent directly to the provider's API using the API's own field names.
const result = await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  providerOptions: {
    speed: 1.2,
    response_format: "opus",
  },
})

API Key Resolution
When using string models (e.g., 'openai/tts-1'), API keys are resolved from environment variables automatically (see the table above). You can override this with custom configuration.
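Because keys are read from the environment, it can be useful to fail early when one is missing. The following is a minimal sketch of such a pre-flight check, not SDK functionality: `missingKey` and `ENV_VARS` are hypothetical names, the mapping is an excerpt of the provider table, and treating an empty value as missing is an assumption.

```typescript
// Hypothetical pre-flight check: not part of SpeechSDK.
// Maps provider prefixes to the env var names listed in the provider table.
const ENV_VARS = new Map<string, string>([
  ["openai", "OPENAI_API_KEY"],
  ["elevenlabs", "ELEVENLABS_API_KEY"],
  ["deepgram", "DEEPGRAM_API_KEY"],
  ["cartesia", "CARTESIA_API_KEY"],
  ["hume", "HUME_API_KEY"],
  ["google", "GOOGLE_API_KEY"],
  ["mistral", "MISTRAL_API_KEY"],
])

type Env = Record<string, string | undefined>

// Returns the name of the env var that is missing for the given model string,
// or null if the key is set (or the provider is not in the excerpt above).
function missingKey(model: string, env: Env): string | null {
  const slash = model.indexOf("/")
  const provider = slash === -1 ? model : model.slice(0, slash)

  const name = ENV_VARS.get(provider)
  if (name === undefined) return null

  // Assumption: an empty string counts as "not set".
  return env[name] ? null : name
}
```

In Node.js, this could be called as `missingKey("openai/tts-1", process.env)` before `generateSpeech`, reporting the returned variable name if it is not null.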