All Providers

SpeechSDK supports 14 providers out of the box. Select a provider and model with a `provider/model` string, or pass only the provider prefix to use that provider's default model.

Browse the full list of models on the Models page, or jump into a provider page below.

Provider Table

Provider	Prefix	Default Model	Env Var
OpenAI	`openai`	`gpt-4o-mini-tts`	`OPENAI_API_KEY`
ElevenLabs	`elevenlabs`	`eleven_multilingual_v2`	`ELEVENLABS_API_KEY`
Deepgram	`deepgram`	`aura-2`	`DEEPGRAM_API_KEY`
Cartesia	`cartesia`	`sonic-3`	`CARTESIA_API_KEY`
Hume	`hume`	`octave-2`	`HUME_API_KEY`
Google	`google`	`gemini-2.5-flash-preview-tts`	`GOOGLE_API_KEY`
Fish Audio	`fish-audio`	`s2-pro`	`FISH_AUDIO_API_KEY`
Inworld	`inworld`	`inworld-tts-1.5-max`	`INWORLD_API_KEY`
Murf	`murf`	`GEN2`	`MURF_API_KEY`
Resemble	`resemble`	`default`	`RESEMBLE_API_KEY`
Smallest AI	`smallest-ai`	`lightning_v3.1`	`SMALLEST_API_KEY`
fal	`fal-ai`	(user-specified)	`FAL_API_KEY`
Mistral	`mistral`	`voxtral-mini-tts-2603`	`MISTRAL_API_KEY`
xAI	`xai`	`grok-tts`	`XAI_API_KEY`
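
To make the prefix and default-model rules concrete, here is a minimal sketch of how a model string could be split into its parts. `parseModelString` and `DEFAULT_MODELS` are hypothetical names for illustration, not SpeechSDK exports, and splitting on the first slash only is an assumption; the defaults are copied from the table above.

```typescript
// Default models copied from the provider table. fal has no default model,
// so it maps to null and always needs an explicit model.
const DEFAULT_MODELS = new Map<string, string | null>([
  ["openai", "gpt-4o-mini-tts"],
  ["elevenlabs", "eleven_multilingual_v2"],
  ["deepgram", "aura-2"],
  ["cartesia", "sonic-3"],
  ["hume", "octave-2"],
  ["google", "gemini-2.5-flash-preview-tts"],
  ["fish-audio", "s2-pro"],
  ["inworld", "inworld-tts-1.5-max"],
  ["murf", "GEN2"],
  ["resemble", "default"],
  ["smallest-ai", "lightning_v3.1"],
  ["fal-ai", null],
  ["mistral", "voxtral-mini-tts-2603"],
  ["xai", "grok-tts"],
]);

interface ParsedModel {
  provider: string;
  model: string;
}

// Hypothetical helper (not part of SpeechSDK): splits "provider/model" on the
// first slash and falls back to the table default when no model is given.
function parseModelString(input: string): ParsedModel {
  const slash = input.indexOf("/");
  const provider = slash === -1 ? input : input.slice(0, slash);
  const explicit = slash === -1 ? "" : input.slice(slash + 1);

  if (!DEFAULT_MODELS.has(provider)) {
    throw new Error(`Unknown provider prefix: ${provider}`);
  }

  const model = explicit || DEFAULT_MODELS.get(provider);
  if (!model) {
    throw new Error(`Provider "${provider}" has no default model; pass "${provider}/<model>"`);
  }
  return { provider, model };
}
```

Under these assumptions, `parseModelString("elevenlabs")` resolves to the `eleven_multilingual_v2` default, while `parseModelString("fal-ai")` throws because the table lists no default for fal.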

Capability Matrix

Provider	Streaming	Audio Tags	Voice Cloning	Timestamps	Open Source
OpenAI	Yes	Yes (as instructions)	No	Timestamp fallback	No
ElevenLabs	Yes	Yes (`eleven_v3`)	No	Native	No
Deepgram	Yes	No	No	Timestamp fallback	No
Cartesia	Yes	Yes (`sonic-3`)	Yes (`sonic-3`)	Native	No
Hume	Yes	No	Yes (`octave-2`)	Native (`octave-2`)	No
Google	Yes	No	No	Timestamp fallback	No
Fish Audio	Yes	Yes	Yes	Timestamp fallback	Yes
Inworld	Yes	No	No	Native	No
Murf	No	No	No	Native (`GEN2`)	No
Resemble	Yes	No	Yes	Native	Yes
Smallest AI	No	No	No	Timestamp fallback	No
fal	No	No	Yes (select models)	Timestamp fallback	Varies
Mistral	No	No	Yes	Timestamp fallback	Yes
xAI	Yes	Yes (`grok-tts`)	No	Timestamp fallback	No

Support varies by model, so check each provider page for per-model details. "Timestamp fallback" means that `timestamps: true` still works, but the timestamps come from fallback timestamp recovery (OpenAI Whisper by default in the standalone SDK) rather than from the provider itself; see the timestamps guide for details.
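
If you want to pick a provider by feature in code, the matrix can be transcribed into a small lookup. This is only a sketch: `CAPABILITIES` and `findProviders` are hypothetical names, not SpeechSDK exports, and the data is flattened to provider level, so per-model caveats such as "select models" or "`eleven_v3`" are lost.

```typescript
type TimestampSupport = "native" | "fallback";

interface ProviderCapabilities {
  prefix: string;
  streaming: boolean;
  audioTags: boolean;
  voiceCloning: boolean;
  timestamps: TimestampSupport;
}

// Provider-level transcription of the capability matrix above.
const CAPABILITIES: ProviderCapabilities[] = [
  { prefix: "openai", streaming: true, audioTags: true, voiceCloning: false, timestamps: "fallback" },
  { prefix: "elevenlabs", streaming: true, audioTags: true, voiceCloning: false, timestamps: "native" },
  { prefix: "deepgram", streaming: true, audioTags: false, voiceCloning: false, timestamps: "fallback" },
  { prefix: "cartesia", streaming: true, audioTags: true, voiceCloning: true, timestamps: "native" },
  { prefix: "hume", streaming: true, audioTags: false, voiceCloning: true, timestamps: "native" },
  { prefix: "google", streaming: true, audioTags: false, voiceCloning: false, timestamps: "fallback" },
  { prefix: "fish-audio", streaming: true, audioTags: true, voiceCloning: true, timestamps: "fallback" },
  { prefix: "inworld", streaming: true, audioTags: false, voiceCloning: false, timestamps: "native" },
  { prefix: "murf", streaming: false, audioTags: false, voiceCloning: false, timestamps: "native" },
  { prefix: "resemble", streaming: true, audioTags: false, voiceCloning: true, timestamps: "native" },
  { prefix: "smallest-ai", streaming: false, audioTags: false, voiceCloning: false, timestamps: "fallback" },
  { prefix: "fal-ai", streaming: false, audioTags: false, voiceCloning: true, timestamps: "fallback" },
  { prefix: "mistral", streaming: false, audioTags: false, voiceCloning: true, timestamps: "fallback" },
  { prefix: "xai", streaming: true, audioTags: true, voiceCloning: false, timestamps: "fallback" },
];

type Requirements = Partial<Omit<ProviderCapabilities, "prefix">>;

// Hypothetical helper: returns the prefixes of providers that match every
// requirement that was specified; unspecified fields are ignored.
function findProviders(required: Requirements): string[] {
  return CAPABILITIES.filter(
    (p) =>
      (required.streaming === undefined || p.streaming === required.streaming) &&
      (required.audioTags === undefined || p.audioTags === required.audioTags) &&
      (required.voiceCloning === undefined || p.voiceCloning === required.voiceCloning) &&
      (required.timestamps === undefined || p.timestamps === required.timestamps),
  ).map((p) => p.prefix);
}
```

For example, `findProviders({ streaming: true, voiceCloning: true })` narrows the list to the providers marked "Yes" in both columns; the provider pages remain the source of truth for which specific models qualify.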

Usage

import { generateSpeech } from "@speech-sdk/core"

// provider/model string
await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello from SpeechSDK!",
  voice: "alloy",
})

// just the provider name uses the default model
await generateSpeech({
  model: "elevenlabs",
  text: "Hello from SpeechSDK!",
  voice: "EXAVITQu4vr4xnSDxMaL",
})

Provider Options

Each provider accepts provider-specific parameters via `providerOptions`. They are sent directly to the provider's API, so use the API's own field names (for example, OpenAI's `response_format` rather than a camelCase variant).

const result = await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  providerOptions: {
    speed: 1.2,
    response_format: "opus",
  },
})
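
To illustrate what "sent directly" means, the sketch below shows how the call above might map onto an OpenAI-style request body. `toOpenAIBody` is a hypothetical function, not the SDK's actual implementation; the merge order and the handling of name collisions are assumptions. The field names `model`, `input`, `voice`, `speed`, and `response_format` are those of OpenAI's speech endpoint.

```typescript
interface SpeechRequest {
  model: string; // assumed here to be in "provider/model" form
  text: string;
  voice: string;
  providerOptions?: Record<string, unknown>;
}

// Hypothetical illustration: provider options are copied through untouched,
// then the SDK-level fields are mapped onto the provider's own field names.
// In this sketch the SDK-level fields win on a name collision (an assumption).
function toOpenAIBody(req: SpeechRequest): Record<string, unknown> {
  const model = req.model.slice(req.model.indexOf("/") + 1);
  return {
    ...req.providerOptions,
    model,
    input: req.text,
    voice: req.voice,
  };
}

const body = toOpenAIBody({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  providerOptions: { speed: 1.2, response_format: "opus" },
});
```

Here `body` carries `speed` and `response_format` exactly as written in `providerOptions`, which is why the keys must match the provider's API rather than SpeechSDK's naming.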

API Key Resolution

When you pass the model as a string (e.g., "openai/gpt-4o-mini-tts"), the API key is read automatically from the provider's environment variable (see the Env Var column in the provider table above). You can override this with custom configuration.
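
The lookup described above can be sketched as follows. `resolveApiKey` and `ENV_VARS` are hypothetical names, not SpeechSDK exports, and treating an explicitly supplied key as the override is an assumption about what "custom configuration" might look like. The variable names are copied from the provider table.

```typescript
// Environment variable names copied from the provider table.
const ENV_VARS = new Map<string, string>([
  ["openai", "OPENAI_API_KEY"],
  ["elevenlabs", "ELEVENLABS_API_KEY"],
  ["deepgram", "DEEPGRAM_API_KEY"],
  ["cartesia", "CARTESIA_API_KEY"],
  ["hume", "HUME_API_KEY"],
  ["google", "GOOGLE_API_KEY"],
  ["fish-audio", "FISH_AUDIO_API_KEY"],
  ["inworld", "INWORLD_API_KEY"],
  ["murf", "MURF_API_KEY"],
  ["resemble", "RESEMBLE_API_KEY"],
  ["smallest-ai", "SMALLEST_API_KEY"],
  ["fal-ai", "FAL_API_KEY"],
  ["mistral", "MISTRAL_API_KEY"],
  ["xai", "XAI_API_KEY"],
]);

// Hypothetical helper: an explicit key (the assumed override) wins; otherwise
// the key is read from the provider's environment variable.
// In Node you would pass process.env as `env`.
function resolveApiKey(
  provider: string,
  env: Record<string, string | undefined>,
  explicitKey?: string,
): string {
  if (explicitKey) return explicitKey;

  const name = ENV_VARS.get(provider);
  if (!name) throw new Error(`Unknown provider prefix: ${provider}`);

  const key = env[name];
  if (!key) throw new Error(`Missing ${name} for provider "${provider}"`);
  return key;
}
```

Taking the environment as a parameter keeps the sketch easy to test; failing early with the variable's name in the error message makes a missing key quick to diagnose.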