Providers
OpenAI
OpenAI text-to-speech models (gpt-4o-mini-tts, tts-1, tts-1-hd).
| Property | Value |
|---|---|
| Prefix | openai |
| Default model | gpt-4o-mini-tts |
| Env var | OPENAI_API_KEY |
| Official docs | platform.openai.com/docs/guides/text-to-speech |
Models
| Model | Streaming | Audio Tags | Voice Cloning | Notes |
|---|---|---|---|---|
| gpt-4o-mini-tts | Yes | Yes (via instructions) | No | Steerable; tags become instructions |
| tts-1 | Yes | No | No | Low-latency, fixed voices |
| tts-1-hd | Yes | No | No | Higher quality, fixed voices |
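The table above can be mirrored in code when an application needs to branch on model capabilities. The following is a minimal sketch; `ModelCapabilities`, `OPENAI_TTS_CAPABILITIES`, and `supportsAudioTags` are hypothetical names for illustration and are not part of the SpeechSDK API.

```typescript
// Capability flags copied from the Models table.
// All names in this block are illustrative, not SpeechSDK exports.
interface ModelCapabilities {
  streaming: boolean;
  audioTags: boolean;
  voiceCloning: boolean;
}

const OPENAI_TTS_CAPABILITIES = new Map<string, ModelCapabilities>([
  ["gpt-4o-mini-tts", { streaming: true, audioTags: true, voiceCloning: false }],
  ["tts-1", { streaming: true, audioTags: false, voiceCloning: false }],
  ["tts-1-hd", { streaming: true, audioTags: false, voiceCloning: false }],
]);

// Returns true only for models the table lists as accepting audio tags.
// Unknown model names are treated as unsupported.
function supportsAudioTags(model: string): boolean {
  return OPENAI_TTS_CAPABILITIES.get(model)?.audioTags ?? false;
}
```

A caller could use `supportsAudioTags(model)` to decide whether to include tags in the text before sending a request.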
Usage
import { generateSpeech } from "@speech-sdk/core"
const result = await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello from SpeechSDK!",
  voice: "alloy",
})
Built-in voices: alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, verse.
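The model string in the usage example combines the provider prefix and the model name, and the voice must be one of the built-in names. A small sketch of how an application might check both before calling the SDK is shown below. `parseModelId` and `isBuiltInVoice` are hypothetical helpers, and how SpeechSDK resolves the string internally is not documented here.

```typescript
// Built-in voices as listed above.
const OPENAI_VOICES = [
  "alloy", "ash", "ballad", "coral", "echo", "fable",
  "onyx", "nova", "sage", "shimmer", "verse",
] as const;

type OpenAIVoice = (typeof OPENAI_VOICES)[number];

// Hypothetical helper: type guard for the built-in voice names.
function isBuiltInVoice(voice: string): voice is OpenAIVoice {
  return (OPENAI_VOICES as readonly string[]).includes(voice);
}

// Hypothetical helper: split "prefix/model" at the first slash.
// When there is no slash, `model` is left undefined so the caller can
// decide what to do (for example, apply the provider's default model).
function parseModelId(id: string): { provider: string; model?: string } {
  const slash = id.indexOf("/");
  if (slash === -1) {
    return { provider: id };
  }
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

For example, `parseModelId("openai/gpt-4o-mini-tts")` yields the provider `openai` and the model `gpt-4o-mini-tts`.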
Audio Tags
Because gpt-4o-mini-tts is steerable, SpeechSDK maps standardized audio tags in your text to the OpenAI instructions field:
await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "[cheerful] Welcome back!",
  voice: "alloy",
})
tts-1 and tts-1-hd do not accept instructions; tags are stripped and a warning is returned.
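To make the tag behavior concrete, here is a rough sketch of what such a mapping could look like. It is not SpeechSDK's actual implementation: `prepareInput`, the bracket-matching regex, and the wording of the generated instruction and warning are all assumptions made for illustration.

```typescript
// Assumption: only gpt-4o-mini-tts accepts instructions, per the Models table.
const STEERABLE_MODELS = new Set(["gpt-4o-mini-tts"]);

interface PreparedInput {
  text: string;           // input with [tags] removed
  instructions?: string;  // set only for steerable models
  warnings: string[];     // non-empty when tags had to be dropped
}

// Hypothetical helper: pull [tag] markers out of the text and either turn
// them into an instruction string or drop them with a warning.
function prepareInput(model: string, input: string): PreparedInput {
  const tags: string[] = [];
  const text = input
    .replace(/\[([^\[\]]+)\]/g, (_match: string, tag: string) => {
      tags.push(tag.trim());
      return "";
    })
    .replace(/\s+/g, " ") // collapse the whitespace left behind by removed tags
    .trim();

  if (tags.length === 0) {
    return { text, warnings: [] };
  }
  if (STEERABLE_MODELS.has(model)) {
    // Illustrative wording; the real instruction text may differ.
    return { text, instructions: `Speak in a ${tags.join(", ")} tone.`, warnings: [] };
  }
  return {
    text,
    warnings: [`${model} does not accept instructions; dropped tags: ${tags.join(", ")}`],
  };
}
```

With this sketch, the same tagged input produces an `instructions` value for gpt-4o-mini-tts and a single warning for tts-1 or tts-1-hd, while the spoken text is identical in both cases.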
Provider Options
await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  providerOptions: {
    speed: 1.2,
    response_format: "opus", // mp3 | opus | aac | flac | wav | pcm
    instructions: "Speak with a warm, friendly tone.",
  },
})
Custom Configuration
import { generateSpeech } from "@speech-sdk/core"
import { createOpenAI } from "@speech-sdk/core/providers"
const openai = createOpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://my-proxy.com/v1",
})
const result = await generateSpeech({
  model: openai("gpt-4o-mini-tts"),
  text: "Hello!",
  voice: "alloy",
})
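For readers who want to see what a custom baseURL and the provider options amount to on the wire, the sketch below builds a request for OpenAI's `POST /audio/speech` endpoint by hand. It assumes the SDK appends `/audio/speech` to the configured base URL and forwards provider options unchanged; neither is confirmed by this page. `buildSpeechRequest` and its types are hypothetical names.

```typescript
interface OpenAIConfig {
  apiKey: string;
  baseURL?: string; // defaults to the public OpenAI API
}

interface SpeechParams {
  model: string;
  text: string;
  voice: string;
  providerOptions?: {
    speed?: number;
    response_format?: "mp3" | "opus" | "aac" | "flac" | "wav" | "pcm";
    instructions?: string;
  };
}

interface SpeechRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: assemble the URL, headers, and JSON body.
function buildSpeechRequest(config: OpenAIConfig, params: SpeechParams): SpeechRequest {
  const speed = params.providerOptions?.speed;
  // The OpenAI API documents speed as a value from 0.25 to 4.0.
  if (speed !== undefined && (speed < 0.25 || speed > 4.0)) {
    throw new RangeError(`speed must be between 0.25 and 4.0, got ${speed}`);
  }
  // Strip trailing slashes so the joined URL has exactly one separator.
  const base = (config.baseURL ?? "https://api.openai.com/v1").replace(/\/+$/, "");
  return {
    url: `${base}/audio/speech`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${config.apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: params.model,
        input: params.text, // the endpoint names the text field "input"
        voice: params.voice,
        ...params.providerOptions,
      }),
    },
  };
}
```

The result can be passed to `fetch(request.url, request.init)`. With `baseURL: "https://my-proxy.com/v1"`, the request goes to `https://my-proxy.com/v1/audio/speech` instead of the public endpoint.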