Configuration
Custom API keys, base URLs, and fetch implementations.
By default, SpeechSDK reads API keys from environment variables. Use provider factory functions when you need custom API keys, base URLs, or fetch implementations.
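The precedence between an explicit key and the environment can be pictured with a small sketch. This is not SpeechSDK's actual implementation: `resolveApiKey`, the `Env` type, and the `OPENAI_API_KEY` variable name are assumptions made for illustration, since this page does not list the variable each provider reads.

```typescript
// Hypothetical sketch: an explicit apiKey wins, otherwise fall back to an env var.
type Env = Record<string, string | undefined>

// Subset of the provider options, redeclared here so the sketch is self-contained.
interface KeyConfig {
  apiKey?: string
}

function resolveApiKey(config: KeyConfig, envVar: string, env: Env): string {
  const key = config.apiKey ?? env[envVar]
  if (!key) {
    throw new Error(`Missing API key: pass apiKey or set ${envVar}`)
  }
  return key
}

// Example environment; in Node.js you would pass process.env instead.
const exampleEnv: Env = { OPENAI_API_KEY: "sk-from-env" }

// No explicit key: the environment value is used.
const fromEnv = resolveApiKey({}, "OPENAI_API_KEY", exampleEnv)

// Explicit key: it overrides the environment value.
const fromConfig = resolveApiKey({ apiKey: "sk-explicit" }, "OPENAI_API_KEY", exampleEnv)
```

Passing the environment in as a parameter keeps the sketch free of Node-specific globals and makes the fallback easy to test.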
Factory Functions
Each provider exports a create* factory function from its subpath:
import { generateSpeech } from "@speech-sdk/core"
import { createOpenAI } from "@speech-sdk/core/openai"
const myOpenAI = createOpenAI({
  apiKey: "sk-...",
  baseURL: "https://my-proxy.com/v1",
})
const result = await generateSpeech({
  model: myOpenAI("gpt-4o-mini-tts"),
  text: "Hello!",
  voice: "alloy",
})
The factory returns a function that accepts an optional model ID. Call it without arguments to use the provider's default model:
const openai = createOpenAI({ apiKey: "sk-..." })
// Uses default model (gpt-4o-mini-tts)
generateSpeech({ model: openai(), text: "...", voice: "alloy" })
// Specify a model
generateSpeech({ model: openai("tts-1-hd"), text: "...", voice: "alloy" })
Available Factories
| Import | Function |
|---|---|
| @speech-sdk/core/openai | createOpenAI() |
| @speech-sdk/core/elevenlabs | createElevenLabs() |
| @speech-sdk/core/deepgram | createDeepgram() |
| @speech-sdk/core/cartesia | createCartesia() |
| @speech-sdk/core/hume | createHume() |
| @speech-sdk/core/google | createGoogle() |
| @speech-sdk/core/fish-audio | createFishAudio() |
| @speech-sdk/core/unreal-speech | createUnrealSpeech() |
| @speech-sdk/core/murf | createMurf() |
| @speech-sdk/core/resemble | createResemble() |
| @speech-sdk/core/fal-ai | createFal() |
| @speech-sdk/core/mistral | createMistral() |
Configuration Options
All factory functions accept the same base options:
interface ProviderConfig {
  apiKey?: string // Override the env var
  baseURL?: string // Custom API endpoint (proxies, self-hosted)
  fetch?: typeof globalThis.fetch // Custom fetch implementation
}
Custom Fetch
Pass a custom fetch for logging, instrumentation, or environments without a global fetch:
import { createOpenAI } from "@speech-sdk/core/openai"
const openai = createOpenAI({
  fetch: async (url, init) => {
    console.log(`Requesting: ${url}`)
    return globalThis.fetch(url, init)
  },
})
Request Options
Every generateSpeech call accepts these options:
generateSpeech({
  model: string | ResolvedModel, // required
  text: string, // required
  voice: Voice, // required
  providerOptions?: object, // provider-specific API params
  maxRetries?: number, // default: 2
  abortSignal?: AbortSignal, // cancel the request
  headers?: Record<string, string>, // additional HTTP headers
});
Abort Signal
Cancel an in-flight request:
const controller = new AbortController()
const promise = generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  abortSignal: controller.signal,
})
// Cancel after 5 seconds
setTimeout(() => controller.abort(), 5000)
Custom Headers
Pass additional HTTP headers to the provider's API:
const result = await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  headers: {
    "X-Custom-Header": "value",
  },
})
Retries
SpeechSDK retries requests that fail with a 5xx response or a network error, using exponential backoff. It does not retry 4xx errors. The default is 2 retries; set maxRetries to change it:
const result = await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  maxRetries: 5,
})
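The retry policy described in this section can be sketched as follows. This illustrates the general technique rather than SpeechSDK's internals: `withRetries`, `isRetryable`, `backoffDelayMs`, and the 500 ms base delay are all assumptions. The page states only that 5xx and network errors are retried with exponential backoff and that 4xx errors are not.

```typescript
// An undefined status models a network error (no HTTP response was received).
function isRetryable(status: number | undefined): boolean {
  return status === undefined || (status >= 500 && status <= 599)
}

// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ...
// The 500 ms base is an assumed value, not one documented by the SDK.
function backoffDelayMs(attempt: number, baseMs = 500): number {
  return baseMs * 2 ** attempt
}

// Runs `request` once, then up to `maxRetries` more times on retryable failures.
async function withRetries<T>(
  request: () => Promise<T>,
  maxRetries = 2,
  baseMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await request()
    } catch (error) {
      // Errors without a numeric `status` are treated as network failures here.
      const status = (error as { status?: number } | null)?.status
      if (attempt >= maxRetries || !isRetryable(status)) throw error
      const delay = backoffDelayMs(attempt, baseMs)
      await new Promise((resolve) => setTimeout(resolve, delay))
    }
  }
}
```

With `maxRetries` set to 2 and this assumed base delay, a request that keeps failing with a 5xx response is attempted three times in total, waiting 500 ms and then 1000 ms between attempts.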