Configuration

Custom API keys, base URLs, and fetch implementations.

By default, SpeechSDK reads API keys from environment variables. Use provider factory functions when you need custom API keys, base URLs, or fetch implementations.

Factory Functions

Each provider exports a create* factory function from its subpath:

import { generateSpeech } from "@speech-sdk/core"
import { createOpenAI } from "@speech-sdk/core/openai"

const myOpenAI = createOpenAI({
  apiKey: "sk-...",
  baseURL: "https://my-proxy.com/v1",
})

const result = await generateSpeech({
  model: myOpenAI("gpt-4o-mini-tts"),
  text: "Hello!",
  voice: "alloy",
})

The factory returns a function that accepts an optional model ID. Call it without arguments to use the provider's default model:

const openai = createOpenAI({ apiKey: "sk-..." })

// Uses default model (gpt-4o-mini-tts)
generateSpeech({ model: openai(), text: "...", voice: "alloy" })

// Specify a model
generateSpeech({ model: openai("tts-1-hd"), text: "...", voice: "alloy" })

Available Factories

Import                            Function
@speech-sdk/core/openai           createOpenAI()
@speech-sdk/core/elevenlabs       createElevenLabs()
@speech-sdk/core/deepgram         createDeepgram()
@speech-sdk/core/cartesia         createCartesia()
@speech-sdk/core/hume             createHume()
@speech-sdk/core/google           createGoogle()
@speech-sdk/core/fish-audio       createFishAudio()
@speech-sdk/core/unreal-speech    createUnrealSpeech()
@speech-sdk/core/murf             createMurf()
@speech-sdk/core/resemble         createResemble()
@speech-sdk/core/fal-ai           createFal()
@speech-sdk/core/mistral          createMistral()

Configuration Options

All factory functions accept the same base options:

interface ProviderConfig {
  apiKey?: string // Override the env var
  baseURL?: string // Custom API endpoint (proxies, self-hosted)
  fetch?: typeof globalThis.fetch // Custom fetch implementation
}

Custom Fetch

Pass a custom fetch for logging, instrumentation, or environments without a global fetch:

import { createOpenAI } from "@speech-sdk/core/openai"

const openai = createOpenAI({
  fetch: async (url, init) => {
    console.log(`Requesting: ${url}`)
    return globalThis.fetch(url, init)
  },
})
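
The inline logger above can also be factored into a reusable wrapper. The following is a minimal sketch, not part of SpeechSDK: `withLogging` and its `log` parameter are hypothetical names, and it assumes a runtime with a standard global `fetch` and `Response` (for example Node 18+ or a browser).

```typescript
type FetchLike = typeof globalThis.fetch

// Hypothetical helper (not part of SpeechSDK): wraps any fetch implementation
// and reports the URL, outcome, and elapsed time of each request.
function withLogging(
  inner: FetchLike,
  log: (line: string) => void = console.log,
): FetchLike {
  return async (input, init) => {
    const started = Date.now()
    // input may be a string, a URL, or a Request; normalize it for logging
    const url =
      typeof input === "string" ? input : input instanceof URL ? input.href : input.url
    try {
      const response = await inner(input, init)
      log(`${url} -> ${response.status} (${Date.now() - started} ms)`)
      return response
    } catch (error) {
      log(`${url} failed after ${Date.now() - started} ms`)
      throw error // rethrow so the caller still sees the failure
    }
  }
}
```

Under these assumptions, the wrapper would be passed to a factory as `createOpenAI({ fetch: withLogging(globalThis.fetch) })`. Keeping the logger as a parameter makes it easy to swap `console.log` for a metrics or tracing call.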

Request Options

Every generateSpeech call accepts these options:

generateSpeech({
  model: string | ResolvedModel,    // required
  text: string,                     // required
  voice: Voice,                     // required
  providerOptions?: object,         // provider-specific API params
  maxRetries?: number,              // default: 2
  abortSignal?: AbortSignal,        // cancel the request
  headers?: Record<string, string>, // additional HTTP headers
});

Abort Signal

Cancel an in-flight request:

const controller = new AbortController()

const promise = generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  abortSignal: controller.signal,
})

// Cancel after 5 seconds
setTimeout(() => controller.abort(), 5000)
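
In the snippet above the timer keeps running even if the request finishes first. One way to tidy that up is a small helper that owns both the controller and the timer. This is a sketch only: `withTimeout` is a hypothetical name, not a SpeechSDK export.

```typescript
// Hypothetical helper (not part of SpeechSDK): runs a task with an AbortSignal
// that fires after timeoutMs, and clears the timer once the task settles.
async function withTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), timeoutMs)
  try {
    return await task(controller.signal)
  } finally {
    clearTimeout(timer) // do not leave a pending timer behind
  }
}
```

It could be used as `await withTimeout((abortSignal) => generateSpeech({ model: "openai/gpt-4o-mini-tts", text: "Hello!", voice: "alloy", abortSignal }), 5000)`. The exact error that `generateSpeech` rejects with on abort is not specified here, so a caller would need to catch it generically.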

Custom Headers

Pass additional HTTP headers to the provider's API:

const result = await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  headers: {
    "X-Custom-Header": "value",
  },
})

Retries

SpeechSDK retries requests that fail with a 5xx response or a network error, using exponential backoff. It does not retry 4xx errors. The default is 2 retries; set maxRetries to change it:

const result = await generateSpeech({
  model: "openai/gpt-4o-mini-tts",
  text: "Hello!",
  voice: "alloy",
  maxRetries: 5,
})
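
To make the retry policy concrete, here is a rough sketch of the behavior described in this section. It is not SpeechSDK's implementation: the function names, the 500 ms base delay, and the doubling factor are all assumptions chosen for illustration.

```typescript
// Illustrative only: the 500 ms base delay and the doubling factor are
// assumptions, not values taken from SpeechSDK.
function backoffDelayMs(attempt: number, baseMs = 500): number {
  return baseMs * 2 ** attempt
}

// 5xx responses are retryable; 4xx responses are not.
function isRetryableStatus(status: number): boolean {
  return status >= 500 && status <= 599
}

// Hypothetical retry loop: one initial attempt plus up to maxRetries retries.
async function fetchWithRetries(
  doFetch: () => Promise<Response>,
  maxRetries = 2,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<Response> {
  for (let attempt = 0; ; attempt++) {
    try {
      const response = await doFetch()
      // Successes and 4xx responses go straight back to the caller;
      // a 5xx is also returned once the retry budget is used up.
      if (!isRetryableStatus(response.status) || attempt >= maxRetries) {
        return response
      }
    } catch (error) {
      // Network errors are retried until the budget is exhausted
      if (attempt >= maxRetries) throw error
    }
    await sleep(backoffDelayMs(attempt))
  }
}
```

With `maxRetries: 2` this sketch makes at most three attempts in total, while a 400 response is returned after the first attempt.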
