
Mistral (Voxtral TTS)

Mistral Voxtral: open-source text-to-speech with voice cloning.

Prefix: mistral
Default model: voxtral-mini-tts-2603
Env var: MISTRAL_API_KEY
Official docs: docs.mistral.ai/capabilities/audio/text_to_speech/speech

Models

Model | Streaming | Voice Cloning | Open Source | Notes
voxtral-mini-tts-2603 | No | Yes | Yes | Voxtral TTS

Usage

Mistral Voxtral is voice-cloning first: there are no built-in voice IDs, so every request must pass a reference audio clip. The clip can be given as a base64-encoded string:

import { generateSpeech } from "@speech-sdk/core"

const result = await generateSpeech({
  model: "mistral/voxtral-mini-tts-2603",
  text: "Hello from SpeechSDK!",
  voice: { audio: "base64-encoded-audio..." },
})
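
If the reference clip is only available as raw bytes, it has to be base64-encoded before it can be passed as a string. The following is a minimal sketch of that step using Node's Buffer; toBase64Audio is a hypothetical helper, not part of SpeechSDK, and the four sample bytes stand in for a real clip.

```typescript
import { Buffer } from "node:buffer"

// Hypothetical helper (not part of SpeechSDK): encode raw audio bytes as a
// base64 string suitable for a string-valued `voice.audio` field.
function toBase64Audio(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString("base64")
}

// Stand-in data: the first four bytes of a WAV file ("RIFF").
// In practice these bytes would come from the reference clip itself.
const sample = new Uint8Array([0x52, 0x49, 0x46, 0x46])
const encoded = toBase64Audio(sample)

console.log(encoded) // UklGRg==
```

The resulting string would then go where the usage example shows "base64-encoded-audio...".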

You can also pass a Uint8Array:

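import { readFileSync } from "fs"
import { generateSpeech } from "@speech-sdk/core"

// readFileSync returns a Buffer, which is a Uint8Array subclass,
// so the file contents can be passed directly without base64 encoding.
await generateSpeech({
  model: "mistral/voxtral-mini-tts-2603",
  text: "Hello!",
  voice: { audio: readFileSync("./reference.wav") },
})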

See Voice Cloning for the full Voice type.

Provider Options

import { generateSpeech } from "@speech-sdk/core"

await generateSpeech({
  model: "mistral/voxtral-mini-tts-2603",
  text: "Hello!",
  voice: { audio: "..." },
  providerOptions: {
    // Output audio format: mp3 | opus | wav
    response_format: "mp3",
  },
})
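
When the output format comes from user input or configuration, it may be worth validating it against the three values listed above before building providerOptions. This is a sketch only: the ResponseFormat type and parseResponseFormat helper are hypothetical names introduced here, not SpeechSDK exports, and the list of formats is taken from the comment in the example above.

```typescript
// Formats taken from the provider options example; names below are hypothetical.
const RESPONSE_FORMATS = ["mp3", "opus", "wav"] as const
type ResponseFormat = (typeof RESPONSE_FORMATS)[number]

// Normalize and validate a user-supplied format, throwing on anything else.
function parseResponseFormat(input: string): ResponseFormat {
  const normalized = input.trim().toLowerCase()
  if ((RESPONSE_FORMATS as readonly string[]).includes(normalized)) {
    return normalized as ResponseFormat
  }
  throw new Error(
    `Unsupported response_format "${input}"; expected one of: ${RESPONSE_FORMATS.join(", ")}`,
  )
}

// The validated value can be placed straight into providerOptions.
const providerOptions = { response_format: parseResponseFormat("WAV") }
console.log(providerOptions.response_format) // wav
```

Failing early this way keeps an unsupported format from reaching the provider as part of a request.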

Custom Configuration

import { generateSpeech } from "@speech-sdk/core"
import { createMistral } from "@speech-sdk/core/providers"

const mistral = createMistral({
  apiKey: process.env.MISTRAL_API_KEY,
})

const result = await generateSpeech({
  model: mistral(),
  text: "Hello!",
  voice: { audio: "base64-encoded-audio..." },
})
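
Because process.env.MISTRAL_API_KEY is undefined when the variable is not set, one option is to check for it before constructing the provider. The sketch below assumes a Node environment; requireEnv is a hypothetical helper, not part of SpeechSDK, and how createMistral itself treats a missing key is not covered here.

```typescript
// Hypothetical helper: return a required environment variable or throw.
// The `env` parameter defaults to process.env and exists to ease testing.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name]?.trim()
  if (!value) {
    throw new Error(`Missing required environment variable ${name}`)
  }
  return value
}

// Intended use with the configuration above (left as a comment so this
// block runs without the SDK installed or the variable set):
//   const mistral = createMistral({ apiKey: requireEnv("MISTRAL_API_KEY") })
```

This turns a missing key into an immediate, descriptive error at startup.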
