Mistral (Voxtral TTS)
Mistral Voxtral is an open-source text-to-speech model with voice cloning.
| Property | Value |
|---|---|
| Prefix | mistral |
| Default model | voxtral-mini-tts-2603 |
| Env var | MISTRAL_API_KEY |
| Official docs | docs.mistral.ai/capabilities/audio/text_to_speech/speech |
Models
| Model | Streaming | Voice Cloning | Open Source | Notes |
|---|---|---|---|---|
| voxtral-mini-tts-2603 | No | Yes | Yes | Voxtral TTS |
Usage
Mistral Voxtral is voice-cloning first, with no built-in voice IDs, so every request must pass a reference audio clip:
```ts
import { generateSpeech } from "@speech-sdk/core"

const result = await generateSpeech({
  model: "mistral/voxtral-mini-tts-2603",
  text: "Hello from SpeechSDK!",
  voice: { audio: "base64-encoded-audio..." },
})
```

You can also pass a Uint8Array:
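```ts
import { readFileSync } from "fs"
import { generateSpeech } from "@speech-sdk/core"

await generateSpeech({
  model: "mistral/voxtral-mini-tts-2603",
  text: "Hello!",
  // readFileSync returns a Buffer, which is a Uint8Array subclass
  voice: { audio: readFileSync("./reference.wav") },
})
```

See Voice Cloning for the full Voice type.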
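When the reference clip has to travel as a string (for example inside a JSON payload), it can be base64-encoded before being passed as `voice.audio`. The following is a minimal sketch using only Node.js built-ins; `toBase64` and `loadReferenceClip` are hypothetical helper names, not part of SpeechSDK, and it assumes the string form expects plain base64 with no data-URL prefix, as the placeholder in the example above suggests.

```typescript
import { readFileSync } from "node:fs"

// Hypothetical helper (not part of SpeechSDK): base64-encode raw audio bytes.
function toBase64(bytes: Uint8Array): string {
  return Buffer.from(bytes).toString("base64")
}

// Hypothetical helper: read a reference clip from disk and return it as a
// plain base64 string (assumed to be the string form that voice.audio accepts).
function loadReferenceClip(path: string): string {
  return toBase64(readFileSync(path))
}
```

The result could then be passed as `voice: { audio: loadReferenceClip("./reference.wav") }`.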
Provider Options
```ts
import { generateSpeech } from "@speech-sdk/core"

await generateSpeech({
  model: "mistral/voxtral-mini-tts-2603",
  text: "Hello!",
  voice: { audio: "..." },
  providerOptions: {
    response_format: "mp3", // mp3 | opus | wav
  },
})
```

Custom Configuration
```ts
import { generateSpeech } from "@speech-sdk/core"
import { createMistral } from "@speech-sdk/core/providers"

const mistral = createMistral({
  apiKey: process.env.MISTRAL_API_KEY,
})

const result = await generateSpeech({
  model: mistral(),
  text: "Hello!",
  voice: { audio: "base64-encoded-audio..." },
})
```
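In the configuration above, `process.env.MISTRAL_API_KEY` is `undefined` when the variable is not set, and this page does not say how the SDK reacts to that. One option is to fail early with a clear message before creating the provider. The sketch below assumes a hypothetical helper named `requireEnv`, which is not part of SpeechSDK.

```typescript
// Hypothetical helper (not part of SpeechSDK): return a required environment
// variable, or throw a descriptive error if it is missing or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env,
): string {
  const value = env[name]
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`)
  }
  return value
}
```

It could be used as `createMistral({ apiKey: requireEnv("MISTRAL_API_KEY") })`, so a missing key surfaces at startup rather than on the first request.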