Fish Audio

Open-source Fish Audio S2 with audio tags and voice cloning.

Prefix: fish-audio
Default model: s2-pro
Env var: FISH_AUDIO_API_KEY
Official docs: docs.fish.audio

Models

| Model | Streaming | Audio Tags | Voice Cloning | Open Source | Notes |
| --- | --- | --- | --- | --- | --- |
| s2-pro | Yes | Yes | Yes | Yes | Default; multilingual |

Usage

import { generateSpeech } from "@speech-sdk/core"

const result = await generateSpeech({
  model: "fish-audio/s2-pro",
  text: "Hello from SpeechSDK!",
  voice: "reference-id-from-fish",
})

The voice string is sent to Fish Audio as reference_id.
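
To make that mapping concrete, here is a minimal sketch of how a voice string might end up in the request payload. The `SpeechRequest` and `FishAudioPayload` types and the `toFishAudioPayload` helper are hypothetical names used for illustration only; they are not exported by SpeechSDK, and the SDK's actual internals may differ.

```typescript
// Hypothetical shapes for illustration; not exported by SpeechSDK.
interface SpeechRequest {
  text: string
  voice: string
}

interface FishAudioPayload {
  text: string
  reference_id: string
}

// Copy the SDK-level `voice` string into the `reference_id` field.
function toFishAudioPayload(request: SpeechRequest): FishAudioPayload {
  return { text: request.text, reference_id: request.voice }
}

const payload = toFishAudioPayload({
  text: "Hello from SpeechSDK!",
  voice: "reference-id-from-fish",
})
```

In practice this means the value you pass as voice should be a reference ID that Fish Audio recognizes, not a display name.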

Voice Cloning

Fish Audio supports inline voice cloning. To get a reference_id, upload a reference audio clip in the Fish Audio console, or use a built-in reference voice. See Voice Cloning.

Audio Tags

s2-pro supports audio tags: bracketed tags in your text are passed through to the model unchanged.

await generateSpeech({
  model: "fish-audio/s2-pro",
  text: "[laugh] That's a great joke!",
  voice: "reference-id-from-fish",
})
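
If you build tagged text programmatically, a small helper can keep the bracket syntax consistent. The sketch below is an assumption-laden convenience, not part of SpeechSDK: `withAudioTag` is a hypothetical name, and it only formats the string. It does not check whether the model recognizes a given tag.

```typescript
// Hypothetical helper; not part of SpeechSDK.
// Prefixes `text` with a bracketed audio tag such as "[laugh]".
function withAudioTag(tag: string, text: string): string {
  const name = tag.trim()
  // Reject empty tags and stray brackets so the output stays well-formed.
  if (name.length === 0 || /[\[\]]/.test(name)) {
    throw new Error("Audio tag must be non-empty and contain no brackets")
  }
  return `[${name}] ${text}`
}

const tagged = withAudioTag("laugh", "That's a great joke!")
```

The resulting string can be passed as text to generateSpeech in the same way as the hand-written example above.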

Provider Options

await generateSpeech({
  model: "fish-audio/s2-pro",
  text: "Hello!",
  voice: "reference-id-from-fish",
  providerOptions: {
    format: "mp3",
    mp3_bitrate: 128,
    chunk_length: 200,
    normalize: true,
  },
})
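
When several calls share the same options, it can help to keep them in one typed object and override per call. The following is a sketch under stated assumptions: the `FishAudioOptions` interface and `fishOptions` helper are hypothetical, cover only the four keys shown above, and use loose types because the accepted values are defined by Fish Audio rather than by this page.

```typescript
// Hypothetical type covering only the four options shown above.
interface FishAudioOptions {
  format: string
  mp3_bitrate: number
  chunk_length: number
  normalize: boolean
}

// Example values copied from the snippet above; these are not
// claimed to be the provider's documented defaults.
const baseOptions: FishAudioOptions = {
  format: "mp3",
  mp3_bitrate: 128,
  chunk_length: 200,
  normalize: true,
}

// Merge per-call overrides onto the shared base without mutating it.
function fishOptions(overrides: Partial<FishAudioOptions> = {}): FishAudioOptions {
  return { ...baseOptions, ...overrides }
}

const withoutNormalization = fishOptions({ normalize: false })
```

The returned object can then be supplied as providerOptions. Check the official Fish Audio docs for the values each option accepts.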

Custom Configuration

import { generateSpeech } from "@speech-sdk/core"
import { createFishAudio } from "@speech-sdk/core/providers"

const fishAudio = createFishAudio({
  apiKey: process.env.FISH_AUDIO_API_KEY,
})

const result = await generateSpeech({
  model: fishAudio("s2-pro"),
  text: "Hello!",
  voice: "reference-id-from-fish",
})
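
Because process.env.FISH_AUDIO_API_KEY is undefined when the variable is unset, you may want to fail early with a clear message before constructing the provider. This is a minimal sketch; `requireApiKey` is a hypothetical helper, not part of SpeechSDK, and it takes the environment as a parameter so it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper; not part of SpeechSDK.
// Returns the Fish Audio API key or throws if it is missing or empty.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["FISH_AUDIO_API_KEY"]
  if (!key) {
    throw new Error("FISH_AUDIO_API_KEY is not set")
  }
  return key
}

// In an application you would pass process.env instead of a literal object.
const apiKey = requireApiKey({ FISH_AUDIO_API_KEY: "example-key" })
```

The returned string could then be passed as apiKey to createFishAudio.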
