Get started with the FishAudio client in less than a minute:
    from fishaudio import FishAudio
    from fishaudio.utils import play, save

    # Initialize client (reads from FISH_API_KEY environment variable)
    client = FishAudio()

    # Generate and play audio
    audio = client.tts.convert(text="Hello, playing from Fish Audio!")
    play(audio)

    # Generate and save audio
    audio = client.tts.convert(text="Saving this audio to a file!")
    save(audio, "output.mp3")
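Because the client reads its key from the `FISH_API_KEY` environment variable, it can help to fail early with a clear message when the variable is missing. The following is a minimal sketch of such a check; `require_api_key` is a hypothetical helper, not part of the SDK, and how the SDK itself reports a missing key is not covered on this page.

```python
import os
from typing import Mapping


def require_api_key(env: Mapping[str, str] = os.environ) -> str:
    """Return the FISH_API_KEY value, or raise with a clear message if unset.

    Hypothetical helper for illustration; not part of the fishaudio SDK.
    """
    key = env.get("FISH_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "FISH_API_KEY is not set; export it before creating the client."
        )
    return key


# Demo with an explicit mapping so the example runs without a real key.
demo_key = require_api_key({"FISH_API_KEY": "example-key"})
```

In a real script you would call `require_api_key()` with no arguments before constructing `FishAudio()`.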
    from fishaudio import FishAudio
    from fishaudio.utils import play

    client = FishAudio()

    # With a specific voice
    audio = client.tts.convert(
        text="Custom voice",
        reference_id="bf322df2096a46f18c579d0baa36f41d"  # Adrian
    )
    play(audio)
    from fishaudio import FishAudio
    from fishaudio.utils import play

    client = FishAudio()

    # With speed control
    audio = client.tts.convert(
        text="I'm talking pretty fast, is this still too slow?",
        speed=1.5  # 1.5x speed
    )
    play(audio)
Create reusable configurations with TTSConfig. Prosody controls speech characteristics like speed and volume:
    from fishaudio import FishAudio
    from fishaudio.types import TTSConfig, Prosody
    from fishaudio.utils import play

    client = FishAudio()

    # Define config once
    my_config = TTSConfig(
        prosody=Prosody(speed=1.2, volume=-5),
        reference_id="933563129e564b19a115bedd57b7406a",  # Sarah
        format="wav",
        latency="balanced"
    )

    # Reuse across multiple generations
    audio1 = client.tts.convert(text="Welcome to our product demonstration.", config=my_config)
    audio2 = client.tts.convert(text="Let me show you the key features.", config=my_config)
    audio3 = client.tts.convert(text="Thank you for watching this tutorial.", config=my_config)

    play(audio1)
    play(audio2)
    play(audio3)
For chunk-by-chunk processing, use stream(), which returns an iterable AudioStream. For real-time streaming of dynamically generated text, see Real-time Streaming below.
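As a rough sketch of chunk-by-chunk processing, the helper below writes chunks to a file as they arrive instead of holding the full audio in memory. It assumes that iterating an AudioStream yields `bytes` chunks, which this page does not confirm. `write_chunks` is a hypothetical helper name, not part of the SDK, and the demo uses stand-in byte strings so it runs without an API key.

```python
import os
import tempfile
from typing import Iterable


def write_chunks(chunks: Iterable[bytes], path: str) -> int:
    """Write audio chunks to `path` as they arrive; return total bytes written."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            total += len(chunk)
    return total


# Stand-in chunks for the demo. With the SDK you would pass the iterable
# returned by stream() instead (assumed here to yield bytes).
fake_chunks = [b"abc", b"", b"defg"]
demo_path = os.path.join(tempfile.gettempdir(), "fishaudio_chunks_demo.bin")
written = write_chunks(fake_chunks, demo_path)
```

The same loop shape applies to other per-chunk work, such as forwarding each chunk to a network socket or an audio player.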
Stream dynamically generated text for conversational AI and live applications. This mode suits LLM streaming responses, live captions, and chatbot interactions:
    from fishaudio import FishAudio
    from fishaudio.utils import play

    client = FishAudio()

    # Stream dynamically generated text (e.g., from LLM)
    def text_chunks():
        yield "Hello, "
        yield "this is "
        yield "streaming text!"

    audio_stream = client.tts.stream_websocket(
        text_chunks(),
        latency="balanced"
    )
    play(audio_stream)
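LLM responses usually arrive as small token fragments rather than whole phrases. One option, sketched below, is to group fragments into sentence-sized chunks before handing the generator to `stream_websocket`. Whether this is needed is an assumption, since the SDK may already buffer text internally; `sentence_chunks` is a hypothetical helper, not part of the SDK.

```python
from typing import Iterable, Iterator

SENTENCE_ENDINGS = (".", "!", "?")


def sentence_chunks(tokens: Iterable[str]) -> Iterator[str]:
    """Group small text fragments into sentence-sized chunks."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Flush once the buffered text ends a sentence (ignoring trailing spaces).
        if buffer.rstrip().endswith(SENTENCE_ENDINGS):
            yield buffer
            buffer = ""
    # Flush whatever is left when the token stream ends.
    if buffer.strip():
        yield buffer


# Stand-in for fragments produced by an LLM streaming response.
llm_tokens = ["Hel", "lo the", "re. ", "How are", " you?", " Fine"]
chunks = list(sentence_chunks(llm_tokens))
```

The resulting generator can be passed in place of `text_chunks()` in the example above, for instance `client.tts.stream_websocket(sentence_chunks(llm_tokens), latency="balanced")`.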
The SDK includes helpful utilities (requires utils extra):
    from fishaudio.utils import save, play, stream

    # Save audio to file
    save(audio, "output.mp3")

    # Play audio (automatically detects environment)
    play(audio)  # Works in Jupyter, regular Python, etc.

    # Stream audio in real-time (requires mpv)
    stream(audio_iterator)