Overview

Transform any audio recording into text with Fish Audio’s speech recognition. Perfect for transcriptions, subtitles, and voice commands.

Getting Started

Web Interface

Transcribe audio instantly:
  1. Visit Fish Audio - Go to fish.audio and log in
  2. Navigate to Transcribe - Click on “Speech to Text” in your dashboard
  3. Upload Audio - Select your audio file (for example MP3, WAV, or M4A; see Supported Formats for the full list)
  4. Get Transcription - Click “Transcribe” and copy your text

Supported Formats

Audio Files

Accepted formats:
  • MP3 (recommended)
  • WAV
  • M4A
  • OGG
  • FLAC
  • AAC
File requirements:
  • Maximum size: 20MB
  • Maximum duration: 60 minutes
  • Minimum duration: 1 second
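
If you prepare files in bulk, it can save time to check them against these limits before uploading. The following is a minimal sketch, not part of Fish Audio itself: the function name `check_upload` is hypothetical, the duration must be supplied by the caller, and it assumes "20MB" means 20 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Limits copied from the requirements listed above.
ACCEPTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".ogg", ".flac", ".aac"}
MAX_SIZE_BYTES = 20 * 1024 * 1024  # assumption: "20MB" is read as 20 MiB
MAX_DURATION_SECONDS = 60 * 60
MIN_DURATION_SECONDS = 1


def check_upload(filename: str, size_bytes: int, duration_seconds: float) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in ACCEPTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append("file is larger than 20MB")
    if duration_seconds > MAX_DURATION_SECONDS:
        problems.append("audio is longer than 60 minutes")
    if duration_seconds < MIN_DURATION_SECONDS:
        problems.append("audio is shorter than 1 second")
    return problems


print(check_upload("meeting.mp3", 5_000_000, 1800))  # []
print(check_upload("clip.wma", 25 * 1024 * 1024, 0.5))  # three problems reported
```

The check only looks at the file extension, so a mislabeled file would still pass; treat it as a quick pre-flight test rather than a guarantee.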

Language Support

Automatic Detection

The system automatically detects the language spoken in your audio. No configuration needed!

Manual Selection

For better accuracy, specify the language.
Major languages:
  • English (en)
  • Chinese (zh)
  • Japanese (ja)
Additional languages will be supported soon.

Audio Quality Tips

For Best Results

Recording Environment:
  • Quiet room with minimal echo
  • No background music
  • Clear, consistent speaking voice
  • One speaker at a time
Audio Settings:
  • Sample rate: 16kHz or higher
  • Bit rate: 128kbps or higher
  • Channels: mono or stereo (mono preferred)
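
For uncompressed WAV files, you can check the sample rate and channel count yourself before uploading. This sketch uses only Python's standard `wave` module; the function name `inspect_wav` and the returned keys are illustrative choices. Bit rate is not checked because it mainly applies to compressed formats such as MP3, which `wave` cannot read.

```python
import wave


def inspect_wav(path: str) -> dict:
    """Read basic properties of an uncompressed WAV file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        frames = wav_file.getnframes()
    return {
        "sample_rate_hz": sample_rate,
        "channels": channels,
        "duration_seconds": frames / sample_rate,
        "meets_sample_rate": sample_rate >= 16_000,  # 16kHz or higher
        "is_mono": channels == 1,  # mono preferred
    }
```

Calling `inspect_wav("recording.wav")` on a 16kHz mono file would report `meets_sample_rate` and `is_mono` as `True`.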

Common Issues

Poor transcription quality?
  • Remove background noise
  • Increase microphone volume
  • Speak clearly and not too fast
  • Avoid multiple speakers talking over each other

Use Cases

Meeting Transcription

Convert recorded meetings into searchable text:
  1. Record your meeting (Zoom, Teams, etc.)
  2. Export the audio file
  3. Upload to Fish Audio
  4. Get formatted transcription with timestamps

Podcast Transcripts

Create written versions of your podcasts:
  • Generate show notes automatically
  • Create searchable content
  • Improve accessibility
  • Enable translations

Video Subtitles

Generate subtitles for your videos:
  1. Extract audio from video
  2. Transcribe with Fish Audio
  3. Get timestamped text
  4. Import into video editor
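
Step 1 can be done with any tool that exports audio. As one possible approach, the sketch below builds an `ffmpeg` command that drops the video stream and writes mono, 16kHz, 128kbps audio, matching the Audio Settings recommendations. It assumes `ffmpeg` is installed and on your PATH; the function names and file paths are hypothetical.

```python
import subprocess


def build_extract_command(video_path: str, audio_path: str) -> list[str]:
    """Build an ffmpeg command that extracts the audio track from a video."""
    return [
        "ffmpeg",
        "-i", video_path,  # input video
        "-vn",             # drop the video stream
        "-ac", "1",        # mono
        "-ar", "16000",    # 16kHz sample rate
        "-b:a", "128k",    # 128kbps audio bit rate
        audio_path,        # output format follows the file extension
    ]


def extract_audio(video_path: str, audio_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_extract_command(video_path, audio_path), check=True)
```

For example, `extract_audio("talk.mp4", "talk.mp3")` would produce an MP3 ready to upload.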

Voice Notes

Convert voice memos to text:
  • Dictate ideas quickly
  • Transcribe later for editing
  • Search through voice notes
  • Share as text documents

Advanced Features

Timestamps

Get precise timing for each spoken segment:
[00:00:00] Welcome to our podcast.
[00:00:03] Today we're discussing AI technology.
[00:00:07] Let's dive right in.
Perfect for:
  • Creating subtitles
  • Navigating long recordings
  • Synchronizing with video
  • Building searchable archives
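
If you copy the timestamped text rather than exporting a subtitle file, a short script can turn it into SRT cues. This is a sketch under stated assumptions: it expects lines in the `[HH:MM:SS] text` shape shown above, ends each cue where the next one starts, and gives the final cue an assumed length of 3 seconds. The function names are illustrative.

```python
import re

LINE_PATTERN = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*(.+)$")


def parse_segments(transcript: str) -> list[tuple[int, str]]:
    """Return (start_seconds, text) pairs for each '[HH:MM:SS] text' line."""
    segments = []
    for line in transcript.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match:
            hours, minutes, seconds, text = match.groups()
            start = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            segments.append((start, text))
    return segments


def srt_time(total_seconds: int) -> str:
    """Format whole seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},000"


def to_srt(transcript: str, last_duration: int = 3) -> str:
    """Convert timestamped lines into numbered SRT cues."""
    segments = parse_segments(transcript)
    blocks = []
    for index, (start, text) in enumerate(segments):
        # End each cue where the next begins; the final cue gets an assumed length.
        if index + 1 < len(segments):
            end = segments[index + 1][0]
        else:
            end = start + last_duration
        blocks.append(f"{index + 1}\n{srt_time(start)} --> {srt_time(end)}\n{text}")
    return "\n\n".join(blocks) + "\n"


sample = """[00:00:00] Welcome to our podcast.
[00:00:03] Today we're discussing AI technology.
[00:00:07] Let's dive right in."""
print(to_srt(sample))
```

Because the source timestamps have one-second resolution, the cues will only be accurate to the nearest second.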

Speaker Detection

Identify different speakers in conversations:
Speaker 1: "What do you think about the proposal?"
Speaker 2: "I think it has potential."
Speaker 1: "Let's discuss the details."
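
Speaker-labeled output like this is easy to post-process. The sketch below groups lines by speaker and swaps generic labels for real names. It assumes every turn sits on one line in the `Speaker N: "text"` shape shown above; the function names and the example names are hypothetical.

```python
import re
from collections import defaultdict

SPEAKER_PATTERN = re.compile(r'^(Speaker \d+):\s*"?(.*?)"?$')


def group_by_speaker(transcript: str) -> dict[str, list[str]]:
    """Collect each speaker's lines, in order, without the surrounding quotes."""
    turns = defaultdict(list)
    for line in transcript.splitlines():
        match = SPEAKER_PATTERN.match(line.strip())
        if match:
            speaker, text = match.groups()
            turns[speaker].append(text)
    return dict(turns)


def rename_speakers(transcript: str, names: dict[str, str]) -> str:
    """Replace labels such as 'Speaker 1' with the names given in `names`."""
    def swap(match: re.Match) -> str:
        label = match.group(1)
        return names.get(label, label) + ":"

    return re.sub(r"^(Speaker \d+):", swap, transcript, flags=re.MULTILINE)


conversation = '''Speaker 1: "What do you think about the proposal?"
Speaker 2: "I think it has potential."
Speaker 1: "Let's discuss the details."'''

print(group_by_speaker(conversation))
print(rename_speakers(conversation, {"Speaker 1": "Ana", "Speaker 2": "Ben"}))
```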

Punctuation & Formatting

Automatic formatting includes:
  • Sentence capitalization
  • Punctuation marks
  • Paragraph breaks
  • Number formatting

Tips for Different Content

Interviews

Best practices:
  • Use a good microphone for each speaker
  • Record in a quiet environment
  • Speak one at a time
  • Keep consistent volume levels

Lectures & Presentations

Optimize for:
  • Clear articulation of technical terms
  • Pause between topics
  • Repeat important points
  • Avoid reading too fast

Phone Calls

Considerations:
  • Phone audio is lower quality
  • Expect slightly lower accuracy
  • Speak clearly and slowly
  • Avoid speakerphone if possible

Accuracy Expectations

What Affects Accuracy

Positive factors:
  • Clear audio quality
  • Native speaker accent
  • Common vocabulary
  • Single speaker
Challenging factors:
  • Heavy accents
  • Technical jargon
  • Multiple speakers
  • Background noise

Typical Accuracy Rates

  • Professional recording: 95-98%
  • Clean amateur recording: 90-95%
  • Phone/video calls: 85-90%
  • Noisy environments: 75-85%
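
This page does not say how these percentages are measured. One common way to score your own recordings is word accuracy, defined as 1 minus the word error rate (WER), where WER counts word substitutions, insertions, and deletions against a hand-corrected reference. The sketch below assumes that definition and uses simple lowercase, whitespace-based word splitting; it may not match how the figures above were produced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Levenshtein distance over words, keeping only the previous row.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


reference = "the quick brown fox jumps"
transcribed = "the quick brown box jumps"
accuracy = 1 - word_error_rate(reference, transcribed)
print(f"{accuracy:.0%}")  # 80%
```

Punctuation is treated as part of each word here, so strip it first if you only care about the words themselves.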

Post-Processing Tips

Editing Transcriptions

After transcription:
  1. Review for accuracy - Check names and technical terms
  2. Add formatting - Break into paragraphs
  3. Correct errors - Fix any misheard words
  4. Add context - Include speaker names

Export Options

Save your transcriptions as:
  • Plain text (.txt)
  • Word document (.docx)
  • Subtitle file (.srt)
  • PDF document

Common Applications

Business

  • Meeting minutes
  • Interview transcripts
  • Call recordings
  • Training materials

Education

  • Lecture notes
  • Research interviews
  • Student recordings
  • Language learning

Content Creation

  • Video scripts
  • Podcast show notes
  • Social media captions
  • Blog post drafts

Accessibility

  • Support for deaf and hard-of-hearing users
  • Multi-language content
  • Searchable archives
  • Documentation

Troubleshooting

No Text Output

Check:
  • Audio file isn’t corrupted
  • File format is supported
  • Audio contains speech
  • Volume is audible
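
To check whether a WAV file is effectively silent, you can measure its overall level. This sketch computes the RMS level of a 16-bit PCM WAV file with the standard library. The function names are illustrative, and the 0.01 threshold is an arbitrary assumption, not a documented Fish Audio limit.

```python
import array
import math
import sys
import wave


def rms_level(path: str) -> float:
    """Return the RMS level of a 16-bit PCM WAV file on a 0.0 to 1.0 scale."""
    with wave.open(path, "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM audio")
        samples = array.array("h", wav_file.readframes(wav_file.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV samples are stored little-endian
    if not samples:
        return 0.0
    mean_square = sum(sample * sample for sample in samples) / len(samples)
    return math.sqrt(mean_square) / 32768


def looks_silent(path: str, threshold: float = 0.01) -> bool:
    """Flag files whose overall level falls below the assumed threshold."""
    return rms_level(path) < threshold
```

A file that `looks_silent` flags is worth re-recording or amplifying before you try again.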

Incorrect Language

Solutions:
  • Manually select the correct language
  • Ensure the majority of the audio is in one language
  • Separate multi-language content

Missing Words

Common causes:
  • Speaking too fast
  • Mumbling or unclear speech
  • Technical terms not recognized
  • Very quiet sections

Privacy & Security

Your Data

  • Audio files are processed securely
  • Transcriptions are private to your account
  • Files are not used for training
  • You can delete your data from your account at any time

Sensitive Content

For confidential audio:
  • Use on-premise solutions if available
  • Review privacy policy
  • Consider redacting sensitive information
  • Download and delete after processing

Best Practices Summary

  1. Start with quality audio - Good input = good output
  2. Choose the right environment - Quiet spaces work best
  3. Speak clearly - Articulate and keep a consistent pace
  4. Review and edit - All transcriptions benefit from review
  5. Use appropriate tools - Different content needs different approaches

Get Support

Need help with transcription?