Why MP3 still dominates 30 years after launch
MP3 was finalised in 1993, became practical for consumer distribution in the mid-1990s, and has been "almost replaced by newer formats" for the entire time since. AAC sounds slightly better at the same bitrate. Opus sounds noticeably better for voice. FLAC is lossless. Yet MP3 remains the default for podcast distribution, downloadable audio, and voice memo apps. The reason is universality: every device, every platform, every player plays MP3 without negotiation.
What makes MP3 well-suited to transcription
For speech content at reasonable bitrates (96-128 kbps and up), MP3 preserves enough acoustic detail that Whisper transcribes it essentially as well as uncompressed WAV. The perceptual compression model in MP3 discards information in frequencies and patterns that humans (and ASR models trained on human speech) do not rely on heavily.
MP3 starts to hurt transcription at very low bitrates (32 kbps and below) on noisy audio, where the encoder is already discarding signal-relevant information to fit the bitrate budget. Clean single-speaker voice usually transcribes well even at 32 kbps mono; noisy or multi-speaker audio at the same bitrates loses accuracy.
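To see which side of these thresholds a file falls on, you can read the bitrate from an audio frame header. The sketch below is illustrative only: `parse_frame_header` and `FrameInfo` are hypothetical names, it handles only MPEG-1 Layer III headers, and it expects the four header bytes to be passed in directly (any ID3v2 tag at the start of the file has to be skipped first). For VBR files the value describes that one frame, not the file's average.

```python
from typing import NamedTuple, Optional

# Bitrate table for MPEG-1 Layer III, indexed by the 4-bit bitrate field.
# Index 0 ("free format") and index 15 (invalid) are treated as unsupported.
MPEG1_L3_BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96,
                          112, 128, 160, 192, 224, 256, 320, None]
# Sample-rate table for MPEG-1, indexed by the 2-bit sample-rate field.
MPEG1_SAMPLE_RATES_HZ = [44100, 48000, 32000, None]


class FrameInfo(NamedTuple):
    bitrate_kbps: int
    sample_rate_hz: int
    mono: bool


def parse_frame_header(header: bytes) -> Optional[FrameInfo]:
    """Decode a 4-byte MPEG-1 Layer III frame header, or return None."""
    if len(header) < 4:
        return None
    b0, b1, b2, b3 = header[:4]
    # The first 11 bits of every frame are a sync word of all ones.
    if b0 != 0xFF or (b1 & 0xE0) != 0xE0:
        return None
    version = (b1 >> 3) & 0b11   # 0b11 means MPEG-1
    layer = (b1 >> 1) & 0b11     # 0b01 means Layer III
    if version != 0b11 or layer != 0b01:
        return None
    bitrate = MPEG1_L3_BITRATES_KBPS[b2 >> 4]
    sample_rate = MPEG1_SAMPLE_RATES_HZ[(b2 >> 2) & 0b11]
    if bitrate is None or sample_rate is None:
        return None
    mono = (b3 >> 6) == 0b11     # channel mode 0b11 is single channel
    return FrameInfo(bitrate, sample_rate, mono)


info = parse_frame_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
print(info)
```

A result of 32 kbps or lower on a noisy recording is the case described above where accuracy is most likely to drop.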
Constant vs variable bitrate
MP3 supports both Constant Bitrate (CBR, where every second uses the same number of bits) and Variable Bitrate (VBR, where the encoder spends more bits on complex passages and fewer on simple or quiet ones). For transcription, the two behave the same: transcript quality depends on the average bitrate, not on whether it varies. CBR gives a more predictable file size (useful for podcast hosts with size targets); VBR achieves the same quality in fewer bits.
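The file-size predictability of CBR comes down to simple arithmetic, and the same formula run in reverse gives the average bitrate of a VBR file. A minimal sketch, assuming 1 kbps means 1000 bits per second and ignoring tag and header overhead; the function names are illustrative:

```python
def cbr_size_bytes(bitrate_kbps: float, duration_s: float) -> int:
    """Approximate audio payload size of a CBR stream, in bytes."""
    # kilobits per second -> bits per second -> bytes per second -> bytes
    return round(bitrate_kbps * 1000 / 8 * duration_s)


def average_bitrate_kbps(size_bytes: int, duration_s: float) -> float:
    """Average bitrate implied by a file's size and duration."""
    return size_bytes * 8 / duration_s / 1000


one_hour = 60 * 60
print(cbr_size_bytes(128, one_hour))               # 57600000 (about 57.6 MB)
print(average_bitrate_kbps(43_200_000, one_hour))  # 96.0
```

So a one-hour episode at 128 kbps CBR is roughly 57.6 MB, and a one-hour VBR file of 43.2 MB averages 96 kbps, which is the figure that matters for transcript quality.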
Where MP3 files come from
Podcast distribution: every major podcast hosting platform (Libsyn, Buzzsprout, Anchor, RSS.com, Megaphone) accepts MP3 and most publish as MP3. The episode MP3 that your podcast app downloads can be uploaded here as-is.
Voice recording apps: many third-party phone voice recorders can save MP3 (Easy Voice Recorder on Android, various voice recorder apps on iPhone), making MP3 a natural transcription input. Note that the iPhone's built-in Voice Memos app records M4A rather than MP3.
Web downloads: YouTube audio extractors often produce MP3 for compatibility. Downloadable lectures from learning platforms, audiobook samples, music downloads from Bandcamp (if you chose MP3), and broadcast radio archives are commonly MP3 as well.
ID3 tags and what we do with them
MP3 files often carry ID3 metadata: title, artist, album, year, cover art. We read the audio and ignore the tags during transcription. The transcript itself is plain text; it does not carry MP3 metadata forward. If you need the transcript matched with the original metadata (for example, "episode 42 of Show X"), keep that mapping on your side when you archive the transcripts.
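For readers curious how a tool can separate tags from audio, here is a minimal sketch of locating the end of a leading ID3v2 tag. It is not a description of our pipeline: `id3v2_tag_size` is a hypothetical helper, and it relies only on the ID3v2 header layout (the bytes "ID3", two version bytes, one flags byte, and a four-byte "synchsafe" size in which each byte carries 7 bits).

```python
def id3v2_tag_size(data: bytes) -> int:
    """Return the number of bytes taken by a leading ID3v2 tag, or 0 if none."""
    if len(data) < 10 or data[:3] != b"ID3":
        return 0
    flags = data[5]
    size_bytes = data[6:10]
    # In a synchsafe integer the top bit of every byte must be zero.
    if any(b & 0x80 for b in size_bytes):
        return 0
    size = 0
    for b in size_bytes:
        size = (size << 7) | b
    # ID3v2.4 may append a 10-byte footer, signalled by flag bit 0x10.
    footer = 10 if flags & 0x10 else 0
    return 10 + size + footer  # 10-byte header + tag body + optional footer


# A fake file: an ID3v2.4 header declaring a 257-byte body, then audio bytes.
fake = b"ID3\x04\x00\x00\x00\x00\x02\x01" + b"\x00" * 257 + b"\xff\xfb\x90\x00"
offset = id3v2_tag_size(fake)
print(offset)             # 267
print(fake[offset:offset + 2].hex())  # fffb
```

An older ID3v1 tag, if present, sits in the last 128 bytes of the file and is not handled by this sketch.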