Question 1

What languages does Mictoo support?

Accepted Answer

50+ languages including (alphabetical): Afrikaans, Arabic, Bulgarian, Catalan, Chinese (Mandarin and Cantonese), Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hindi, Hungarian, Indonesian, Italian, Japanese, Korean, Latvian, Lithuanian, Malay, Norwegian, Persian (Farsi), Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Welsh.

Question 2

How does auto-detect work?

Accepted Answer

Whisper samples the first few seconds of speech to identify the language, then transcribes the whole file in that language. This works for most audio. For very short clips, audio with long non-speech intros, or files that switch languages early, picking the language manually is more reliable.

Question 3

Does Mictoo handle code-switching (multiple languages in one recording)?

Accepted Answer

Yes. Whisper was trained on a lot of code-switching audio, especially Spanish-English, Mandarin-English, and French-Arabic. For audio that switches languages mid-recording, leave auto-detect on and Whisper will follow the switches.

Question 4

Will the transcript translate the audio to English?

Accepted Answer

Not by default. Whisper transcribes in the source language, so French audio gives you French text. To translate, paste the transcript into DeepL or ChatGPT.

Question 5

Does Whisper have a "translate to English" mode?

Accepted Answer

Yes, the underlying Whisper model supports translation, but Mictoo currently only exposes transcription in the source language. We are evaluating whether to add a translation toggle to the UI.

Question 6

How accurate is non-English transcription?

Accepted Answer

For the major world languages (Spanish, French, German, Mandarin, Japanese, Portuguese, Arabic, Russian), expect 90 to 96 percent accuracy on clean audio, similar to English. For less common languages (Welsh, Maltese, Basque, Swahili), accuracy drops to 80 to 90 percent.
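
The answer above does not say how accuracy is measured. One common convention, assumed here, is word accuracy: 1 minus the word error rate (WER), where WER is the word-level edit distance between a reference transcript and the machine transcript, divided by the reference length. The sketch below shows that arithmetic; the function names are illustrative and not part of Mictoo.

```python
from typing import List


def edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Minimum number of substitutions, insertions, and deletions
    needed to turn hyp into ref (Levenshtein distance over tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Word accuracy = 1 - WER, with tokens split on whitespace (an assumption)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return 1.0 - edit_distance(ref, hyp) / len(ref)


reference = "el gato come pescado fresco hoy en la casa grande"
hypothesis = "el gato come pescado fresco hoy en la casa blanca"
print(round(word_accuracy(reference, hypothesis), 2))  # one wrong word out of ten
```

Whitespace splitting does not suit languages written without spaces between words, such as Mandarin or Japanese. For those, the same edit distance is usually computed over characters instead, for example by passing list(text) to edit_distance.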

Question 7

Will diacritics, accents, and non-Latin scripts come back correctly?

Accepted Answer

Yes. French accents, German umlauts, Spanish ñ, Vietnamese tone marks, Mandarin characters, Japanese hiragana/katakana/kanji, Korean hangul, Arabic right-to-left script, Cyrillic, Devanagari, and Thai script all come back in their proper forms.
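
The answer does not state which Unicode normalization form the transcript uses. If you plan to search or compare the text in other tools, normalizing it to NFC (composed form) is a common precaution, because an accented letter can be stored either as one code point or as a base letter plus a combining mark. A minimal sketch using Python's standard library; the helper name to_nfc is illustrative:

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text with base letters and combining marks composed (NFC)."""
    return unicodedata.normalize("NFC", text)


decomposed = "cafe\u0301"      # 'e' followed by a combining acute accent
composed = to_nfc(decomposed)  # single code point U+00E9

print(len(decomposed), len(composed))
print(composed == "caf\u00e9")
```

Both strings look identical on screen, but only after normalization do they compare as equal.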

Question 8

My audio is in a language not on your list. Will it work?

Accepted Answer

Probably, with reduced accuracy. Whisper has basic support for many more languages than the 50+ that are fully covered. Try it. If the result is unusable, the language is probably outside the model's training data.

Question 9

Can I transcribe a podcast that switches between English and another language each segment?

Accepted Answer

Yes. Auto-detect handles segment-by-segment language changes well, especially between languages Whisper has seen often together.

Question 10

Will I get speaker labels for multilingual interviews?

Accepted Answer

Not automatically. Whisper does not do speaker diarization. Add speaker labels manually based on conversation flow.

Question 11

How do I download a multilingual transcript?

Accepted Answer

Same as for any transcript. TXT for plain text, SRT for subtitles. Both formats preserve the original script and direction (right-to-left for Arabic, Hebrew, Persian).
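
For readers who have not worked with SRT before, here is a rough sketch of what the format looks like for mixed-script text. It assumes segments arrive as (start seconds, end seconds, text) tuples, which is an illustrative shape and not Mictoo's actual export code. SRT has no field for text direction: the file stores characters in logical order, and the subtitle player renders Arabic, Hebrew, or Persian right to left.

```python
from typing import List, Tuple


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SRT timestamp HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Tuple[float, float, str]]) -> str:
    """Build SRT text: numbered cues separated by a blank line."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


segments = [
    (0.0, 2.5, "مرحبا"),     # Arabic, stored in logical order
    (2.5, 4.0, "Bonjour"),   # French
]
print(to_srt(segments))
```

When saving the result to disk, open the file with encoding="utf-8" so non-Latin scripts survive unchanged.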

Question 12

Will multilingual audio be stored on your servers?

Accepted Answer

No. The file streams to our transcription provider (Groq, with OpenAI as backup), gets processed, then is discarded.

Multilingual Transcription
Free AI Tool for 50+ Languages

How it works

Drop the audio in any supported language

AI detects and transcribes

Copy, download, or edit

Why Mictoo for multilingual audio

50+ languages, all the same engine

Auto-detect handles most cases

Code-switching is supported

Right-to-left scripts work

Diacritics, tones, and CJK characters all correct

No file is stored

Where multilingual transcription helps

International interviews and ethnographic research

Cross-border business calls

Bilingual podcasts

Conference recordings with international speakers

Documentation of immigrant communities and minority languages

Pro tips for multilingual transcription

For short audio (under 30 seconds), pick the language manually

For audio that opens with non-speech (music, silence), pick the language manually

For predominantly one language with foreign-word inserts, pick the dominant language

For audio that genuinely switches between two languages, auto-detect handles it

Translation is a separate step

For rare or low-resource languages, accuracy varies
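
The first four tips above amount to a small decision rule, sketched below. The Clip fields and the function name are hypothetical, and the order in which the checks run is an assumption, since the tips do not say which one wins when several apply.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    duration_s: float                    # total length in seconds
    opens_with_non_speech: bool = False  # music or silence at the start
    has_foreign_inserts: bool = False    # mostly one language, occasional foreign words
    switches_languages: bool = False     # genuinely alternates between two languages


def language_setting(clip: Clip) -> str:
    """Return "manual" (pick the language yourself) or "auto" (auto-detect)."""
    # Short clips and non-speech intros give auto-detect little to work with.
    if clip.duration_s < 30 or clip.opens_with_non_speech:
        return "manual"
    # Genuine switching between two languages: leave auto-detect on.
    if clip.switches_languages:
        return "auto"
    # One dominant language with foreign words: pick the dominant language.
    if clip.has_foreign_inserts:
        return "manual"
    return "auto"


print(language_setting(Clip(duration_s=12)))
print(language_setting(Clip(duration_s=600, switches_languages=True)))
```

Here the 12-second clip maps to "manual" and the 10-minute code-switching recording maps to "auto".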

