What an M4A file actually is
M4A is, technically, a regular MP4 file that contains only an audio track. The .m4a extension is a convention: Apple introduced it to distinguish audio-only MP4s from video MP4s, so that iTunes and Music could filter them properly. Rename the same file to .mp4 and most players will treat it identically.
The audio inside an M4A is almost always AAC (Advanced Audio Coding), the successor to MP3 that delivers comparable quality at lower bitrates. Sometimes it is ALAC (Apple Lossless), which preserves audio bit-for-bit, as FLAC does. Voice Memos picks between the two based on your Audio Quality setting. GarageBand always writes AAC for shared exports. Apple Music streaming uses AAC.
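One way to see that an M4A really is an MP4 is to look at the first box in the file, which in MP4-family files is normally an `ftyp` box listing the container "brands". The sketch below is a minimal illustration, not how any particular player or Mictoo does it; the function name `read_ftyp` is made up for this example, and it assumes the `ftyp` box sits at the very start of the file with a plain 32-bit size.

```python
import struct


def read_ftyp(data: bytes):
    """Return (major_brand, compatible_brands) from a leading ftyp box, or None.

    `data` is the first few dozen bytes of the file. Hypothetical helper:
    it assumes the ftyp box comes first and uses a 32-bit size field.
    """
    if len(data) < 16:
        return None
    size, box_type = struct.unpack(">I4s", data[:8])
    if box_type != b"ftyp" or size < 16:
        return None
    size = min(size, len(data))  # tolerate a truncated read
    major = data[8:12].decode("ascii", "replace")
    # Bytes 12-15 hold the minor version; compatible brands follow, 4 bytes each.
    brands = [
        data[i:i + 4].decode("ascii", "replace")
        for i in range(16, size - 3, 4)
    ]
    return major, brands


def describe_file(path: str):
    """Read the start of a file on disk and report its brands."""
    with open(path, "rb") as f:
        return read_ftyp(f.read(64))
```

Files written by Apple software commonly report `M4A ` as the major brand, while other tools may write `isom` or `mp42` for the same kind of audio-only file, so the brand is best treated as a hint, not a guarantee.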
The .m4a, .m4b, .m4r, .mp4 family
Same container, different file extensions, different intent. .m4a is plain audio. .m4b marks an audiobook and typically carries chapter markers. .m4r is the same as .m4a, but Apple uses the extension to mark a file as a ringtone so that iTunes and Music put it in the right place. .mp4 with only an audio track is what some non-Apple tools write instead of .m4a. Mictoo treats them all as M4A and decodes the audio normally.
Why your iPhone Voice Memo is so small
iPhone Voice Memos default to AAC at 32 kbps mono. That works out to roughly 240 KB per minute, so a one-hour interview is about 14 MB. The same hour as CD-quality stereo WAV (44.1 kHz, 16-bit) would be about 635 MB. AAC achieves this by removing audio information humans cannot perceive: very high frequencies, masked sounds, and redundant information across channels.
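The size figures above follow directly from bitrate times duration. The sketch below reproduces the arithmetic; the function names are illustrative, the WAV comparison assumes CD-quality stereo PCM (44.1 kHz, 16-bit, 2 channels), and container overhead is ignored, so real files will be slightly larger.

```python
def audio_size_bytes(bitrate_bps: int, seconds: int) -> int:
    """Approximate payload size: bits per second times seconds, divided by 8."""
    return bitrate_bps * seconds // 8


def pcm_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bitrate of uncompressed PCM audio, as stored in a WAV file."""
    return sample_rate_hz * bit_depth * channels


aac_per_minute = audio_size_bytes(32_000, 60)      # 240_000 bytes, about 240 KB
aac_per_hour = audio_size_bytes(32_000, 3600)      # 14_400_000 bytes, about 14 MB
wav_bitrate = pcm_bitrate_bps(44_100, 16, 2)       # 1_411_200 bits per second
wav_per_hour = audio_size_bytes(wav_bitrate, 3600) # 635_040_000 bytes, about 635 MB

print(aac_per_minute, aac_per_hour, wav_per_hour)
```

Under these assumptions the WAV is about 44 times the size of the 32 kbps AAC recording.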
For transcription this almost never matters. Whisper transcribes 32 kbps mono AAC about as well as it transcribes uncompressed WAV of the same speech. Where AAC compression starts to lose words is in heavy background noise or very quiet speech, where the encoder may have already removed the signal Whisper needed.
AAC vs ALAC inside the M4A container
If you have Voice Memos set to Lossless, the audio inside the M4A is ALAC instead of AAC. We handle both, and transcription quality is the same. The only practical difference is file size: an ALAC Voice Memo is roughly 10 to 15 times larger than the AAC equivalent. For everyday voice work, stick with the Compressed (AAC) setting. Lossless is worth the extra space only when the audio will be processed further later (DAW import, restoration, archival); it is unnecessary for transcription.
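To check which codec a given M4A holds, you can walk the MP4 box tree down to the sample description (`stsd`) box and read the format code of its first entry: `mp4a` indicates MPEG-4 audio (in practice usually AAC) and `alac` indicates Apple Lossless. The sketch below is a simplified illustration under stated assumptions, not Mictoo's implementation: the names `iter_boxes` and `find_codec` are hypothetical, it inspects only the first track, and it reads the whole file into memory.

```python
import struct
from pathlib import Path

# Container boxes on the way from the file root to the sample description.
BOX_PATH = [b"moov", b"trak", b"mdia", b"minf", b"stbl", b"stsd"]


def iter_boxes(data: bytes, start: int, end: int):
    """Yield (box_type, payload_start, payload_end) for boxes in data[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack(">I4s", data[pos:pos + 8])
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if pos + 16 > end:
                return
            size = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
            header = 16
        elif size == 0:  # box runs to the end of the enclosing range
            size = end - pos
        if size < header or pos + size > end:
            return  # malformed or truncated box; stop walking
        yield box_type, pos + header, pos + size
        pos += size


def find_codec(data: bytes):
    """Return the first track's sample format code (e.g. 'mp4a', 'alac'), or None."""
    start, end = 0, len(data)
    for wanted in BOX_PATH:
        for box_type, payload_start, payload_end in iter_boxes(data, start, end):
            if box_type == wanted:
                start, end = payload_start, payload_end
                break
        else:
            return None  # expected box not found at this level
    # stsd payload: 4 bytes version/flags, 4 bytes entry count,
    # then the first entry: 4 bytes size followed by the 4-byte format code.
    if end - start < 16:
        return None
    return data[start + 12:start + 16].decode("ascii", "replace")


def codec_of_file(path: str):
    """Convenience wrapper that loads a file from disk."""
    return find_codec(Path(path).read_bytes())
```

Reading the whole file is acceptable for typical voice recordings; for very large files, a version that seeks past the `mdat` box instead of loading it would be the safer design.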