When we talk about Exchange Server 2007 Unified Messaging (UM), we are often asked: "Just how big will those messages be?"
The size of UM voice messages depends on the size of the attachment that holds the voice data. In turn, the size of the attachment depends on three factors:
(1) the duration of the recording
(2) the audio codec used
(3) the audio storage format
UM uses one of three codecs for creating voice messages: WMA (Windows Media Audio), GSM 6.10, and PCM Linear. Audio encoded with WMA is always stored in Windows Media format (the attachment is a file with a .wma extension). Audio encoded with GSM or PCM is always stored in RIFF/WAVE format (the attachment is a file with a .wav extension).
The graph below shows how the size of the audio depends on the duration, for the three codecs used:
PCM is uncompressed, and therefore occupies the most space at a given duration (just over 160,000 bytes for each 10 seconds of audio). It has the highest audio quality of the three. However, WMA and GSM are both acceptable to the vast majority of listeners.
GSM is compressed and takes roughly a tenth of the space of PCM (just over 16,000 bytes for each 10 seconds).
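The per-10-second figures for PCM and GSM can be reproduced with simple arithmetic. The sketch below assumes 8 kHz, 16-bit mono PCM and the usual WAV packing of GSM 6.10 (65-byte blocks that each hold 320 samples). None of these parameters is stated in this post, so treat them as assumptions; the function names are hypothetical, and header bytes are ignored.

```python
import math

# Assumed audio parameters (not stated in the text above).
SAMPLE_RATE_HZ = 8_000        # telephony-quality sampling rate
PCM_BYTES_PER_SAMPLE = 2      # 16-bit linear PCM, mono
GSM_BLOCK_BYTES = 65          # one GSM 6.10 block as packed in a WAV file
GSM_SAMPLES_PER_BLOCK = 320   # samples carried by each 65-byte block


def pcm_audio_bytes(seconds: int) -> int:
    """Bytes of raw PCM audio data for the given duration (header excluded)."""
    return SAMPLE_RATE_HZ * PCM_BYTES_PER_SAMPLE * seconds


def gsm_audio_bytes(seconds: int) -> int:
    """Bytes of GSM 6.10 audio data for the given duration (header excluded)."""
    blocks = math.ceil(SAMPLE_RATE_HZ * seconds / GSM_SAMPLES_PER_BLOCK)
    return blocks * GSM_BLOCK_BYTES


print(pcm_audio_bytes(10))  # 160000
print(gsm_audio_bytes(10))  # 16250
```

Under these assumptions, adding the small WAV header gives "just over 160,000" bytes for PCM and "just over 16,000" bytes for GSM, which matches the figures quoted above.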
WMA is the most highly compressed codec (about 11,000 bytes for each 10 seconds). However, the WMA format has a much larger header section than the WAV format (about 7 KB, compared to less than 100 bytes), so very short WMA recordings are actually larger than the equivalent GSM recordings. WMA recordings become smaller than GSM recordings at durations of about 15 seconds and above. Since the average call-answered voice message is about 30 seconds long, WMA usually produces the smallest attachment.
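To make the trade-off between header size and compression concrete, here is a minimal sketch that estimates attachment size as a fixed header plus a per-second data rate. The numbers are rounded from the approximate figures in this post (the WAV header is taken as 100 bytes, the upper bound mentioned above), and the names `estimate_size` and `break_even_seconds` are hypothetical helpers for illustration only. Real attachments will differ somewhat.

```python
# Rough figures rounded from the text; actual files will vary.
BYTES_PER_SECOND = {"PCM": 16_000, "GSM": 1_600, "WMA": 1_100}
HEADER_BYTES = {"PCM": 100, "GSM": 100, "WMA": 7_000}


def estimate_size(codec: str, seconds: float) -> float:
    """Estimated attachment size in bytes: fixed header plus audio data."""
    return HEADER_BYTES[codec] + BYTES_PER_SECOND[codec] * seconds


def break_even_seconds(small_rate_codec: str, large_rate_codec: str) -> float:
    """Duration at which both codecs yield the same estimated size."""
    header_diff = HEADER_BYTES[small_rate_codec] - HEADER_BYTES[large_rate_codec]
    rate_diff = BYTES_PER_SECOND[large_rate_codec] - BYTES_PER_SECOND[small_rate_codec]
    return header_diff / rate_diff


for codec in ("PCM", "GSM", "WMA"):
    print(codec, estimate_size(codec, 30))    # PCM 480100, GSM 48100, WMA 40000

print(break_even_seconds("WMA", "GSM"))       # 13.8
```

With these rounded inputs the crossover lands near 14 seconds, which is roughly in line with the "about 15 seconds" figure, and a typical 30-second message comes out at around 40 KB in WMA versus around 48 KB in GSM.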
WMA is the default codec. GSM or PCM can be used where interoperability with other platforms is of great importance (the WAV format and GSM codec are widely supported).