Audio Standards
Selected Audio Standards optimized for Voice over
Standards | Bandwidth requirement in kbps | Audio bandwidth (measured) in kHz | Basic algorithm |
G.711 | 64 | 3.1 | PCM (Pulse Code Modulation) |
G.721 | 32 | 3.1 | ADPCM (Adaptive Differential PCM) |
G.722 | 48, 56 or 64 | up to 7 | SB-ADPCM (Sub Band ADPCM) |
G.722.1 | 24 or 32 | up to 7 | - |
G.722.1C (Siren 14) | 24, 32 or 48 | up to 14 | - |
G.723 | 5.3 or 6.3 | 5.3 | MP-MLQ (Multipulse Maximum Likelihood Quantization) |
G.726 | 16 or 24 | 3 | ADPCM (Adaptive Differential PCM) |
G.727 | 32 or 40 | 3 | ADPCM (Adaptive Differential PCM) |
G.728 | 16 | 2.4 - 3 | LD-CELP (Low Delay Code Excited Linear Prediction) |
G.729 | 8 | 3 | CS-ACELP (Conjugate Structure Algebraic Code Excited Linear Prediction) |
AAC-LD | 24, 48, 96 or 128 | 11 | MPEG4-AAC (MPEG4 Advanced Audio Coding) |
G.711
G.711 is based on the PCM (Pulse Code Modulation) method, which samples analogue audio signals at 8 kHz and converts them into digital signals with a resolution of 8 bits, resulting in a bit rate of 64 kbps.
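The 64 kbps figure follows directly from the sampling rate and the resolution. A minimal sketch of that arithmetic (the function name `pcm_bit_rate_kbps` is made up for this illustration):

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bits_per_sample: int) -> float:
    """Bit rate of an uncompressed single-channel PCM stream in kbps."""
    return sample_rate_hz * bits_per_sample / 1000


# G.711 as described above: 8 kHz sampling, 8 bits per sample.
g711_kbps = pcm_bit_rate_kbps(8_000, 8)
print(f"G.711 bit rate: {g711_kbps} kbps")
```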
G.711 is easy to implement and provides signals of sufficient quality. Because of its high bit rate of 64 kbps, however, it is not suitable for connections with low bandwidth. It is also not suitable for interference-prone networks, because it lacks error detection and correction mechanisms.
G.721
The G.721 ITU-T recommendation defines how a signal sampled at 8 kHz is converted into a 32 kbps data stream by means of Adaptive Differential Pulse Code Modulation (ADPCM). In doing so, the next sample is predicted and only the difference between the prediction and the actual sample is encoded.
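The idea of predicting a sample and coding only the difference can be sketched in a few lines. The following is a toy illustration, not the G.721 algorithm: the predictor is simply the previously reconstructed sample, and the step-size rule (thresholds 6 and 1, limits 1 and 2048, start value 16) is invented for this example. It uses 4-bit codes, which at 8 kHz gives the 32 kbps mentioned above.

```python
def _adapt(step: int, code: int) -> int:
    """Toy step-size rule: grow after large codes, shrink after small ones."""
    if abs(code) >= 6:
        return min(step * 2, 2048)
    if abs(code) <= 1:
        return max(step // 2, 1)
    return step


def adpcm_encode(samples):
    """Encode integer samples as 4-bit signed codes in the range -8..7."""
    predicted, step, codes = 0, 16, []
    for sample in samples:
        # Quantize the difference between the sample and the prediction.
        code = max(-8, min(7, round((sample - predicted) / step)))
        codes.append(code)
        # Track what the decoder will reconstruct, then adapt the step.
        predicted += code * step
        step = _adapt(step, code)
    return codes


def adpcm_decode(codes):
    """Rebuild samples by mirroring the encoder's prediction and step size."""
    predicted, step, out = 0, 16, []
    for code in codes:
        predicted += code * step
        out.append(predicted)
        step = _adapt(step, code)
    return out


codes = adpcm_encode([100] * 10)
print(codes)
print(adpcm_decode(codes))
```

Because the decoder repeats the encoder's state updates, both sides stay in sync without the step size ever being transmitted.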
G.722
G.722 is an audio codec adopted by the ITU in 1988.
Signals are sampled at 16 kHz and quantized at a resolution of 14 bits. The codec covers an audio bandwidth of 7 kHz and does not have an error correction mechanism.
The data rate can be as follows:
- 48 kbps
- 56 kbps
- 64 kbps
This compression requires an encoder based on the ADPCM method.
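To put the three data rates in perspective, they can be compared with the uncompressed stream implied by the figures above (16 kHz sampling, 14 bits per sample). A small sketch of that calculation, with helper names chosen for this example only:

```python
SAMPLE_RATE_HZ = 16_000   # from the text above
BITS_PER_SAMPLE = 14      # from the text above

# Uncompressed bit rate of the sampled signal in kbps.
raw_kbps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE / 1000


def compression_ratio(coded_kbps: float) -> float:
    """Ratio between the raw PCM bit rate and the coded bit rate."""
    return raw_kbps / coded_kbps


for rate in (48, 56, 64):
    print(f"{rate} kbps -> compression ratio {compression_ratio(rate):.2f}")
```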
G.722.1
G.722.1 is an audio codec based on Siren 7, a codec developed by the company Polycom. With a sampling rate of 16 kHz and a frequency band of 7 kHz, the transmission operates at 24 or 32 kbps.
Unlike other modern codecs, it does not implement transmission at 16 kbps.
The algorithms of G.722.1 are identical to those of Siren 7. The only difference is the data format.
G.722.1C describes a variation with a 14 kHz frequency band and represents the monophonic version of Siren 14.
G.722.2
G.722.2 is a compression technique for audio transmission that is technologically completely different from G.722. Voice compression is implemented by means of AMR-WB (Adaptive Multirate Wideband).
This technique has a bandwidth of 7 kHz, nine selectable bit rates and speech pause detection. Furthermore, it is supported by all mobile telecommunications systems and is often used in VoIP as well as in web conferences.
G.723
G.723 is an ITU-T recommendation entitled 'Dual Rate Speech Coder for Multimedia Communication Transmitting at 5.3 and 6.3 kbps'. However, its delay of 67-97 ms is very long.
G.726/G.727
These are two ITU-T recommendations for different bit rates, both based on the ADPCM method. Although the voice quality is better than that of G.711, it is worse than that of all the other G.72x standards.
G.728
The G.728 ITU-T standard is used for voice compression. Its main field of application is VoIP.
It is characterized by low latency. This is achieved by an estimator formed from the five most recent samples.
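A rough calculation shows why working on only five samples keeps the delay small. This sketch assumes a sampling rate of 8 kHz, which the text above does not state for G.728, and uses the 16 kbps bit rate from the table:

```python
SAMPLE_RATE_HZ = 8_000    # assumption: narrowband sampling rate
BIT_RATE_BPS = 16_000     # 16 kbps, from the table
BLOCK_SAMPLES = 5         # five samples, from the text above

# Time needed to collect one block of five samples, in milliseconds.
block_duration_ms = BLOCK_SAMPLES * 1000 / SAMPLE_RATE_HZ

# Bits available to describe one block at the given bit rate.
bits_per_block = BIT_RATE_BPS * BLOCK_SAMPLES // SAMPLE_RATE_HZ

print(f"block duration: {block_duration_ms} ms, bits per block: {bits_per_block}")
```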
It is very efficient in terms of bandwidth, but it has high complexity. The quality is equal to that of G.726 and G.727.
G.728 has strategies for concealing frame and packet losses and is very robust against bit errors.
G.729
The quality of the G.729 ITU-T standard is comparable to that of G.723. It is a so-called rainbow standard, available in 12 different variations (as of 26 March 2014). To ensure a correct transfer, both sides must support the same variation (annex or appendix).
The codec is optimized for the human voice; other sounds are not reproduced adequately. Since speech pauses are suppressed, so-called comfort noise needs to be generated in practical use so that listeners do not mistakenly assume the connection has been terminated.
Its main field of application is VoIP.
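The interplay of pause suppression and comfort noise can be sketched as follows. This is a simplified illustration and not the mechanism defined by the standard: the energy threshold, the noise level, the frame length of 80 samples (10 ms at an assumed 8 kHz sampling rate) and all function names are assumptions made for this example.

```python
import random

FRAME_LEN = 80  # assumed: 10 ms frames at 8 kHz sampling


def is_speech(frame, threshold=1000.0):
    """Crude activity check: mean squared amplitude above a threshold."""
    energy = sum(s * s for s in frame) / len(frame)
    return energy > threshold


def transmit(samples, frame_len=FRAME_LEN):
    """Send active frames as they are; send only a marker for pauses."""
    packets = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        if is_speech(frame):
            packets.append(("speech", frame))
        else:
            packets.append(("silence", None))
    return packets


def receive(packets, frame_len=FRAME_LEN, noise_level=20, seed=0):
    """Play speech frames; fill pauses with low-level comfort noise."""
    rng = random.Random(seed)
    out = []
    for kind, frame in packets:
        if kind == "speech":
            out.extend(frame)
        else:
            out.extend(rng.randint(-noise_level, noise_level)
                       for _ in range(frame_len))
    return out


signal = [500] * FRAME_LEN + [0] * FRAME_LEN   # one loud frame, one pause
packets = transmit(signal)
print([kind for kind, _ in packets])
```

The receiver never plays pure digital silence, so the line still sounds "alive" during pauses even though no audio data was sent for them.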