
Audio profiles and levels

To maximize interoperability, only a small number of profiles have been defined for MPEG-4 Audio. Given the rather large number of coding tools and object types, even the simpler profiles include a relatively large number of audio object types. Some of the audio profiles (e.g. the main profile) contain both natural and structured audio object types. The following table lists only the audio profiles containing natural audio object types:
 

A hierarchical organisation of the profiles supports the design for interoperability: the speech coding profile is contained in all the other profiles containing natural audio coding tools, and the scalable audio profile is contained in the main profile. Table V shows all the tools of MPEG-4 natural audio and their use in the different audio object types.
 
Table V: Usage of Audio Object Types

Audio Object Type  Tools                                                                      GA Bitstream Syntax Type  Hierarchy
                   13818-7 main  13818-7 LC  13818-7 SSR  PNS  LTP  TLSS  TwinVQ  CELP  HVXC
AAC main           X             -           -            X    -    -     -       -     -     ISO/IEC 13818-7 style     contains AAC LC
AAC LC             -             X           -            X    -    -     -       -     -     ISO/IEC 13818-7 style     -
AAC SSR            -             -           X            X    -    -     -       -     -     ISO/IEC 13818-7 style     -
AAC LTP            -             X           -            X    X    -     -       -     -     ISO/IEC 13818-7 style     contains AAC LC
AAC Scalable       -             X           -            X    X    X     -       -     -     scalable                  -
TwinVQ             -             -           -            -    X    -     X       -     -     scalable                  -
CELP               -             -           -            -    -    -     -       X     -     -                         -
HVXC               -             -           -            -    -    -     -       -     X     -                         -

X = tool used by the object type; - = empty cell
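As an illustration only, the tool columns of Table V can be written down as a small lookup structure. The sketch below is in Python; the names `TOOLS_BY_OBJECT_TYPE`, `object_types_using` and `required_tools` are hypothetical and not part of any MPEG-4 API. It models only the tool columns, not the bitstream syntax type or the hierarchy column.

```python
# Tool usage per audio object type, transcribed from the tool columns of Table V.
# All identifiers here are illustrative assumptions, not names from the standard.
TOOLS_BY_OBJECT_TYPE = {
    "AAC main": {"13818-7 main", "PNS"},
    "AAC LC": {"13818-7 LC", "PNS"},
    "AAC SSR": {"13818-7 SSR", "PNS"},
    "AAC LTP": {"13818-7 LC", "PNS", "LTP"},
    "AAC Scalable": {"13818-7 LC", "PNS", "LTP", "TLSS"},
    "TwinVQ": {"LTP", "TwinVQ"},
    "CELP": {"CELP"},
    "HVXC": {"HVXC"},
}


def object_types_using(tool):
    """Return the object types (sorted by name) whose row marks the given tool."""
    return sorted(
        name for name, tools in TOOLS_BY_OBJECT_TYPE.items() if tool in tools
    )


def required_tools(object_types):
    """Return the union of tools needed to decode all the given object types."""
    needed = set()
    for name in object_types:
        needed |= TOOLS_BY_OBJECT_TYPE[name]
    return needed


if __name__ == "__main__":
    print(object_types_using("LTP"))
    print(sorted(required_tools(["AAC LC", "CELP"])))
```

A lookup of this kind makes it easy to see, for example, which tools a decoder would have to implement to handle a chosen combination of object types.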
 
 
 

 

Levels for the MPEG-4 audio scalable profile

The large number of possible combinations of audio object types makes the traditional way of defining levels, by channel count, sampling frequency and so on, very difficult. To let decoder implementers conform to a given level definition while retaining the freedom to combine different audio object types, complexity units have been defined and are used to calculate the required decoder capabilities. For each audio object type, the decoder complexity (for a given sampling rate and channel count) was estimated in PCUs (Processor Complexity Units: computational complexity in millions of operations per second, MOPS) and RCUs (RAM Complexity Units: memory complexity as buffer requirements in kWords). These numbers depend heavily on the decoder architecture, for example whether the decoder is realized on a dedicated DSP or on a general-purpose computing architecture. Table VI lists the decoder complexity estimates for the different object types as submitted to the MPEG audio group.
 
The level of a scalable profile decoder can now be determined by PCU and RCU numbers in addition to the number of channels and sampling frequencies. Four levels have been defined. They are:


Table VI: Decoder Complexity

Object Type         Parameters    PCU (MOPS)  RCU (kWords)
AAC Main 1)         fs=48 kHz     5           5
AAC LC 1)           fs=48 kHz     3           3
AAC SSR 1)          fs=48 kHz     4           3
AAC LTP 1)          fs=48 kHz     4           4
AAC Scalable 1) 2)  fs=48 kHz     5           4
TwinVQ 1)           fs=24 kHz     2           3
CELP                fs=8 kHz      1           1
CELP                fs=16 kHz     2           1
CELP                fs=8/16 kHz   3           1
HVXC                fs=8 kHz      2           1

Definitions:
fs = sampling frequency

Notes:
1) PCU is proportional to sampling frequency
2) Includes core decoder
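To make the use of these numbers concrete, here is a minimal Python sketch of how the Table VI estimates might be combined for a set of audio objects. It rests on several assumptions that the text above does not spell out: the totals are taken as plain sums over the objects, the PCU of entries marked with note 1 is scaled linearly from the tabulated sampling frequency, and the RCU is left unscaled. The names `Estimate`, `TABLE_VI`, `pcu_at` and `total_complexity` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Estimate:
    ref_fs_khz: Optional[float]  # sampling frequency of the table row (None for 8/16 kHz)
    pcu: float                   # processing complexity in MOPS
    rcu: float                   # memory complexity in kWords
    pcu_scales_with_fs: bool     # True for rows marked with note 1


# Values transcribed from Table VI; the CELP keys are illustrative labels.
TABLE_VI = {
    "AAC Main": Estimate(48, 5, 5, True),
    "AAC LC": Estimate(48, 3, 3, True),
    "AAC SSR": Estimate(48, 4, 3, True),
    "AAC LTP": Estimate(48, 4, 4, True),
    "AAC Scalable": Estimate(48, 5, 4, True),  # note 2: includes core decoder
    "TwinVQ": Estimate(24, 2, 3, True),
    "CELP (8 kHz)": Estimate(8, 1, 1, False),
    "CELP (16 kHz)": Estimate(16, 2, 1, False),
    "CELP (8/16 kHz)": Estimate(None, 3, 1, False),
    "HVXC": Estimate(8, 2, 1, False),
}


def pcu_at(object_type, fs_khz=None):
    """PCU for one object; scaled linearly with fs only where note 1 applies."""
    est = TABLE_VI[object_type]
    if est.pcu_scales_with_fs and fs_khz is not None:
        return est.pcu * fs_khz / est.ref_fs_khz
    return est.pcu


def total_complexity(objects):
    """Sum PCU and RCU over (object_type, fs_khz) pairs (assumed additive)."""
    total_pcu = sum(pcu_at(name, fs) for name, fs in objects)
    total_rcu = sum(TABLE_VI[name].rcu for name, _ in objects)
    return total_pcu, total_rcu


if __name__ == "__main__":
    # One AAC LC object at 24 kHz plus one narrowband CELP object.
    print(total_complexity([("AAC LC", 24), ("CELP (8 kHz)", None)]))
```

A decoder's available PCU and RCU budget could then be compared against such totals to judge whether a given combination of objects fits, subject to the assumptions stated above.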

 