INTERNATIONAL ORGANIZATION FOR STANDARDIZATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC 1/SC 29/WG 11
CODING OF MOVING PICTURES AND AUDIO
ISO/IEC JTC1/SC29/WG11 N7294
Poznań, Poland, July 2005
1 Introduction
ISO/IEC 13818-2 (MPEG-2 Video) extends the specifications of MPEG-1 Video
(ISO/IEC 11172-2) to more generic classes of video sources and applications.
It supports interlaced video and more rigid display timing constraints. Encoded
data rates can be up to about 40 Mbit/s for storage and transmission, or even
higher for professional applications in video production. Frame sizes up to HD
resolution are supported.
2 Technical Solution
MPEG-2 is forward compatible with MPEG-1, which means that MPEG-1 streams
observing typical constraints (e.g. on frame sizes and data rates) can be
decoded by MPEG-2 decoders. In terms of video encoding tools, specific
provisions are made for interlaced video. Further, MPEG-2 defines tools for
scalable video coding, e.g. to embed streams that can be decoded at either of
two resolutions, such as CIF and SD, or SD and HD. The main technical
extensions compared to MPEG-1 can be summarized as follows:
- To support the properties of interlaced video, different methods for
field/frame adaptive motion compensation (frame based, field based and dual
prime prediction modes), as well as switching between field-wise and
frame-wise DCT, are specified. Further, it is possible to switch into a 16x8
prediction mode, where separate motion vectors can be defined for the top and
bottom halves of a macroblock.
- The variable-length coding (VLC) tables were extended for better compression
performance at higher data rates and resolutions.
- Methods of scalable coding are defined, which provide SNR scalability and
spatial scalability over a limited number of layers each. In this context, the
bitstream is sub-divided into two or three parts, which means that by
retaining or receiving only the core parts of the stream it is possible, for
example, to reconstruct frames at lower resolution. To encode DCT coefficients
related to the different resolution levels, differently optimized
variable-length codes are provided in the spatial scalable mode.
- Methods to encode sequences with 4:2:2 chrominance
sampling are defined by allowing additional 8x8 transform blocks to be subsumed
within a macroblock.
- A method of temporal scalability is defined, which allows prediction of
additional inserted frames either from a base-layer sequence or from another
frame of the enhancement-layer sequence. This method can also be used for
encoding stereoscopic sequences with an LRLRLR… interleaving of left and right
pictures, as required e.g. for shutter-glass displays.
- All scalability modes can also be used to achieve a forward and backward
compatible combination of MPEG-1 and MPEG-2, when MPEG-1 syntax is exclusively
used in the base layer.
- Methods of data partitioning for DCT coefficients are defined, which can
improve the error resilience of video streams.
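As an illustration of the switch between frame-wise and field-wise DCT mentioned
in the first item above, the following minimal sketch shows how the 16 lines of
a luminance macroblock could be grouped into four 8x8 blocks in each mode. It
assumes that even lines belong to the top field and odd lines to the bottom
field. The function name split_luma_macroblock and the toy data are
hypothetical and are not taken from the standard text.

```python
def split_luma_macroblock(rows, field_dct):
    """Split a 16x16 luma macroblock into four 8x8 blocks.

    rows      -- list of 16 rows of 16 samples each, in frame (display) order
    field_dct -- if True, regroup the lines by field before splitting
    """
    if len(rows) != 16 or any(len(r) != 16 for r in rows):
        raise ValueError("expected a 16x16 macroblock")

    if field_dct:
        # Assumption: even lines are top-field lines, odd lines are
        # bottom-field lines. Top-field lines fill the upper two blocks,
        # bottom-field lines fill the lower two blocks.
        ordered = rows[0::2] + rows[1::2]
    else:
        # Frame DCT: the lines are used in their stored order.
        ordered = list(rows)

    blocks = []
    for r0 in (0, 8):
        for c0 in (0, 8):
            blocks.append([row[c0:c0 + 8] for row in ordered[r0:r0 + 8]])
    return blocks


# Toy macroblock in which every sample holds its own line number.
macroblock = [[line] * 16 for line in range(16)]
frame_blocks = split_luma_macroblock(macroblock, field_dct=False)
field_blocks = split_luma_macroblock(macroblock, field_dct=True)

print([row[0] for row in frame_blocks[0]])  # [0, 1, 2, 3, 4, 5, 6, 7]
print([row[0] for row in field_blocks[0]])  # [0, 2, 4, 6, 8, 10, 12, 14]
print([row[0] for row in field_blocks[2]])  # [1, 3, 5, 7, 9, 11, 13, 15]
```

In field mode each 8x8 block contains lines of a single field only, which tends
to help when the two fields differ strongly because of motion.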
MPEG-2 serves many application domains, each requiring a different combination
of elementary tools. As it would not be useful for every MPEG-2 device to
support all elements of the standard, MPEG-2 defines different profiles.
Further, within each profile, levels are specified, which describe the maximum
sizes or image formats that must be decodable. Each bitstream carries
information indicating which profile is required at the decoder, as well as the
required level within that profile, to identify the decoder capability needed
for the bitstream. From this information, any conforming MPEG-2 decoder can
decide immediately whether it will be able to process the bitstream. Within
certain application domains, specific 'profile@level' configurations have been
established as mandatory, e.g. 'Main Profile@Main Level' is typically required
for digital TV broadcast or DVD storage applications. Four levels are defined:
'Low', 'Main' (SD), 'High-1440' and 'High' (HD); however, not every level can
be combined with every profile. The profiles defined by MPEG-2 Video are as
follows:
- Simple profile:
This is for low cost and low delay applications, allowing frame
sizes up to SD resolution (ITU-R Rec. BT.601) and frame rates up to 30 Hz.
Usage of B frames is not allowed.
- Main profile:
This is the most widely used MPEG-2 profile, defined for SD and HD resolution
applications in storage and transmission, without providing compatible decoding
of different resolutions. All interlaced-specific prediction modes as well
as B frames are supported.
- SNR scalable profile:
Similar to Main profile, but allows in addition SNR scalability invoking drift
at the base layer; resolutions up to SD are supported.
- Spatial scalable profile:
Allows usage of (drift free) spatial scalability, also in combination with
SNR scalability.
- High profile:
Similar to Spatial Scalable profile, but supporting a wider range of levels,
and allowing 4:2:2 chrominance sampling additionally. This profile was primarily
defined for upward/downward compatible coding between SD and HD resolutions,
allowing embedded bitstreams for up to 3 layers with a maximum of two different
spatial resolutions.
- 4:2:2 profile:
This profile extends the Main profile by allowing encoding of 4:2:2 chrominance
sampling, and allows larger picture sizes and higher data rates.
- Multiview profile:
This allows encoding of stereoscopic sequences provided in a left-right interleaved
multiplex, using the tool of temporal scalability for exploitation of left-right
redundancies. The displacement between left and right pictures is estimated
and used for disparity compensated prediction, which follows the same
principles as motion-compensated prediction. In addition, camera parameters
can be conveyed in the stream.
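The profile@level signalling described before the list of profiles can be
pictured with a small sketch of the check a decoder might perform. It rests on
simplifying assumptions that are not stated in this text: the five profiles
from Simple to High are treated as strictly nested, the four levels as strictly
ordered, and the 4:2:2 and Multiview profiles are left out. The numeric ranks
are illustrative only and are not the code values carried in the bitstream. The
conformance rules of the standard contain restrictions and exceptions that this
sketch does not model.

```python
from enum import IntEnum


class Profile(IntEnum):
    # Illustrative ranks only (assumption): a higher rank is assumed to
    # include the tools of all lower ranks.
    SIMPLE = 1
    MAIN = 2
    SNR_SCALABLE = 3
    SPATIAL_SCALABLE = 4
    HIGH = 5


class Level(IntEnum):
    # Illustrative ranks only (assumption): a higher rank is assumed to
    # cover the picture sizes and rates of all lower ranks.
    LOW = 1
    MAIN = 2
    HIGH_1440 = 3
    HIGH = 4


def can_decode(decoder, stream):
    """Return True if a decoder conforming to `decoder` (profile, level)
    is assumed able to process a stream that signals `stream`."""
    dec_profile, dec_level = decoder
    req_profile, req_level = stream
    return dec_profile >= req_profile and dec_level >= req_level


# Hypothetical decoder conforming to Main Profile@Main Level.
mp_ml = (Profile.MAIN, Level.MAIN)

print(can_decode(mp_ml, (Profile.SIMPLE, Level.MAIN)))  # True
print(can_decode(mp_ml, (Profile.MAIN, Level.HIGH)))    # False
print(can_decode(mp_ml, (Profile.HIGH, Level.MAIN)))    # False
```

Because the decision needs only the two signalled values, it can be made before
any picture data is parsed, which is the point of carrying this information in
each bitstream.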
The text of the MPEG-2 Video standard is common with ITU-T Rec. H.262.
Subsequent to the second edition of the standard text, which was published in
2000, the following corrigenda and amendments are an integral part of the
MPEG-2 Video specification:
- ITU-T Rec. H.262(2000)/Cor.1(2002) | ISO/IEC 13818-2:2000/Cor.1:2002
- ITU-T Rec. H.262(2000)/Amd.1(2000) | ISO/IEC 13818-2:2000/Amd.1:2001
- ITU-T Rec. H.262(2000)/Cor.2(200X) | ISO/IEC 13818-2:2000/Cor.2:200X (in preparation)
3 Application areas
MPEG-2 is mainly used for consumer-level video broadcast (e.g. DVB) and storage
(e.g. DVD), as well as for professional applications such as video storage in
studios.