INTERNATIONAL
ORGANIZATION FOR STANDARDIZATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC 1/SC 29/WG 11
CODING OF MOVING PICTURES AND AUDIO
ISO/IEC JTC 1/SC 29/WG 11 N7678
October 2005, Nice
Title: MPEG-2 Systems white paper – PS&TS
Source: Systems
Status: Proposal
Editors: Peter Schirling
MPEG-1 [1] and MPEG-2 [2] are unique points in history for digital media as each represents a first of a kind. This brief paper will describe MPEG-2 Systems. The description of MPEG-1 Systems is provided in a separate brief.
The systems layers of each of these international standards are also unique in many ways. What they have in common is that the systems layer provides stream identification and synchronization information about the audio and video layers that is essential to the decoding and subsequent rendering for each of them. The audio and video are bounded domains while the systems layer is viewed as an unbounded element and can, therefore, be a very complex environment to deal with. The systems layer is essential and required to carry not only the multiplexed audio and video information but all of the other non-audio/video information, and in many cases private data needed for a successful and pleasing user experience.
When MPEG-1 video and audio were being developed, it was recognized that rendering them synchronously required a dedicated mechanism. The same was true of the MPEG-2 codecs. The problem was solved in MPEG-1 Systems for storage media such as Video CD, and the solution was extended in MPEG-2 Systems to accommodate broadcasting of MPEG audio and video and high-capacity storage devices such as DVD. As the brief description of the MPEG Systems layer design below shows, the solution is a unique multiplex designed to deliver a clock reference precisely, together with the elementary streams, in a way that enables audio-video synchronization. The MPEG-1 and MPEG-2 codecs differ in many ways, and so do the systems layers that support the delivery of audio, video, and other information, because the target applications differ as well.
The MPEG-2 standard is directed at broadcasting of high quality images and audio over satellite, cable or terrestrial networks. Each is prone to errors caused by different factors associated with the delivery environments. The MPEG Systems Committee had to solve these problems and ensure that there is complete interoperability amongst the various environments as well as backward compatibility with MPEG-1 Systems.
The design principle behind both MPEG-1 and MPEG-2 Systems is the on-time, error-free delivery of a stream of bytes. The time interval between successive bytes leaving the transmitter (set by the bit rate) is the same as the interval between their arrival at the receiver, so every byte experiences the same delay. Because this delay is constant, an encoder clock reference can be embedded in the byte stream and used by the receiver to control a clock in the demultiplexor/decoder complex. This clock paces the decoding of the audio and video information, keeping them in sync. As a part of this timing model, all audio and video samples are intended to be presented exactly once unless specified otherwise.
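The constant-delay principle can be sketched in a few lines. This is an illustration only, not the clock recovery procedure of the standard: the function names and sample values are hypothetical, and the 27 MHz system clock rate comes from H.222.0 rather than from this paper. Each sample pairs a clock reference carried in the stream with the receiver's own clock reading at the moment that reference arrived.

```python
SYSTEM_CLOCK_HZ = 27_000_000  # MPEG-2 system clock rate per H.222.0; not stated in this paper


def recover_clock_offset(samples):
    """Return (offset, spread) in clock ticks.

    samples: list of (clock_reference, local_arrival) pairs, both in ticks.
    offset is the first observed difference; spread is max minus min of all
    differences and is zero when the delivery delay is perfectly constant.
    """
    offsets = [local - ref for ref, local in samples]
    return offsets[0], max(offsets) - min(offsets)


def to_encoder_time(local_ticks, offset):
    """Map a receiver clock reading back onto the encoder timeline."""
    return local_ticks - offset


# Hypothetical numbers: references every 100 ms, 0.5 s delivery delay, and a
# receiver clock that happened to read 1000 when the encoder clock read 0.
DELAY = SYSTEM_CLOCK_HZ // 2
refs = [0, 2_700_000, 5_400_000]
steady = [(r, r + DELAY + 1000) for r in refs]
# Same stream, but the last reference arrives 1 ms late (27 000 ticks).
jittered = steady[:2] + [(refs[2], refs[2] + DELAY + 1000 + 27_000)]

print(recover_clock_offset(steady))    # spread is 0: delay was constant
print(recover_clock_offset(jittered))  # spread is 27000: delay varied by 1 ms
```

With a constant delay, every sample yields the same offset, so one subtraction is enough to follow the encoder clock. Any spread in the offsets is delivery jitter, which a real receiver would have to filter out before steering its clock.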
Unlike today's Internet environment, where audio and video can be delivered separately over different sockets, both the optical disc and broadcast environments are defined to use a single in-band stream of bytes over which audio, video, and all other information are delivered. Further constraints arise because optical discs (CD or DVD) are constant-speed devices, while audio or video information may be variable rate.
Therefore, the MPEG-2 Program Stream contains bitrate information so that the bitrate intended when the content was encoded can be restored, and it is backward compatible with the MPEG-1 SYSMUX. The MPEG-2 Transport Stream, on the other hand, does not contain these values because the bitrate of the stream is established at the demultiplexor, so extracting a value is unnecessary. Other differences between MPEG-1 and MPEG-2 are highlighted below.
The most important innovation introduced by MPEG-1 and carried over and extended into MPEG-2 Systems is the specification of a System Target Decoder (STD) model. The STD is an idealized model of a demultiplexing and decoding complex that precisely specifies the delivery time of each byte in an MPEG Systems multiplex and its distribution to the appropriate decoder or resource in the complex.

Figure 1 - MPEG-2 System Target Decoder
The STD is a buffer model and is normative. Implementers therefore use it to verify that their implementation of the normative elements of the standard functions correctly. The syntax of stream decoding is the other element that must be verified for correctness against the text of the standard. Clock recovery and buffer management are critical to the proper operation of a demultiplexing and decoding complex. Improper clock recovery manifests itself in noticeable audio and video faults such as sound glitches, picture artifacts, or frozen frames. The causes of flawed or failed clock recovery are many and varied, and all result in improper pacing of audio and video decoding. The net effect is that audio and video decoding runs too fast or too slow, and both synchronization and quality are affected because the buffers that hold the compressed data stream either empty (pacing too fast) or overflow (pacing too slow).
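The effect of mispaced decoding on a buffer can be shown with a toy simulation. It is a sketch under simplifying assumptions and not the normative STD: data arrives and is removed at fixed rates per time step, and the function name, rates, and buffer size are all hypothetical.

```python
def simulate_buffer(arrival_rate, drain_rate, capacity, start_fill, steps):
    """Step a decoder input buffer and report the first fault, if any.

    arrival_rate: bytes delivered per step (fixed by the multiplex)
    drain_rate:   bytes the decoder removes per step (set by its recovered clock)
    Returns "underflow", "overflow", or "ok".
    """
    fill = start_fill
    for _ in range(steps):
        fill += arrival_rate - drain_rate
        if fill < 0:
            return "underflow"   # decoder paced too fast: buffer ran empty
        if fill > capacity:
            return "overflow"    # decoder paced too slow: buffer overran
    return "ok"


# Hypothetical 10 000-byte buffer, half full, fed 1000 bytes per step.
print(simulate_buffer(1000, 1000, 10_000, 5_000, 1000))  # ok
print(simulate_buffer(1000, 1010, 10_000, 5_000, 1000))  # underflow
print(simulate_buffer(1000, 990, 10_000, 5_000, 1000))   # overflow
```

Even a 1% pacing error in either direction exhausts the margin after about 500 steps in this example, which is why an accurately recovered clock matters more than a large buffer.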
Other differences between MPEG-1 Systems and MPEG-2 Systems, such as the use of nextbits versus flags to indicate the presence of a field, came about in part because their development was separated in time. MPEG-1 was developed in 1992, while MPEG-2 Systems had the benefit of MPEG-1's experience and was completed in late 1994. MPEG-1 Systems and the MPEG-2 Program Stream (PS) are based on the needs and characteristics of optical storage devices. The record or packet lengths are long to minimize the event-handling requirements in optical drives. The MPEG-2 Program Stream uses a PACK containing one or more PACKETS. Each PACKET contains either audio or video elementary stream information. Figure 2 below illustrates the structure of the MPEG-2 PS PACK.

Figure 2 - MPEG-2 Pack/PES Structure
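As a rough illustration of the pack and packet layering, the sketch below scans a byte buffer for start codes and labels each one. The start code values (prefix 0x000001, pack 0xBA, system header 0xBB, audio 0xC0 to 0xDF, video 0xE0 to 0xEF) come from the Systems standard rather than from this paper, and the function names are hypothetical. The scan is deliberately naive: a real demultiplexor follows the packet length fields instead of searching, because payload bytes can contain the same prefix.

```python
START_CODE_PREFIX = b"\x00\x00\x01"


def classify(code):
    """Label the byte that follows a start code prefix."""
    if code == 0xBA:
        return "pack"
    if code == 0xBB:
        return "system_header"
    if 0xC0 <= code <= 0xDF:
        return "audio"
    if 0xE0 <= code <= 0xEF:
        return "video"
    return "other"


def scan_start_codes(data):
    """Return a list of (byte_offset, label) for every start code found."""
    found = []
    pos = data.find(START_CODE_PREFIX)
    while pos != -1 and pos + 3 < len(data):
        found.append((pos, classify(data[pos + 3])))
        pos = data.find(START_CODE_PREFIX, pos + 3)
    return found


# Synthetic buffer: one pack start code followed by a video and an audio
# packet start code, with 0xFF filler standing in for headers and payload.
sample = (
    b"\x00\x00\x01\xba" + b"\xff" * 10
    + b"\x00\x00\x01\xe0" + b"\xff" * 4
    + b"\x00\x00\x01\xc0" + b"\xff" * 2
)
print(scan_start_codes(sample))
# [(0, 'pack'), (14, 'video'), (22, 'audio')]
```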
MPEG-2 Transport Stream uses short packets (188 bytes) because broadcast environments are highly prone to errors and to the loss of one or more packets. Such errors can often be concealed or corrected by techniques designed into the receiver. The loss of a long packet, such as those used on optical disc, is not easily concealed. However, CDs and DVDs include error correction methods of their own, designed specifically for such devices.

Figure 3 - MPEG-2 Transport Structure
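One way a receiver can notice a lost Transport Stream packet is by inspecting the fixed 4-byte packet header. The sketch below is illustrative only. The header layout (sync byte 0x47, 13-bit PID, 4-bit continuity counter) is taken from H.222.0 rather than from this paper, the function names are hypothetical, and the gap check assumes that every packet carries payload and that no packet is sent twice.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def parse_ts_header(packet):
    """Decode selected fields of the 4-byte Transport Stream packet header."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("transport packets are exactly 188 bytes")
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync byte not found")
    return {
        "transport_error": bool(packet[1] & 0x80),
        "payload_unit_start": bool(packet[1] & 0x40),
        "pid": ((packet[1] & 0x1F) << 8) | packet[2],
        "continuity_counter": packet[3] & 0x0F,
    }


def find_gaps(packets):
    """Return (pid, expected_counter, received_counter) for each detected gap."""
    last = {}
    gaps = []
    for packet in packets:
        header = parse_ts_header(packet)
        pid, counter = header["pid"], header["continuity_counter"]
        if pid in last:
            expected = (last[pid] + 1) % 16  # counter wraps after 15
            if counter != expected:
                gaps.append((pid, expected, counter))
        last[pid] = counter
    return gaps


def make_packet(pid, counter):
    """Build a payload-only test packet (hypothetical helper, zero payload)."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10 | (counter & 0x0F)])
    return header + bytes(TS_PACKET_SIZE - 4)


# Packet with counter 2 on PID 0x100 is missing from this sequence.
stream = [make_packet(0x100, 0), make_packet(0x100, 1), make_packet(0x100, 3)]
print(find_gaps(stream))  # [(256, 2, 3)]
```

Because each packet is short, a detected gap costs at most 184 bytes of payload per lost packet, which is what makes concealment in the receiver practical.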
Also, broadcast applications need to be able to transmit more than one program on a single RF carrier. The MPEG-2 Transport Stream can carry multiple Programs in a stream, while the MPEG-1 SYSMUX and the MPEG-2 Program Stream can only transport a single Program. A Program is defined as all information (audio, video, data, etc.) having a common clock reference. Each Program, however, can carry multiple audio and video elementary streams.
Target applications
MPEG-2 - Standard Definition and High Definition television broadcasting over terrestrial, satellite, and cable networks, and optical disc, specifically DVD for movie distribution.
[1] ISO/IEC 11172-1 Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s, Part 1: Systems
[2] ITU-T Rec. H.222.0 | ISO/IEC 13818-1:2000 Generic coding of moving pictures and associated audio information, Part 1: Systems