INTERNATIONAL ORGANIZATION FOR STANDARDIZATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC 1/SC 29/WG 11
CODING OF MOVING PICTURES AND AUDIO

ISO/IEC JTC 1/SC 29/WG 11 N7679

October 2005, Nice

Title

MPEG-1 Systems white paper – Terminal Architecture

Source

Systems

Status

Proposal

Editors

Peter Schirling

Introduction

MPEG-1 [1] and MPEG-2 [2] each mark a first of a kind in the history of digital media. This brief paper describes MPEG-1 Systems. MPEG-2 Systems is described in a separate brief.

The systems layers of these two international standards are also unique in many ways. What they have in common is that the systems layer supplies the stream identification and synchronization information that is essential to decoding and subsequently rendering the audio and video layers. The audio and video layers are bounded, while the systems layer is viewed as an unbounded element and can therefore be a very complex environment. The systems layer must carry not only the multiplexed audio and video information but also all other non-audio/video data, in many cases private data, needed for a successful and pleasing user experience.

Background: application interoperability

When MPEG-1 video and audio were being developed, it was recognized that a means was needed to render them synchronously. This problem had never been tackled before because audio and video had never existed together in the compressed digital domain. As the brief description of the MPEG Systems layer design below shows, the solution is a unique multiplex designed to deliver a clock reference and elementary streams precisely enough to enable audio-video synchronization.

MPEG-1 is intended for use in the relatively error-free environment of CDs or other optical discs.

MPEG Systems design

The design principle behind MPEG-1 Systems is the on-time, error-free delivery of a stream of bytes. Every byte takes the same time to travel from transmitter to receiver, so the interval between bytes leaving the transmitter (determined by the bit rate) is the same as the interval between their arrival at the receiver. Because this delay is constant, it is possible to embed a clock in the byte stream that the receiver can use to control its own clock reference. This clock can then be used to pace the decoding of the audio and video information, keeping them in sync. As a part of this timing model, all audio and video samples are intended to be presented exactly once unless specified otherwise.
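The timing model described above can be illustrated with a minimal sketch. It assumes a 90 kHz system clock, which is the frequency used by ISO/IEC 11172-1 but is not stated in this paper. The names byte_arrival_time and ReceiverClock are hypothetical and exist only for illustration; a real receiver would typically smooth its clock with a control loop rather than jump to each new reference.

```python
SYSTEM_CLOCK_HZ = 90_000  # assumption: MPEG-1 system clock frequency


def byte_arrival_time(clock_ref_ticks, bytes_after_ref, rate_bytes_per_s):
    """Arrival time in seconds of a byte that follows a clock reference.

    With constant-delay delivery, a byte N positions after the clock
    reference arrives N / rate seconds after the reference itself.
    """
    return clock_ref_ticks / SYSTEM_CLOCK_HZ + bytes_after_ref / rate_bytes_per_s


class ReceiverClock:
    """Hypothetical receiver clock slaved to embedded clock references."""

    def __init__(self):
        self.offset_s = None  # stream time minus local time

    def on_clock_reference(self, clock_ref_ticks, local_time_s):
        # Snap directly to the embedded reference (no smoothing in this sketch).
        self.offset_s = clock_ref_ticks / SYSTEM_CLOCK_HZ - local_time_s

    def now(self, local_time_s):
        """Current stream time, used to pace decoding and presentation."""
        if self.offset_s is None:
            raise RuntimeError("no clock reference received yet")
        return local_time_s + self.offset_s
```

Both the audio and the video decoder would consult the same ReceiverClock, which is what keeps the two in sync.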

Unlike today's Internet environment, where audio and video can be delivered separately using different sockets, the optical disc environment is defined to use a single in-band stream of bytes over which audio, video, and all other information are delivered. Further constraints arise because optical discs (CD or DVD) are constant-speed devices, while the audio or video information may be variable rate.

The MPEG-1 SYSMUX, as it is called, contains bitrate information so that the bitrate intended when the content was encoded (compressed) can be restored.
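As an illustration of how a receiver might recover this bitrate information, the following sketch reads the clock reference and multiplex rate from an MPEG-1 pack header. The bit layout (a 33-bit clock reference and a 22-bit rate field in units of 50 bytes per second, interleaved with marker bits) reflects one reading of ISO/IEC 11172-1 and should be checked against the standard before use. The function name is hypothetical, and marker bits are not validated here.

```python
PACK_START_CODE = b"\x00\x00\x01\xba"


def parse_mpeg1_pack_header(data: bytes):
    """Return (clock_reference_ticks, mux_rate_bytes_per_s) from a pack header.

    Assumed layout of the 8 bytes after the start code:
      '0010' SCR[32..30] m | SCR[29..22] | SCR[21..15] m | SCR[14..7] |
      SCR[6..0] m | m RATE[21..15] | RATE[14..7] | RATE[6..0] m
    where 'm' is a marker bit.
    """
    if len(data) < 12 or data[:4] != PACK_START_CODE:
        raise ValueError("not a pack header")
    b = data[4:12]
    if b[0] >> 4 != 0b0010:
        raise ValueError("not an MPEG-1 style pack header")

    scr = ((b[0] >> 1) & 0x07) << 30
    scr |= b[1] << 22
    scr |= (b[2] >> 1) << 15
    scr |= b[3] << 7
    scr |= b[4] >> 1

    mux_rate_units = ((b[5] & 0x7F) << 15) | (b[6] << 7) | (b[7] >> 1)
    return scr, mux_rate_units * 50  # assumed unit: 50 bytes per second


# Usage: a header carrying a rate of 3528 units, i.e. 176400 bytes/s.
header = bytes.fromhex("000001ba2100010001801b91")
scr, rate = parse_mpeg1_pack_header(header)
```

With the rate known, the receiver can reconstruct the intended arrival time of every byte that follows the header.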

The most important innovation introduced by MPEG-1 and carried over and extended into MPEG-2 Systems is the specification of a System Target Decoder (STD) model. The STD is an idealized model of a demultiplexing and decoding complex that precisely specifies the delivery time of each byte in an MPEG Systems multiplex and its distribution to the appropriate decoder or resource in the complex.  
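The idea of an idealized decoder buffer can be sketched as follows. This is a deliberately simplified toy model, not the STD as specified in the standard: it assumes a single elementary stream, delivery at a constant byte rate starting at time zero, and instantaneous removal of each access unit at its decode time. The function name and parameters are hypothetical.

```python
def simulate_decoder_buffer(rate_bytes_per_s, buffer_size, access_units):
    """Track buffer fullness for one elementary stream.

    access_units: list of (decode_time_s, size_bytes), sorted by decode time.
    Returns the fullness (bytes) observed just before each removal.
    Raises ValueError if the buffer would overflow or underflow.
    """
    total_bytes = sum(size for _, size in access_units)
    removed = 0
    levels = []
    for decode_time, size in access_units:
        # Bytes delivered so far; delivery stops once the stream is exhausted.
        arrived = min(rate_bytes_per_s * decode_time, total_bytes)
        fullness = arrived - removed
        # Fullness only grows between removals, so the peak is right here.
        if fullness > buffer_size:
            raise ValueError(f"overflow at t={decode_time}")
        if fullness < size:
            raise ValueError(f"underflow at t={decode_time}")
        levels.append(fullness)
        removed += size  # instantaneous removal by the idealized decoder
    return levels


# Usage: 1000 bytes/s into a 2000-byte buffer, two access units.
levels = simulate_decoder_buffer(1000, 2000, [(1.0, 500), (2.0, 1500)])
```

A multiplexer can run a model of this kind while building a stream to confirm that a conforming decoder never runs out of data or buffer space.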

Figure 1 - MPEG-2 System Target Decoder

MPEG-2 supports two methods of multiplexing. The MPEG-2 Program Stream (PS) is based on the needs and characteristics of optical storage devices, though it is not limited to such use, and is the backward-compatible successor to MPEG-1 SYSMUX. Its typical terminal architecture is shown in Figure 2 below. The MPEG-2 Transport Stream design was established to satisfy the needs of broadcasting over terrestrial, satellite, and cable networks; its typical terminal architecture is shown in Figure 3 below.

While MPEG specifies only the stream syntax and the demultiplexing and decoding behavior, it is essential that implementers understand how these form an integral part of a terminal or device that can render both the audio and video once they are decompressed. It is also important to understand what the input to a receiver complex consists of. Figure 2 illustrates the scope of MPEG-2 Systems as a terminal architecture. The input to a Program Stream demultiplexor is a byte stream retrieved from the media in the player, from which all non-MPEG stream information has been removed, so that the MPEG demultiplexor is presented only with a compliant MPEG-2 Program Stream. If a PACK HEADER is detected, its bitrate field is used to establish the assigned bitrate.

Figure 2 - MPEG-2 Prototypical program demultiplexing and decoding terminal

Figure 3  - Prototypical transport demultiplexing and decoding terminal

The input to the Transport Stream demultiplexor is also a byte stream, in this case derived from an RF decoder. The RF decoder applies any forward error correction if necessary, removes the error correction information, and pushes the data to the system decoder, which separates the audio, video, and systems information and sends each to the appropriate element in the complex for processing.

Likewise, the decoded (decompressed) audio and video data are sent on for rendering into pictures and sound. If the media has been authored correctly, the data retrieval produces no uncorrectable errors, and the decoding complex is operating correctly, the video and audio will play in sync with the sound and picture quality that was established when the content was compressed.

MPEG-1 SYSMUX (and MPEG-2 Program Stream) can transport only a single Program. A Program is defined as all information (audio, video, data, etc.) having a common clock reference. Each Program, however, can contain multiple audio and video elementary streams.
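A minimal sketch of separating the elementary streams of one Program is given below. It assumes that packets have already been parsed into (stream_id, payload) pairs, which is a hypothetical input format. The stream_id ranges used for audio (0xC0 to 0xDF) and video (0xE0 to 0xEF) follow the assignments in ISO/IEC 11172-1 as understood here; all other values (padding, private, reserved) are simply skipped.

```python
from collections import defaultdict


def classify_stream_id(stream_id):
    """Map a stream_id to (kind, stream_number); unknown ids map to 'other'."""
    if 0xC0 <= stream_id <= 0xDF:
        return ("audio", stream_id - 0xC0)
    if 0xE0 <= stream_id <= 0xEF:
        return ("video", stream_id - 0xE0)
    return ("other", None)


def demultiplex(packets):
    """Collect payload bytes per audio/video elementary stream.

    packets: iterable of (stream_id, payload_bytes) in multiplex order.
    Returns a dict keyed by (kind, stream_number).
    """
    streams = defaultdict(bytearray)
    for stream_id, payload in packets:
        kind, number = classify_stream_id(stream_id)
        if kind != "other":
            streams[(kind, number)] += payload
    return {key: bytes(buf) for key, buf in streams.items()}


# Usage: one video stream and two audio streams sharing one multiplex.
result = demultiplex([
    (0xE0, b"v0"), (0xC0, b"a0"), (0xC1, b"b0"), (0xE0, b"v1"), (0xBE, b"pad"),
])
```

All streams collected this way share the single clock reference of the Program, so they can be presented in sync with one another.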

Target applications

MPEG-2 - Standard-definition and high-definition television broadcasting over terrestrial, satellite, and cable networks, and optical disc, specifically DVD for movie distribution.

References

[1] ISO/IEC 11172-1 Coding of moving pictures and associated audio for digital storage media at up to about 1,5 Mbit/s: Part 1 Systems

[2] ITU-T Rec. H.222.0 | ISO/IEC 13818-1:2000 Generic coding of moving pictures and associated audio information, Part 1: Systems