The Moving Picture Experts Group

MPEG Media Transport Moves Streams More Flexibly Than Ever

By Philip Merrill - April 2017

Unlike MPEG-2 TS or the ISO Base Media File Format (BMFF), MPEG Media Transport (MMT) was designed to support the emerging need for highly dynamic signal transport. Ad insertion provides the conventional simple example, since advertising is a conspicuous industry that spends to support personalized insertions. In augmented reality, on the other hand, a user's position and direction data must update regularly as the user moves around, and a server is tasked with providing fresh and appropriate metadata relevant to real-time activities that are coordinated jointly between the server and the user.

An interesting perspective on MMT's new possibilities is provided by a look at efforts to update MPEG-2 TS and BMFF. For example, in 2015 after the 112th meeting in Warsaw, the announcement of the update to BMFF was able to boast of 15 years "in continuous development." In 2013 after the 105th meeting in Vienna, MPEG-2 TS was updated to support HEVC video. Efforts such as these are ongoing, so when the consensus of experts and stakeholders is that TS and BMFF are becoming outmoded for certain desirable approaches, it is time to develop anew, and MMT is one such result.

The encapsulation process in MMT accepts coded media without restricting or specifying how it is encoded. Combined with additional signalling metadata, the media is prepared for delivery. This includes forward error correction for transport over challenging network environments. The way TS and BMFF support multiplex/demultiplex processing would be inefficient for new uses, but MMT does employ BMFF's hierarchical structure to identify MMT's new data. For example, MMT metadata supports sequences of assigned IDs so that packets can be delivered reliably and organized in the order in which they are expected to be consumed. As for dynamic insertion of fresh data, MMT's workflow of operations situates demultiplexing so that it will not slow things down, helping media Assets reach users with the routine rapidity that is now required.
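The idea of sequence IDs that let a receiver restore consumption order can be illustrated with a minimal sketch. It is not the MMT packet format: the `Packet` class, its `seq` field, and the `in_consumption_order` function are hypothetical names chosen for illustration, and the sketch assumes consecutive integer IDs with no losses.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator


@dataclass(frozen=True)
class Packet:
    seq: int        # hypothetical sequence ID assigned by the sender
    payload: bytes  # opaque media data; the codec is not inspected


def in_consumption_order(packets: Iterable[Packet],
                         first_seq: int = 0) -> Iterator[Packet]:
    """Yield packets in sequence order, buffering any that arrive early."""
    pending: Dict[int, Packet] = {}
    next_seq = first_seq
    for packet in packets:
        if packet.seq < next_seq:
            continue  # already released; treat as a late duplicate
        pending[packet.seq] = packet
        # Release every packet that is now contiguous with the output.
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1


# Packets arrive out of order over the network.
arrived = [Packet(1, b"B"), Packet(0, b"A"), Packet(3, b"D"), Packet(2, b"C")]
ordered = b"".join(p.payload for p in in_consumption_order(arrived))
```

Here `ordered` holds the payloads in the order the sender intended, regardless of arrival order. A real receiver would also need a policy for packets that never arrive, which this sketch leaves out.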

While the ongoing development of TS and BMFF continues to maximize whatever compatibility is possible, MMT is set up to support both older one-way use cases like broadcast and newer two-way methods of media provision, usually over the internet. MMT delivers fundamental screen Composition Information (CI) by combining a new divLocation element for temporal metadata with the HTML5 div element, which supports the spatial configuration of media, that is, how the media is to be displayed at a fixed point in time. Signalling messages carry whatever one-way or two-way metadata is required, and Application Layer Forward Error Correction (AL-FEC) is available for challenging transmission environments. CI, signalling metadata, and AL-FEC are all available to support one-way broadcast and server-push use models while providing improved support for the range of two-way interactions currently being innovated.
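To give a feel for what application-layer forward error correction buys in a lossy environment, here is a minimal sketch using a single XOR parity packet per block. This is a deliberately simple stand-in and not the AL-FEC scheme specified for MMT; the function names are hypothetical, and the sketch assumes all payloads in a block have equal length.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length payloads."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(block: List[bytes]) -> bytes:
    """Build one repair packet as the XOR of all source packets."""
    return reduce(xor_bytes, block)


def recover(received: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild a block in which at most one packet (None) was lost."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("one parity packet can repair at most one loss")
    repaired = list(received)
    if missing:
        present = [p for p in received if p is not None]
        # XOR of the parity with every surviving packet is the lost one.
        repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


source = [b"abcd", b"efgh", b"ijkl"]
parity = make_parity(source)
damaged = [source[0], None, source[2]]   # second packet lost in transit
restored = recover(damaged, parity)
```

The receiver repairs the loss without asking the sender to retransmit, which is the property that matters for one-way broadcast. Production schemes use stronger codes that tolerate multiple losses per block.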

On the known side of two-way media consumption, DASH supports interoperable communication of relevant status updates from the device on which media is received and displayed. Media segment formats in MMT are common with DASH, making it easier to regulate the pushed data based on the regular updates the server receives from the consuming device. This helps keep packets addressed to the device from carrying inappropriate amounts of data. HD image granularity might overpower an internet connection during congestion, while real-time network-delivery descriptions alert the server as soon as a higher-quality picture can be delivered. This application area is common now because of mobile video streaming, but similar transfers will support innovative media mash-ups in future two-way formats yet to be developed. A Common Media Application Format is under development by MPEG to further build out MMT's ability to provide vital metadata quickly in consumption environments that are both real-world and real-time.
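The feedback loop described above, in which the server scales what it pushes according to the device's reports, can be sketched as a simple selection rule. This is an illustrative assumption rather than anything defined by MMT or DASH: the `pick_bitrate` function, the `safety` margin, and the bitrate ladder are all hypothetical.

```python
from typing import Sequence


def pick_bitrate(available_kbps: Sequence[int],
                 measured_kbps: float,
                 safety: float = 0.8) -> int:
    """Choose the highest encoding that fits the reported throughput.

    `safety` keeps some headroom so a brief dip does not stall playback.
    If nothing fits, fall back to the lowest available encoding.
    """
    budget = measured_kbps * safety
    fitting = [rate for rate in available_kbps if rate <= budget]
    return max(fitting) if fitting else min(available_kbps)


ladder = [400, 1200, 3000, 6000]        # hypothetical encodings, in kbit/s

congested = pick_bitrate(ladder, 2000)  # budget 1600, so 1200 is chosen
recovered = pick_bitrate(ladder, 10000) # budget 8000, so 6000 is chosen
starved = pick_bitrate(ladder, 300)     # nothing fits, so fall back to 400
```

Each new report from the device simply triggers another call, so quality drops during congestion and rises again as soon as the connection allows.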

Scripture describes the potentially distressing series of use cases that change as one begins at a mountaintop and then descends and returns home. Unlike in biblical times, we now deal with owners of mobile phones, cars and networked homes who would like their movies and other interactions to continue without gaps in transmission or processing glitches on their different types of devices. In rugged or vast environments, broadcast has been used for server-push information, with a minimum of data returned to the server by the device. If someone gets in their car and wants to listen to a music playlist or have a movie on, the transmission format should match what works best on the device, while the car conceivably goes around and around a big mountain with changing reception patterns. Measured by better and worse data reception, such a drive might be a bumpy ride. Upon arriving home, our imaginary consumer goes from the car to a main living area with a kitchen, bathroom and home theater. At present, room-to-room reconfigurations of data provisioning that follow, track and respond to a resident's movements remain slightly futuristic. MMT ensures that internet packets, metadata descriptions, and interactions between such a user and server can remain on a stable technological basis. Users remain in control of what to do while driving, sitting or eating, and meanwhile music, video and other data interactions can continue smoothly, if not seamlessly, throughout, without the technology being to blame for distress.