INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11

CODING OF MOVING PICTURES AND AUDIO

ISO/IEC JTC1/SC29/WG11 M7508
Poznan, Poland, July 2005

Title:          White paper on ISO/IEC 14496-18 "Font compression and streaming"
Editor:        Vladimir Levantovsky
Status:       Proposal

Introduction

The multimedia encoding and presentation technology specified in the MPEG-4 (ISO/IEC 14496) suite of standards describes the means to create an interactive audio-visual scene in terms of coded audio-visual objects and associated scene description information [1, 3]. The encoded content is presented to a terminal as a collection of elementary streams, each of which is decoded by its stream-specific decoder. The audio-visual objects are composed according to the scene description information and presented by the terminal’s presentation device(s).

The scene description stream identifies the different types of objects (audio, visual, 2D and 3D graphics, etc.) that define the scene composition of the content. Among these objects, text objects, which are created using specific fonts, are an essential part of almost any multimedia presentation. Font selection determines the appearance of text in multimedia content and is the most critical factor in assuring content layout, legibility and readability. It also plays a critical role in the overall scene composition, since the metric properties of a font are used to lay out the textual parts of multimedia content.
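
The dependence of layout on font metrics can be illustrated with a minimal sketch. It is not part of ISO/IEC 14496-18; the advance widths, the units-per-em value and the helper name text_width are hypothetical values chosen only for illustration. The sketch shows that the same string occupies a different width when the terminal substitutes a font whose metrics differ from those of the authoring font.

```python
# Illustrative only: all numbers below are hypothetical and are not taken
# from any real font or from the standard.
UNITS_PER_EM = 1000  # assumed size of the font design grid

# Assumed per-glyph advance widths, in font design units.
AUTHOR_FONT = {"H": 722, "e": 444, "l": 278, "o": 500}    # proportional design
FALLBACK_FONT = {"H": 600, "e": 600, "l": 600, "o": 600}  # monospaced substitute


def text_width(text, advances, font_size):
    """Return the width of `text`, in the same units as `font_size`."""
    total_units = sum(advances[ch] for ch in text)
    return total_units * font_size / UNITS_PER_EM


author_width = text_width("Hello", AUTHOR_FONT, 20)
fallback_width = text_width("Hello", FALLBACK_FONT, 20)
print(author_width, fallback_width)
```

With these assumed metrics the substituted font makes the line noticeably wider, which is enough to change line breaks and the position of neighbouring objects in the scene.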

Many thousands of fonts are available today for content creation; advertisements and commercial presentations are often created with custom-designed fonts that may not be available on a remote terminal. Multimedia presentations can also be created in many different languages, using character sets that may not be supported by resident device fonts. To ensure faithful appearance and layout of content, the font data have to be included (embedded) with the text objects as part of the multimedia presentation.
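
The character-set problem described above can be sketched as a simple coverage check. This is an illustration under stated assumptions, not the mechanism defined by the standard: the helper missing_characters and the resident font's coverage set are hypothetical.

```python
def missing_characters(text, resident_coverage):
    """Return the sorted characters of `text` that are not in `resident_coverage`.

    `resident_coverage` is a set of Unicode code points that the terminal's
    resident font is assumed to support. Whitespace is ignored.
    """
    return sorted(
        {ch for ch in text if not ch.isspace() and ord(ch) not in resident_coverage}
    )


# Hypothetical resident font that covers only Basic Latin (U+0020 to U+007E).
BASIC_LATIN = set(range(0x20, 0x7F))

needed = missing_characters("Poznań 2005", BASIC_LATIN)
needs_embedding = bool(needed)  # True when the resident font cannot render the text
print(needed, needs_embedding)
```

In this example the resident font lacks a single character, so the text could only be displayed faithfully if font data covering it accompanied the presentation.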

Description of technology

MPEG-4 part 18 "Font compression and streaming" defines and provides the following technologies:

Fonts define the presentation and appearance of text. Whether it is a commercial presentation, a game, a movie or a newscast, fonts create the particular mood and feel desired for the given multimedia content. The technology defined by the standard allows content creators to design their presentations using whichever fonts best suit their purposes (Fig. 1). Whether or not the selected fonts are available in the MPEG-4 terminal, the necessary font data will be delivered to the terminal to guarantee that the presentation is rendered and displayed faithfully, according to the original design and intent of the content creator.

The selection of the industry standard OpenType and TrueType font formats provides additional advantages for both content authoring and OEM:

Target Applications

Font streaming facilitates the development and deployment of a diverse set of applications, including but not limited to:

References

[1]   ISO/IEC 14496-11, Coding of audio-visual objects, Part 11: Scene description and Application engine.

[2]   ISO/IEC 14496-18, Coding of audio-visual objects, Part 18: Font compression and streaming.

[3]   ISO/IEC 14496-20, Coding of audio-visual objects, Part 20: Lightweight Scene Representation (LASeR).

[4]   OpenType specification (http://www.microsoft.com/typography/otspec/default.htm)

[i] OpenType is a registered trademark of Microsoft Corporation.

[ii] TrueType is a trademark of Apple Computer Incorporated.

[iii] MicroType is a registered trademark of Monotype Imaging Inc.

[iv] Courtesy of Alcatel, Monotype Imaging and Streamezzo.