INTERNATIONAL ORGANIZATION FOR STANDARDIZATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC 1/SC 29/WG 11
CODING OF MOVING PICTURES AND AUDIO
ISO/IEC JTC 1/SC 29/WG 11 N9580
January 2008, Antalya, Turkey
Title: Introduction to Multiview Video Coding
Source: Video Subgroup
Editor: Aljoscha Smolic
Status: Approved
3D video (3DV) and free viewpoint video (FVV) are new types of visual media that expand the user’s experience beyond what is offered by 2D video. 3DV offers a 3D depth impression of the observed scenery, while FVV allows for an interactive selection of viewpoint and direction within a certain operating range. A common element of 3DV and FVV systems is the use of multiple views of the same scene that are transmitted to the user.
Multiview Video Coding (MVC, ISO/IEC 14496-10:2008 Amendment 1) is an extension of the Advanced Video Coding (AVC) standard that provides efficient coding of such multiview video. The overall structure of MVC, which defines the interfaces, is illustrated in the figure below. The encoder receives N temporally synchronized video streams and generates a single bitstream. The decoder receives the bitstream, decodes it, and outputs the N video signals.

Multiview Video Coding (MVC)
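
The interface described above can be sketched in a few lines of Python. This is only a toy illustration of the contract (N synchronized streams in, one bitstream, N streams out): it performs no compression, and the names `encode_multiview` and `decode_multiview`, as well as the tuple layout of a unit, are hypothetical and not taken from the standard.

```python
from typing import List, Tuple

# Hypothetical unit of the toy "bitstream": (view index, time index, payload).
Unit = Tuple[int, int, bytes]


def encode_multiview(views: List[List[bytes]]) -> List[Unit]:
    """Merge N temporally synchronized streams into one ordered sequence."""
    if not views:
        return []
    num_pictures = len(views[0])
    if any(len(stream) != num_pictures for stream in views):
        raise ValueError("all views must have the same number of pictures")
    bitstream: List[Unit] = []
    for t in range(num_pictures):            # one time instant after another
        for v, stream in enumerate(views):   # all views of that time instant
            bitstream.append((v, t, stream[t]))
    return bitstream


def decode_multiview(bitstream: List[Unit], num_views: int) -> List[List[bytes]]:
    """Split the single sequence back into N per-view streams."""
    views: List[List[bytes]] = [[] for _ in range(num_views)]
    for v, _t, payload in bitstream:
        views[v].append(payload)
    return views
```

Grouping all views of one time instant together is a design choice of this sketch; it simply keeps synchronized pictures adjacent in the single output sequence.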
Multiview video contains many inter-view statistical dependencies, since all cameras capture the same scene from different viewpoints. Combined temporal and inter-view prediction is therefore the key to efficient MVC. As illustrated in the figure below, a picture from a given camera can be predicted not only from temporally related pictures of the same camera but also from pictures of neighboring cameras.

Temporal/inter-view prediction structure for MVC.
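
The combined prediction idea can be made concrete with a minimal sketch that lists, for one picture on a view/time grid, the pictures it might be predicted from. The function name `candidate_references` and the neighborhood rules are assumptions made for illustration: the sketch only considers the immediately preceding and following pictures of the same camera and the adjacent cameras at the same time instant, and it ignores coding order and any other constraints a real encoder must respect.

```python
from typing import List, Tuple

Picture = Tuple[int, int]  # (view index, time index)


def candidate_references(view: int, time: int,
                         num_views: int, num_pictures: int) -> List[Picture]:
    """List pictures that could serve as prediction references for (view, time).

    Assumptions of this sketch (not taken from the standard text):
    - temporal candidates are the immediate neighbors in the same view;
    - inter-view candidates are the adjacent views at the same time instant.
    """
    if not (0 <= view < num_views and 0 <= time < num_pictures):
        raise ValueError("picture lies outside the view/time grid")
    candidates: List[Picture] = []
    # Temporal prediction: same camera, neighboring time instants.
    for t in (time - 1, time + 1):
        if 0 <= t < num_pictures:
            candidates.append((view, t))
    # Inter-view prediction: neighboring cameras, same time instant.
    for v in (view - 1, view + 1):
        if 0 <= v < num_views:
            candidates.append((v, time))
    return candidates
```

For example, with three cameras and three time instants, `candidate_references(1, 1, 3, 3)` returns two temporal candidates from camera 1 followed by two inter-view candidates from cameras 0 and 2.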