MPEG aims at standardizing coding solutions for the digital representation of immersive audio and visual content. The objective is to support immersive applications - virtual and augmented reality - with the highest level of perceived audio/visual realism and with the visual comfort provided by the parallax cues present in natural vision.
Since any digital capture will subsample the light field in which we are immersed, various capture technologies (omnidirectional and plenoptic cameras, camera arrays, etc.) as well as display technologies (head-mounted devices, light field displays, integral photography displays, etc.) open up a wide range of coding technologies to be explored. Some will be selected for further standardization.
Representation of immersive audio for 6DoF will be based on audio objects, channel beds and Higher Order Ambisonics (HOA) signals, all placed at positions in the VR space relative to the user. The MPEG-I Immersive Audio technology will specify the coded audio and metadata for transmission or storage, as well as the decoding and rendering for presentation to the user via headphones or, possibly, loudspeakers. Rendering supports full 6DoF user movement as well as audio elements responsive to user interaction. The standard is expected to be ready in 2022.
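As a rough illustration of this scene model - and not the MPEG-I Immersive Audio bitstream or renderer API, which is still under development - the following Python sketch places a hypothetical audio object in the VR space and derives its gain and arrival direction for a freely moving (6DoF) listener; all names, and the simple inverse-distance law, are illustrative assumptions.

```python
# Conceptual sketch only (not the MPEG-I Immersive Audio specification):
# scene elements sit at fixed positions in the VR space, and the renderer
# re-evaluates them against the listener's current 6DoF pose.
import math
from dataclasses import dataclass

@dataclass
class AudioObject:
    name: str
    position: tuple          # (x, y, z) in metres, world coordinates

@dataclass
class ListenerPose:
    position: tuple          # (x, y, z) translation: the "+3DoF" of 6DoF
    yaw: float               # head rotation about the vertical axis, radians

def render_gain_and_azimuth(obj: AudioObject, listener: ListenerPose):
    """Distance-based gain (inverse-distance law, an assumption here) and
    azimuth of the object as heard from the listener's current pose."""
    dx = obj.position[0] - listener.position[0]
    dy = obj.position[1] - listener.position[1]
    dz = obj.position[2] - listener.position[2]
    distance = max(math.sqrt(dx * dx + dy * dy + dz * dz), 0.1)
    gain = 1.0 / distance                        # simple inverse-distance law
    azimuth = math.atan2(dy, dx) - listener.yaw  # direction relative to head
    return gain, azimuth

# As the user walks past a source, gain and azimuth change accordingly.
bird = AudioObject("bird", (2.0, 0.0, 1.5))
print(render_gain_and_azimuth(bird, ListenerPose((0.0, 0.0, 0.0), 0.0)))
print(render_gain_and_azimuth(bird, ListenerPose((2.0, 2.0, 0.0), math.pi / 2)))
```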
MPEG has already standardized 360° video - also called omnidirectional video - in OMAF version 1, and is in the process of standardizing 3DoF+ extensions in OMAF version 2, due by mid-2020, which bring the parallax cues of natural vision, albeit within a limited range of viewer motion. Full parallax for dynamic objects is supported in the first version of Point Cloud Compression, ready by early 2020, while coding of 6DoF virtual reality over large navigation volumes is being studied further for standardization by 2023.
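To make the 3DoF / 3DoF+ / 6DoF distinction concrete, here is a hypothetical Python sketch (not taken from OMAF or any MPEG specification) of the viewer pose each level responds to: 3DoF tracks head rotation only, 6DoF adds free translation, and 3DoF+ can be thought of as 6DoF restricted to a small viewing zone.

```python
# Illustrative sketch (assumed names, not from any MPEG specification) of
# the degrees of freedom a renderer can respond to at each level.
from dataclasses import dataclass

@dataclass
class Pose3DoF:
    # 360-degree (omnidirectional) video: head rotation only.
    yaw: float
    pitch: float
    roll: float

@dataclass
class Pose6DoF(Pose3DoF):
    # 6DoF adds free translation; 3DoF+ uses the same pose but with
    # (x, y, z) confined to a small viewing zone around a seat position.
    x: float
    y: float
    z: float

def within_3dofplus_zone(p: Pose6DoF, radius: float = 0.5) -> bool:
    """3DoF+ content provides correct parallax only while the head stays
    inside a limited volume (here, an assumed sphere of `radius` metres)."""
    return (p.x ** 2 + p.y ** 2 + p.z ** 2) ** 0.5 <= radius

print(within_3dofplus_zone(Pose6DoF(0.0, 0.0, 0.0, 0.2, 0.1, 0.0)))  # True
print(within_3dofplus_zone(Pose6DoF(0.0, 0.0, 0.0, 3.0, 0.0, 0.0)))  # False
```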
This workshop will cover the MPEG-I Immersive Audio and Visual activities - past, present and future - inviting participants to present demos and future requirements to the MPEG community.
The workshop will be open to the public; registration is required.
Title: Coding technologies to be standardized for immersive audio/visual experiences
Date: 10 July 2019
Address: MPEG meeting venue
Clarion Post Hotel
Drottningtorget 10
411 03 Gothenburg, Sweden
| Time | Session |
| --- | --- |
| 13:00–13:15 | Introduction (Lu Yu, Zhejiang University) |
| 13:15–13:50 | Overview of technologies for immersive visual experiences: capture, processing, compression, standardization and display (Marek Domanski, Poznan University of Technology) |
| 13:50–14:25 | Use cases and challenges for immersive user experiences (Valerie Allie, Technicolor) |
| 14:25–14:50 | Demos: NHK: integral photography display (waiting for formal approval); Technicolor: real-time interactive demo with 3DoF+ content; Tsinghua University: Plenoptic 2.0 video camera; Poznan University of Technology: low-cost virtual navigation in 6DoF |
| 14:50–15:00 | Coffee break |
| 15:00–15:30 | 6DoF Immersive Audio (Schuyler Quackenbush, Audio Research Labs) |
| 15:30–16:05 | 360° and 3DoF+ video (Bart Kroon, Philips) |
| 16:05–16:40 | Point cloud compression (Marius Preda, Telecom SudParis, CNRS Samovar) |
| 16:40–17:20 | How can we achieve 6DoF video compression? (Joel Jung, Orange) |
| 17:20–18:00 | How can we achieve lenslet video compression? (Xin Jin, Tsinghua University) |
| 18:00–18:30 | Discussion |