The Moving Picture Experts Group

MPEG-V enhances virtual worlds and is a passport between them

by Philip Merrill - July 2017

The game room kids dream of having would make the imaginary real, and MPEG-V provides a way to sense what is going on in the room and to control effects devices accordingly. Movies and games can be enhanced in an immersive environment with special playback features, closer to an amusement park ride than to anything consumers have previously been able to afford, and these are bound to become more common. Whether the goal is fitness, meditation, thrills, or another form of mental escape, gimmicks that matter are coming: devices that do real things in the real world to enhance the immersive experience of being in a virtual world or a mixed reality setting.

The environment within a working MPEG-V media processor, called an adaptation engine, requires virtual world information in an interoperable format along with a system through which sensors and actuators can be managed by digital signals. The sensors, the actuators, and the user must all be represented somewhere in the multiplexed signal. On the virtual world side, proprietary systems can take an anything-goes approach, but a standard delivers value only if one size fits all, ensuring interoperability. With MPEG-V that means metadata representations of the sensors and actuators as well as a summary format for virtual characteristics. With adjustments, any proprietary virtual world should be convertible into MPEG-V's Virtual World Object Characteristics, which enable avatars and generic objects to travel between virtual worlds. Applications of MPEG-V that focus on the sensors and actuators, however, give a good look at the unusual options available for blending mixed reality with real-world devices that can either report measurements or receive commands to perform tasks, such as emitting particular odors.
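The two directions of that flow can be pictured as a pair of records passing through the engine: sensed information coming in from a device, and an actuator command going out. The sketch below is only illustrative; the class and field names are inventions for this example, not identifiers from the ISO/IEC 23005 schemas, and a real adaptation engine's processing is left unspecified by the standard.

```python
# A minimal sketch of an MPEG-V-style adaptation step.
# All class and field names are illustrative, not from the standard.
from dataclasses import dataclass

@dataclass
class SensedInformation:          # real-to-virtual direction
    sensor_id: str
    value: float                  # e.g. measured ambient light in lux

@dataclass
class ActuatorCommand:            # virtual-to-real direction
    actuator_id: str
    intensity: int                # 0-100 percent of device capability

def adapt(sensed: SensedInformation) -> ActuatorCommand:
    """Toy adaptation: dim the room lamp as ambient light rises."""
    level = max(0, min(100, int(100 - sensed.value / 10)))
    return ActuatorCommand(actuator_id="lamp-1", intensity=level)

cmd = adapt(SensedInformation(sensor_id="lux-1", value=400.0))
print(cmd)  # ActuatorCommand(actuator_id='lamp-1', intensity=60)
```

The standard's contribution is the interoperable representation of the two records, not the mapping between them, which each implementation is free to design.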

Arcade and theme park chairs have shaken for years, accompanied by sensorial effects. It is now reasonable to expect that home theater systems will soon be populated by light systems, temperature controls, wind fans, shaking, spraying, scents, mist, coloration adjustments, and more body-oriented devices permitting motion, kinesthetic, and tactile effects. Game rooms, meanwhile, are an increasingly meaningful investment for young couples who grew up gaming and enjoy playing with friends, so sensorial effects could enhance immersion in games. Movies and television broadcasts can use MPEG-V's datatypes and architecture to accompany media with enhanced information, a sort of metadata soundtrack that can be carried in an MPEG-2 Transport Stream multiplex for transmission. Once demultiplexed, parsed, and restored to usable XML, this virtual system enables control of actuators based on information received from sensors. While this is not designed to create a robot, the similarities to human nerve paths are suggestive; it is designed to create an avatar with a passport, ready to travel between virtual worlds because its characteristics are adequately represented.
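The "metadata soundtrack" idea can be sketched as a set of timed effect cues carried as XML alongside the media, then parsed back into actuator control on the receiving side. The XML shape below is a deliberate simplification for illustration, not the actual MPEG-V Sensory Effect Description Language schema.

```python
# Sketch: sensory-effect cues as XML, restored and turned into
# actuator control. Element and attribute names are simplified
# inventions, not the MPEG-V SEDL schema.
import xml.etree.ElementTree as ET

cue_xml = """
<SensoryEffects>
  <Effect type="wind" intensity="70" startTime="00:01:12"/>
  <Effect type="scent" intensity="30" startTime="00:01:15"/>
</SensoryEffects>
"""

def parse_cues(xml_text):
    """Return (type, intensity, start time) for each effect cue."""
    root = ET.fromstring(xml_text)
    return [(e.get("type"), int(e.get("intensity")), e.get("startTime"))
            for e in root.findall("Effect")]

for effect_type, intensity, start in parse_cues(cue_xml):
    print(f"at {start}: drive '{effect_type}' actuator at {intensity}%")
```

In a broadcast scenario, a fragment like this would be multiplexed into the transport stream, demultiplexed at the receiver, and handed to the adaptation engine alongside the audio and video.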

As described above, the Virtual World Object Characteristics provide a comprehensive set of metadata descriptors to embody what is needed in the virtual realm, including virtual-to-virtual exchange, and this is covered by the standard. In order to operate with the sensors, actuators, and user, this virtual side receives sensed information as metadata and sends sensorial effects metadata to the adaptation engine. MPEG-V equips this engine with the metadata descriptions it needs to make good adaptations from the real world to the virtual world, or vice versa from the virtual world to the real world. How the adaptation engine performs its processing is left unspecified. MPEG-V reference software might assist someone experimenting with adaptation engine designs in general, but crafting a simple process for a few typical tasks is straightforward and has been demonstrated. Within the environment of the adaptation engine, the metadata must be adequate to support the virtual world as well as the sensor, actuator, and user requirements. The sensors and actuators must be able to report on their capabilities, and the user's preferences regarding those capabilities must be represented. If the kids want the wind machine on high while their parents prefer a mild breeze, this has to be in the data, and the adaptation engine will make use of it, issuing appropriate commands to the relevant actuators.
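The wind-machine example amounts to clamping a requested effect against the active user's stated preference before a command is issued. A minimal sketch, with names invented for illustration rather than taken from the MPEG-V user-preference schemas:

```python
# Sketch: apply a per-user preference before commanding an actuator.
# Profile names and the percentage scale are illustrative assumptions.
def apply_preference(requested_intensity: int, user_max: int) -> int:
    """Clamp a requested effect intensity to the active user's limit."""
    return min(requested_intensity, user_max)

prefs = {"kids": 100, "parents": 40}   # max wind intensity per profile
requested = 85                          # intensity the scene asks for

for profile, limit in prefs.items():
    print(profile, "->", apply_preference(requested, limit))
# kids -> 85
# parents -> 40
```

The same scene metadata thus produces different real-world behavior depending on whose preferences are active, which is exactly the adaptation the engine exists to perform.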

Sensed information must represent whatever signal is sent by the sensor as appropriate metadata, in a format suitable for the media processor and its adaptive interactions with the virtual world. For example, a man with a job and a child could use an intelligent camera to record, analyze, and virtualize his facial expressions, having an avatar of himself in a suit and tie communicate virtually with his office, then switching avatars to a beloved cartoon image to have fun interacting in real time with his child. This exemplifies the real-to-virtual flow. An additional feature is compatibility with MPEG-U interoperable widgets and advanced interfaces.
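The point of that example is that the same sensed data drives whichever avatar is currently active. A small sketch of the idea, with all labels and animation names hypothetical rather than drawn from the standard:

```python
# Sketch of the real-to-virtual flow: a sensed facial-expression
# label is mapped to an animation on the currently active avatar.
# All names here are hypothetical, not MPEG-V identifiers.
EXPRESSION_TO_ANIMATION = {
    "smile": "avatar_smile",
    "frown": "avatar_frown",
    "neutral": "avatar_idle",
}

def update_avatar(active_avatar: str, sensed_expression: str) -> str:
    """Return the animation to play on the active avatar."""
    animation = EXPRESSION_TO_ANIMATION.get(sensed_expression, "avatar_idle")
    return f"{active_avatar}:{animation}"

# Switching avatars changes the embodiment, not the sensed input.
print(update_avatar("business-suit", "smile"))  # business-suit:avatar_smile
print(update_avatar("cartoon", "smile"))        # cartoon:avatar_smile
```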

Considerably more is possible when real-to-virtual and virtual-to-real dataflows are in use simultaneously, mediated by the adaptation engine. Experiencing a virtual room within a real room suggests simple combinations, for example adjusting the virtual scene in response to high levels of sound or light in the real world. With mixed reality, new fitness and entertainment environments are being discovered and invented, so many nearly empty spaces can be enhanced by mixed-reality virtual objects as well as actuators working in conjunction with relevant sensors. In games, real-world controllers such as steering wheels or replica firearms have been available as peripherals for years. Game characters in virtual worlds find objects with special powers, such as gems or weapons, and real-world versions of these are often sold as licensed merchandise collectibles. What will soon be possible, though, is an object more intelligent than a talking children's toy, one that exists simultaneously in the real world with one set of properties and is semi-independently embodied in Virtual World Object Characteristics. An object that changes shape when certain milestones are reached, whether in game play or at a given time code, could be actuated to alter its appearance in the real world synchronously with action occurring in a virtual scene or home theater video.
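Both flows passing through one adaptation step can be sketched as a single function: a real-world sensor reading adjusts the virtual scene, while a virtual-world milestone triggers a real actuator, such as the shape-changing collectible above. Thresholds and command names are illustrative assumptions.

```python
# Sketch of simultaneous real-to-virtual and virtual-to-real flows
# through one adaptation step. All names and the 80 dB threshold
# are illustrative, not from the standard.
def adapt_step(real_sound_db: float, milestone_reached: bool):
    """Return (virtual scene update, real actuator command or None)."""
    virtual_update = "dim_scene" if real_sound_db > 80 else "no_change"
    actuator_cmd = "morph_collectible" if milestone_reached else None
    return virtual_update, actuator_cmd

# A loud real room dims the virtual scene; a game milestone
# simultaneously actuates the real-world collectible.
print(adapt_step(real_sound_db=85.0, milestone_reached=True))
# ('dim_scene', 'morph_collectible')
```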

MPEG-V supports a rich and extensible environment for metadata and media processing, capable of being expanded as far as our imaginations and the efforts of real-world developers care to take it.