The Moving Picture Experts Group

Advanced User Interaction Interface


MPEG Advanced User Interaction


MPEG doc#: N11967
Date: March 2011
Authors: Seong Yong Lim, Jihun Cha, Electronics and Telecommunications Research Institute (ETRI)

MPEG-U Advanced User Interaction Interface (AUI) Providing Tools for Interacting via Advanced UI devices

1 Motivation

User interface technology has evolved rapidly in recent years. Although the dominant user interaction devices are still the mouse and keyboard for PCs and the remote controller for TVs, there is growing evidence that advanced sensing technologies are being adopted. For instance, major game device manufacturers have released motion-based game titles that interact with the player via a motion sensor. Meanwhile, MPEG has developed various scene description technologies such as Binary Format for Scenes (BIFS) and Lightweight Application Scene Representation (LASeR). These technologies can be widely used by industries and applications on fixed and mobile devices to represent a scene composed of video, audio, 2D graphics objects, and animations. However, current MPEG standards mostly focus on basic interaction devices such as pointing and keying devices. Consequently, the need for common data formats for the Advanced User Interaction (AUI) interfaces mentioned above has been highlighted, both to improve capabilities and to enable new deployments of interactive rich media services.

2 Advanced User Interaction interface (AUI)

This part of ISO/IEC 23007 specifies advanced user interaction interfaces to support various advanced user interaction devices. The AUI interface is part of the bridge between scene descriptions and system resources. A scene description is a self-contained living entity composed of video, audio, 2D graphics objects, and animations. Through the AUI interfaces, or other existing interfaces such as DOM events, a scene description accesses the system resources of interest in order to interact with users. In general, a scene is composed by a third party and deployed remotely.

Advanced user interaction devices such as motion sensors and multi-touch interfaces generate physical sensed information from the user’s environment. Through a recognition process, a set of physical information can be converted into a pattern with semantics, which is more useful to a scene description. For instance, feature points drawn by a user’s finger can be recognized as a circle, specified by its center position and a radius value. Therefore, this part provides a set of data formats that define geometric patterns, symbolic patterns, touch patterns, posture patterns, and their composite patterns.

  1. Geometric patterns are a set of geometric shapes recognized from sensed geometric information, expressed as 2D or 3D Cartesian positions. The current version of the standard defines the following geometric patterns: Point, Line, Rect, Arc, and Circle.
  2. Simple, well-known gestures can help people communicate without speaking or writing words. For instance, the V sign and the Rock sign, which have well-known common semantics, are already used in various circumstances. Therefore, the standard provides a container format for symbolic patterns and a classification scheme to list well-known symbolic shapes.
  3. Many applications adopt well-known touch patterns for users of touch-based interaction devices. This part provides a container format for well-known touch patterns and a classification scheme to list basic touch patterns. These touch patterns can be captured not only via touch-based interaction devices but also via other intelligent devices.
  4. This part also describes hand posture patterns to support intuitive hand-based interaction with a scene description. For example, when a user wants to control an object in a scene description, a hand posture the user would make in real life, such as a grab, a fist, or an open palm, is a good candidate gesture to support such an interaction modality.
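
As an illustration of the recognition step described in this section, in which feature points drawn by a finger are converted into a circle specified by a center position and a radius, the following Python sketch fits a circle to sampled 2D points. It is not taken from ISO/IEC 23007: the CirclePattern class, its field names, and the fit_circle function are hypothetical names chosen for this example, and the algebraic least-squares fit shown here is only one of many possible recognition methods.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass
class CirclePattern:
    """Hypothetical container for a recognized circle (not the normative format)."""
    center_x: float
    center_y: float
    radius: float


def fit_circle(points: List[Tuple[float, float]]) -> CirclePattern:
    """Fit a circle to 2D feature points with an algebraic least-squares fit.

    Assumption: the points are (x, y) Cartesian positions that roughly
    follow a circle or an arc of one.
    """
    n = len(points)
    if n < 3:
        raise ValueError("at least three points are needed")

    # Work in coordinates centred on the mean of the points.
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    u = [x - mean_x for x, _ in points]
    v = [y - mean_y for _, y in points]

    # Second- and third-order moments of the centred points.
    suu = sum(a * a for a in u)
    svv = sum(b * b for b in v)
    suv = sum(a * b for a, b in zip(u, v))
    suuu = sum(a ** 3 for a in u)
    svvv = sum(b ** 3 for b in v)
    suvv = sum(a * b * b for a, b in zip(u, v))
    svuu = sum(b * a * a for a, b in zip(u, v))

    # Solve the 2x2 linear system for the centre (uc, vc) by Cramer's rule.
    det = suu * svv - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("points are collinear or coincident")
    rhs1 = (suuu + suvv) / 2.0
    rhs2 = (svvv + svuu) / 2.0
    uc = (rhs1 * svv - rhs2 * suv) / det
    vc = (suu * rhs2 - suv * rhs1) / det

    radius = sqrt(uc * uc + vc * vc + (suu + svv) / n)
    return CirclePattern(uc + mean_x, vc + mean_y, radius)
```

A scene description would then receive only the compact result (center and radius) rather than the raw feature points, which is the kind of semantic pattern this section describes. A real recognizer would also need to decide whether the points form a circle at all, for example by checking the fit residual; that step is omitted here.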