The Moving Picture Experts Group

Screen Content Coding Comes Out Of The Shadows

By Philip Merrill - May 2016

The addition of Screen Content Coding (SCC) to HEVC's toolkit addresses the world of graphics, text, and animation directly, adjusting to this kind of content's unique properties that call for special treatment. It has long been a truism that compression quality is content-dependent, and for many years experts hoped to tackle these challenges. In addition to the flexibility that was designed into HEVC generally, these hopes were also kept in mind throughout its development and contributed to the ease with which SCC solutions now extend the HEVC framework.

Camera-captured content inevitably took first place in MPEG compression of moving pictures. This has enabled video to become part of the new normal on displays that otherwise exhibit screen content, for example a computer desktop or a social media app's message feed. SCC flips the script in the sense that it facilitates the inclusion of what was commonly the picture's frame or context within the video picture itself. In simple terms, instead of enabling a video box within a computer screen, HEVC-SCC enables higher quality screen content graphics to appear within the video box itself. In addition to making HEVC's toolset more complete, Screen Content Coding arrives in timely fashion to support the proliferating need for device-generated graphics to fit within our increasingly video-mediated lifestyles.

Requirements for screen content coding were formalized early in 2014 and include just over a dozen application areas, some of which had already been addressed by HEVC and all of which demand high quality. High-resolution display walls in control rooms or integrated medical information environments in surgeons' operating rooms lend a large-scale sense of drama to the task of making decisions based on rich datasets. Lives and livelihoods are often on the line as graphics and text communicate data vital for developing insights and making decisions. Automotive/navigation displays exemplify the role of screen content in an environment that combines comfort with risk and nurtures relaxed yet alert concentration. In another scenario, gaming suggests play and frivolous pursuits, but successful titles commonly break new ground in providing players with intense data loads from which to make split-second decisions. Whether collaborating in games or in conference rooms, our social experiences are increasingly mediated not just by video but also by data. Interface designers might be amused by old-fashioned analog gauges and counters that automated the operation of equipment rooms and factory floors, but the needs themselves are no joke. Emerging Internet of Things applications speak to our changing relationship with the real world, where new arrays of sensors, whether at home, at work, or placed across vast ecosystems, use lines, labels and RGB solid colors to communicate mission-critical information. In the course of a day, many users now switch between different categories of control dashboards. While visual compression techniques suited for camera-captured content can be used and have been used for screen content, all these applications were placed at a disadvantage by being forced to rely on processing techniques better suited to showing a forest and its trees than to displaying the word "tree" set against a solid-color background.

The various tool options in HEVC-SCC illustrate how the differences between screen content and the real world have been leveraged for higher quality compression. For example, computer screens use a pixel grid with colors represented by RGB values, and SCC takes advantage of both properties so that screen content can be represented with fewer bits. Regarding pixel alignment, adaptive motion vector resolution treats the pixel grid's integer positions as sufficient. Motion between frames of camera-captured content is likely to involve fractional values between pixels, but computer-generated graphics and text are generally built or rendered based on integer pixel values determining where content is supposed to go. In SCC, a slice-level flag signals when these integer positional values are sufficient, thus saving the bits that are not needed for fractions. Testing for this relative advantage occurs during encoding and imposes some overhead but can result in greater compression.
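To make the integer-resolution test concrete, here is a minimal Python sketch. It assumes motion vectors are held in quarter-sample units, and the function names are hypothetical illustrations rather than anything taken from the standard or its reference software.

```python
# Hypothetical sketch: decide whether a slice's motion vectors all land on
# the integer pixel grid. Vectors are assumed to be in quarter-sample units.

def all_integer(mvs_quarter):
    """True when every (x, y) vector is a whole number of pixels."""
    return all(x % 4 == 0 and y % 4 == 0 for x, y in mvs_quarter)


def code_slice_vectors(mvs_quarter):
    """Return (integer_flag, vectors_to_code) for one slice.

    With the flag set, vectors are expressed in whole-pixel units, so the
    two fractional bits per component never need to be spent.
    """
    if all_integer(mvs_quarter):
        return True, [(x // 4, y // 4) for x, y in mvs_quarter]
    return False, list(mvs_quarter)


screen_slice = [(8, 0), (-16, 4), (0, 0)]  # e.g. scrolling text, whole pixels
camera_slice = [(9, -3), (2, 5)]           # e.g. natural motion, fractions

print(code_slice_vectors(screen_slice))
print(code_slice_vectors(camera_slice))
```

The check itself is the encoder-side overhead mentioned above: every vector in the slice has to be examined before the flag can be set.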

As for color representation, it is not surprising that using RGB can sometimes result in saved bits. This is particularly true for residual component color values, meaning what is left over after a predicted component color sample is subtracted from a new sample to be encoded. Adaptive color transform tests the relative benefits and can then use either RGB or a conversion format, whichever is advantageous for a given picture region.
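The per-region choice can be sketched as follows. The article does not name the conversion format, so this sketch assumes the reversible YCoCg-R lifting transform as a stand-in, and it uses the sum of absolute residual values as a crude proxy for bit cost. Both are assumptions for illustration; a real encoder would rely on a proper rate-distortion measure.

```python
# Hypothetical sketch of an adaptive color transform decision.
# Assumption: YCoCg-R (a reversible lifting transform) is the alternative
# to RGB, and sum of absolute values stands in for coding cost.

def rgb_to_ycocg_r(r, g, b):
    """Forward YCoCg-R lifting steps on one integer sample triple."""
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return y, co, cg


def ycocg_r_to_rgb(y, co, cg):
    """Inverse lifting steps; exactly undoes rgb_to_ycocg_r."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


def cost(triples):
    """Crude cost proxy: total magnitude of all components."""
    return sum(abs(c) for triple in triples for c in triple)


def choose_color_space(residuals):
    """Pick whichever representation of a region's residuals looks cheaper."""
    converted = [rgb_to_ycocg_r(*res) for res in residuals]
    return "ycocg" if cost(converted) < cost(residuals) else "rgb"


gray_region = [(10, 10, 10)] * 4  # residuals correlated across R, G, B
red_region = [(10, 0, 0)] * 4     # residual energy in one component only

print(choose_color_space(gray_region))
print(choose_color_space(red_region))
```

When the three residual components move together, the conversion packs most of the energy into one component; when only one component carries a residual, staying in RGB is cheaper under this proxy.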

Palette mode introduces the option of indexed color. A logo with only three or so colors would benefit from this approach, which exemplifies the difference between simple man-made art and complex camera-captured samples. Because the index counts from zero, the number just after the last indexed color is free to serve as an escape, signaling not to use the index for that content. Redundancies abound and provide opportunities for run-length compression.
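A simplified Python sketch shows how indexed color, the escape value, and run lengths fit together. It is only an illustration under stated assumptions: the function names are hypothetical, colors are plain strings, and the actual palette syntax in SCC is considerably richer than a flat list of runs.

```python
# Hypothetical sketch of palette coding with an escape index and run lengths.
# Palette entries are numbered from zero, so len(palette) is the first unused
# number and is used here as the escape value.

def palette_encode(samples, palette):
    """Return (runs, escaped): runs of [index, length] plus raw escaped samples."""
    escape = len(palette)
    lookup = {color: i for i, color in enumerate(palette)}
    runs, escaped = [], []
    for sample in samples:
        index = lookup.get(sample, escape)
        if index == escape:
            escaped.append(sample)      # not in the palette: keep the raw value
        if runs and runs[-1][0] == index:
            runs[-1][1] += 1            # extend the current run
        else:
            runs.append([index, 1])     # start a new run
    return runs, escaped


def palette_decode(runs, escaped, palette):
    """Rebuild the sample list from runs and the escaped raw values."""
    escape = len(palette)
    raw = iter(escaped)
    out = []
    for index, length in runs:
        for _ in range(length):
            out.append(next(raw) if index == escape else palette[index])
    return out


palette = ["white", "red", "blue"]
row = ["white"] * 5 + ["red"] * 3 + ["gray"] + ["blue"] * 2

runs, escaped = palette_encode(row, palette)
print(runs, escaped)
print(palette_decode(runs, escaped, palette) == row)
```

Eleven samples collapse into four runs and one escaped value, which is the kind of redundancy a three-color logo offers in abundance.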

Intra block copy (IBC) is a third mode alongside intra and inter. IBC builds a visual dictionary of redundant blocks, a process which once again can be escaped for non-redundant content. This has been described as motion compensation within the intra picture. SCC encoding tests for and derives optimally large blocks, summing up what can be large amounts of screen real estate in compact coding units. As a new mode, it is somewhat analogous to an escape hatch within HEVC that signals the decoder to depart from processes more appropriate for camera-captured content.
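The idea of copying from earlier in the same picture can be sketched in a few lines of Python. This is a deliberately reduced illustration, not the standard's search: it assumes fixed-size, grid-aligned blocks, exact matches only, and picture dimensions that are multiples of the block size, and all names are hypothetical.

```python
# Hypothetical sketch of intra block copy on a tiny picture.
# Assumptions: fixed block size, grid-aligned exact matches only, and
# width/height that are multiples of the block size.

def ibc_encode(frame, bs):
    """Emit ("copy", (dx, dy)) for repeated blocks, ("raw", block) otherwise."""
    seen = {}  # block contents -> (y, x) of first occurrence in this picture
    ops = []
    for y in range(0, len(frame), bs):
        for x in range(0, len(frame[0]), bs):
            block = tuple(tuple(frame[y + j][x:x + bs]) for j in range(bs))
            if block in seen:
                ry, rx = seen[block]
                ops.append(("copy", (rx - x, ry - y)))  # vector to earlier block
            else:
                seen[block] = (y, x)
                ops.append(("raw", block))              # escape: send samples
    return ops


def ibc_decode(ops, height, width, bs):
    """Rebuild the picture, copying from already-decoded samples."""
    frame = [[None] * width for _ in range(height)]
    positions = [(y, x) for y in range(0, height, bs) for x in range(0, width, bs)]
    for (y, x), (kind, payload) in zip(positions, ops):
        for j in range(bs):
            for i in range(bs):
                if kind == "copy":
                    dx, dy = payload
                    frame[y + j][x + i] = frame[y + dy + j][x + dx + i]
                else:
                    frame[y + j][x + i] = payload[j][i]
    return frame


picture = [[1, 2, 1, 2],
           [3, 4, 3, 4]]
ops = ibc_encode(picture, 2)
print(ops)
print(ibc_decode(ops, 2, 4, 2) == picture)
```

The second block costs only a two-number vector instead of four samples, and the saving grows with block size, which is why deriving large matching blocks matters.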

For pictures that are either graphics and text or mixed content, SCC tools roughly double the efficiency of previous HEVC encoding with extensions. Reference software and an encoder test model tutorial have been repeatedly improved, allowing general testing and hands-on interaction.

Advertisements, cartoons, and boardroom presentations are filled with flying shapes and bright colors — graphics, text, and animations of diverse types — with the commonality that they are screen content. Optimizing the minimal representation of content like this outside of the conventions for camera-captured video could influence the creation of new content because higher quality and lower bit rate are better supported.
