INTERNATIONAL ORGANISATION FOR STANDARDISATION
ORGANISATION INTERNATIONALE DE NORMALISATION
ISO/IEC JTC1/SC29/WG11
CODING OF MOVING PICTURES AND AUDIO
ISO/IEC JTC1/SC29/WG11/N9995
July 2008, Hannover, Germany
Source: AHG on New Challenges in Video Coding Standardization
Title: Busan Workshop on New Challenges in Video Coding Standardization – Program
Video compression has been a very active area of standardization over the last 30 years. To face the challenges that emerging applications impose on the requirements of video coding standardization, ISO/IEC WG11 (MPEG) will hold a full-day workshop on 14 October 2008, during the 86th WG11 meeting in Korea.
The key intention of the workshop is to acquire solid information about the context in which video coding will operate in the future. This will enable MPEG to draw conclusions about the needs and opportunities in video coding standardization over the coming years and to start drafting three key documents: technology context, applications, and requirements for a new High-Performance Video Coding (HVC) standard. For this purpose, speakers have been invited on key topics for the morning sessions, and in addition regular proposed contributions were accepted for the noon and afternoon sessions.
The Workshop will be held on 14 October 2008 from 9:00-18:00 at Crystal Ballroom #3, 3rd Floor, Busan Lotte Hotel, 503-15 Bujeon-Dong Pusanjin-Gu Busan, Korea 614-030.
Participants who are not regularly attending the 86th MPEG meeting should register by sending an email to Sungwook Jung ( swjung@kisi.or.kr ) with subject line "Registration for Busan video coding workshop" and including contact data in the mail body (name/title, company/affiliation, address/phone/fax/email).
Detailed Program
9:00-9:10 Welcome and Introduction (Leonardo Chiariglione)
Invited Session 1: Video Coding and Next-Generation Networks
(Chair: Jens-Rainer Ohm)
9:10-9:40 Tomonori Aoyama (Keio University):
Direction of digital media and content evolution and a new generation network to support it
9:40-10:10 Jeongyeon Lim (SK Telecom), Simon Ji (LG Electronics), Taesung Park
and Daesung Cho (Samsung Electronics), Jae-Seob Shin (Pixtree) :
Experiences and forecasts on mobile video services by manufacturers and operators
10:10-10:40 Doug Y. Suh (KHU), Won Ryu and Jeong Joo Yoo (ETRI):
MPEG-64 (MPEG over IPv6 and 4G networks)
10:40-11:00 Coffee Break
Invited Session 2: Video Coding for Future Applications and Devices
(Chair: Jörn Ostermann)
11:00-11:30 Seonki Kim (Samsung):
Advanced Technology in LCD Display –
New Driving Scheme and Advanced Super PVA Technology
11:30-12:00 Jonghwa Kim (Samsung):
Flash Memory for Packaged Media : What it can do and where it fits
Regular Session 1: Technology Context of Future Video Coding
(Chair: Ajay Luthra)
12:00-12:30 Euee S. Jang (Hanyang University):
Reconfigurable Video Coding – A Building Block for Future MPEG Coding Standards
12:30-12:50 Kim Kyunghoon, Kim Nacwoo, Kim Sangkyune, Son Seungchul
and Lee Byungtak (ETRI):
The necessity of a New MPEG Standard Supporting
Real-time Distributed IPTV Environment
12:50-14:10 Lunch Break
Regular Session 2: Compression Technology
(Chair: T.K. Tan)
14:10-14:30 Geert Van der Auwera and Yeong Taeg Kim
(Samsung Information Systems):
Triangular Sub-Macroblock Partitioning for Motion Compensated Prediction
14:30-14:50 Munchurl Kim (ICU), Changseob Park (KBS):
Beyond Macroblock based Predictive Coding
14:50-15:10 Kyohyuk Lee, Elena Alshina, Jeonghoon Park, Woojin Han
and Junghye Min (Samsung):
Technical considerations on new challenges in video coding standardization
15:10-15:30 Johannes Ballé, Steffen Kamp, Aleksandar Stojanovic, Mathias Wien
and Jens-Rainer Ohm (RWTH Aachen University):
Tools for Improving Texture and Motion Compression
15:30-16:00 Coffee Break
16:00 Open Discussion and Conclusions
18:00 End of Workshop
Abstracts
Tomonori Aoyama (Keio University): Direction of digital media and content evolution and a new generation network to support it
Jeongyeon Lim (SK Telecom), Simon Ji (LG Electronics), Taesung Park and Daesung Cho (Samsung Electronics), Jae-Seob Shin (Pixtree) : Experiences and forecasts on mobile video services by manufacturers and operators
Korea is one of the leading countries in deploying broadband wireless networks such as 3.5G HSPA and WiBro (IEEE802.16, mobile WiMAX) for mobile terminals. In this document, we describe our experiences with mobile services from the service, terminal and network perspectives, and conclude with the requirements that pose new challenges.
1. Experiences during 2002~2008
Since the first wireless video service was launched in 2002 over cdma 1x EV-DO, mobile video has grown in popularity as wireless networks have evolved.
1.1 Video telephony
As wireless networks have evolved, it has become possible to offer various multimedia services on mobile devices. Video telephony is in service on WCDMA and cdma 1x EV-DO networks. However, it is difficult to obtain appropriate QoS in a wireless network. WCDMA is generally understood to offer a data rate of 2Mbps, but the figure in the specification is the total maximum data rate available in a cell, and it can be achieved only in good radio conditions. Wireless network conditions are much worse than those of wired networks, so the average data rate is much lower than the theoretical maximum. It is also difficult to raise the bitrate per user in a cell because of the high cost of equipment. SK Telecom's video telephony service runs over WCDMA at QCIF size, with a maximum of 64kbps (average 48kbps) and 7~10 fps, using H.263. At these rates it is difficult to deliver satisfactory video quality in video telephony.
1.2 Video streaming
VoD and MoD are in service over HSPA and EV-DO in Korea. As in the WCDMA case, the specification states that HSPA can support a maximum of 14.4Mbps, but, as mentioned above, this figure differs greatly from the data rate available in a real wireless network. SK Telecom provides H.263 and MPEG-4 AVC|H.264 based VoD service at QCIF and QVGA sizes, at 190~210 kbps and fewer than about 15 fps. Here too it is difficult to deliver satisfactory video quality, especially in scenes with many fast changes.
The video streaming over WiBro currently being tested is an MPEG-4 Visual based video service with less than 128kbps of bandwidth, 8~10 fps, and QCIF video size. It usually looks fine, but video quality is poor in some cases, for example when a mobile terminal moves to the edge of a cell, enters a shadow area, or performs a handover. A cross-layer approach and adaptive bit rate control are good candidate methods for overcoming these situations, and PSNR improves when they are applied.
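As a minimal sketch of the adaptive bit rate control idea, the following Python function chooses an encoder target bitrate from a throughput measurement. The function name, the 80% headroom, the 8 kbps probing step, and the 32 kbps floor are illustrative assumptions and are not taken from the WiBro trial; only the 128 kbps ceiling comes from the text.

```python
def next_target_kbps(current_kbps, measured_kbps,
                     floor_kbps=32, ceiling_kbps=128,
                     headroom=0.8, step_up_kbps=8):
    """Return the encoder target bitrate (kbps) for the next interval.

    All parameter defaults except ceiling_kbps are assumptions made
    for illustration only.
    """
    # Leave some headroom below the measured channel throughput.
    budget = measured_kbps * headroom
    if budget < current_kbps:
        # Channel got worse (cell edge, shadow area, handover): back off at once.
        target = budget
    else:
        # Channel is fine: probe upward slowly to avoid oscillation.
        target = min(current_kbps + step_up_kbps, budget)
    # Clamp to the range the service is provisioned for.
    return max(floor_kbps, min(ceiling_kbps, target))


if __name__ == "__main__":
    rate = 128
    for throughput in (150, 80, 40, 200, 200):
        rate = next_target_kbps(rate, throughput)
        print(f"measured {throughput} kbps -> target {rate:.0f} kbps")
```

Backing off immediately and recovering gradually is one common design choice; a cross-layer variant could replace the throughput measurement with parameters reported by the MAC layer.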
1.3 QoS handling
Because of fading, congestion, and hand-over, a video codec should be capable of rate control and be loss-resilient in order to avoid continual freezing periods in video services. Loss protection in a lower layer (e.g. the MAC layer) is not sufficient on its own, since it induces uncomfortable delay in video telephony service over WiBro (IEEE802.16) and a 5~6s channel zapping delay in DMB (Digital Multimedia Broadcasting).
2. Forecast 2009~2015
2.1 Technology push
2.1.1 Evolution in wireless network
The next wireless network is expected to be an all-IP broadband network. However, wireless network conditions are expected to remain poor, and the theoretical maximum data rate of the evolved network may differ greatly from the average data rate. In a packet network, data loss and delay degrade audio quality; for video quality, refresh rate and zapping time should be considered in addition to loss and delay.
2.1.2 Evolution in mobile terminals
Current mobile handsets with a full-browsing LCD can support WVGA resolution; however, it is difficult to extend the LCD size beyond 3 inches because of portability. To overcome this size limitation, mobile handsets with a built-in projector have been deployed. Such a handset can project VGA video at a 30-inch size and play it for 2 hours. An HMD (head-mounted display) can show 60-inch video using 20~30% of the electrical power of a projector. It is therefore expected that the physical and power limitations on high-resolution video will be sufficiently overcome.
On the codec side, video decoding on a mobile handset is not a concern, because handsets mainly focus on video playback. An AVC D1 decoder is expected to become available in mobile handsets. Video encoding, however, struggles to deliver good video quality because of encoder complexity. A new codec is therefore needed that reduces encoding complexity.
3. Conclusion
3.1 High quality video services
Technology push towards high quality video services is satisfactory in terms of display technology, CPU and digital HW technology, and battery technology, while market pull seems very weak, even in Korea. Market pull depends on how accustomed people become to video services and how far the cost of mobile services comes down.
3.2 New challenges in video coding
To track the dynamics of the wireless channel, it is desirable to use up-to-date MAC parameters of the channel. Since mobile terminals are very limited in resources such as spectrum, CPU, and battery, convergence of layers is required for optimal resource handling. JSCC (Joint Source Channel Coding), as in www.ist-pheonix.org, can be a promising approach in this sense.
Since people are becoming more active in producing video than in consuming it, one of the challenges in video coding is how to reduce encoder complexity. Video codec and mobile QoS technology should address realtime video uploading services.
Acknowledgement
This abstract has been reviewed by the members of qoMVAS (quality of Mobile Video Applications and Services) Consortium.
Doug Y. Suh (KHU), Won Ryu and Jeong Joo Yoo (ETRI): MPEG-64 (MPEG over IPv6 and 4G networks)
Next generation networks will become wider in bandwidth and smarter in their QoS (Quality of Service) protocols. This contribution focuses on the evolution of network smartness. By 2015, we will see IPv6 and wireless networks beyond 3G widely deployed. They were developed under the assumption that the future network should support QoS for realtime multimedia services over all-IP networks. The current video coding standards should be modified to exploit the merits of these QoS tools.
1. Per-class QoS protocols and video coding : Per-class service is adopted in diffServ (differentiated services) of IETF, IEEE802.11e, 3GPP UMTS (Universal Mobile Telecommunications System) and so on. In per-class service, packets are classified into 3~4 different priorities. Network entities in the middle of the network handle each packet according to its priority. Priority may include delay priority and loss priority. In Korea, Korea Telecom launched diffServ in 2008 for its IPTV service over its so-called "Premium" network.
Video coding tools should be selected according to service categories such as delay-sensitive video telephony, quality-sensitive video-on-demand, and so on. The service requirements of each category are represented by a set of three parameters [bitrate, packet loss ratio, end-to-end delay]. These three parameters trade off against one another, which should be taken into consideration when selecting optimal video coding tools and parameters.
2. Per-flow QoS protocols and video coding : Per-flow service is adopted in intServ/RSVP (integrated service/ReSource reserVation Protocol) of IETF, IEEE802.16 (WiMAX in the USA or WiBro in Korea), 3G UMTS, and so on. For each flow, the terminal reserves appropriate network resources. (For example, a video streaming service may be treated as one flow of multiplexed audio and video bitstreams, or as two separate flows, one for audio and one for video.) The network resources can be represented by a set of parameters, the so-called tspec. For optimal operation within the tspec parameters, video codecs should be modified to be capable of simultaneous adaptive control of bitrate, loss-resiliency, and buffer. In 2008, the WiBro (IEEE802.16) service of Korea Telecom has been equipped with rt-PS (realtime Polling Service), which uses tspec for resource reservation.
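To make the idea of operating within a tspec concrete: in intServ/RSVP the traffic specification is built around a token bucket (a sustained rate and a bucket depth, among other parameters). The sketch below checks whether packets conform to such a bucket. The class and method names are hypothetical, only the rate and depth parameters are modelled, and nothing here is taken from the WiBro rt-PS implementation.

```python
class TokenBucket:
    """Simplified token-bucket policer (rate and depth only).

    Names are illustrative assumptions; a real tspec carries further
    parameters such as peak rate and maximum packet size.
    """

    def __init__(self, rate_bytes_per_s, depth_bytes):
        self.rate = rate_bytes_per_s
        self.depth = depth_bytes
        self.tokens = depth_bytes   # bucket starts full
        self.last_time = 0.0

    def conforms(self, now_s, packet_bytes):
        """Return True and consume tokens if the packet fits the reservation."""
        # Refill tokens for the elapsed time, never beyond the bucket depth.
        elapsed = now_s - self.last_time
        self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
        self.last_time = now_s
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        # Non-conforming: the sender should delay or shrink the packet.
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_bytes_per_s=1000, depth_bytes=1500)
    for t, size in ((0.0, 1500), (0.5, 1000), (1.0, 1000)):
        print(t, size, bucket.conforms(t, size))
```

A rate-adaptive encoder could run such a check before emitting each frame and lower its bitrate, or hold the frame in its buffer, whenever the check fails.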
3. MANE (Media Aware Network Entity) and IPv6 : JVT/SVC assumes the existence of a MANE in the middle of the network that extracts video packets according to channel condition and user preference. To identify video packets, the MANE is assumed to read the NAL header, which is merely part of the payload data. This is not a realistic assumption, since routers mostly read only the IP header. The IPv6 header contains a 20-bit flow label for identifying the packets of a given flow. Usage of the flow label is not yet well standardized, and it is recommended to map the NAL header of MPEG/JVT to the flow label. For per-class service, the NAL header may be mapped to the DSCP (Differentiated Services Code Point) of diffServ. These two kinds of mapping could be performed in the so-called Adaptation Decision Taking Engine (ADTE).
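The two mappings can be sketched as follows. The first byte of an H.264/AVC NAL unit carries forbidden_zero_bit (1 bit), nal_ref_idc (2 bits) and nal_unit_type (5 bits), and the IPv6 flow label is 20 bits wide in the current IPv6 specification. Everything else in this sketch is an assumption made for illustration: the choice of the AF4x code points, the table that maps nal_ref_idc to them, and the bit layout of the flow label are not defined by any standard or by the authors.

```python
# Hypothetical mapping from nal_ref_idc to DSCP values.
# 34, 36, 38 are the standard AF41, AF42, AF43 code points; 0 is best effort.
DSCP_BY_REF_IDC = {3: 34, 2: 36, 1: 38, 0: 0}


def parse_nal_header(first_byte):
    """Split the first NAL unit byte into its three H.264 fields."""
    forbidden_zero_bit = (first_byte >> 7) & 0x1
    nal_ref_idc = (first_byte >> 5) & 0x3
    nal_unit_type = first_byte & 0x1F
    return forbidden_zero_bit, nal_ref_idc, nal_unit_type


def dscp_for_nal(first_byte):
    """Per-class mapping: more important NAL units get lower drop precedence."""
    _, nal_ref_idc, _ = parse_nal_header(first_byte)
    return DSCP_BY_REF_IDC[nal_ref_idc]


def flow_label_for_nal(stream_id, first_byte):
    """Per-flow mapping into a 20-bit label (illustrative layout only):
    13 bits of stream id, then 2 bits nal_ref_idc, then 5 bits nal_unit_type.
    """
    _, nal_ref_idc, nal_unit_type = parse_nal_header(first_byte)
    label = ((stream_id & 0x1FFF) << 7) | (nal_ref_idc << 5) | nal_unit_type
    return label & 0xFFFFF


if __name__ == "__main__":
    for byte in (0x65, 0x41, 0x01):
        print(hex(byte), parse_nal_header(byte),
              dscp_for_nal(byte), flow_label_for_nal(1, byte))
```

With such a mapping a router could prioritise or drop video packets by reading only the IP header, which is the point made in the paragraph above; an SVC deployment would more likely derive the priority from the SVC NAL header extension than from nal_ref_idc alone.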
Conclusion
In order to exploit limited network resources more efficiently, the network QoS standards and the video coding standards need to be harmonized, logically as well as quantitatively. This task cannot be done by network specialists alone; it also requires video specialists, since they are familiar with quality assessment of video services, which will make up the major share of traffic in the future all-IP network.
Acknowledgement
This abstract has been reviewed by the members of qoMVAS (quality of Mobile Video Applications and Services) Consortium.
Seonki Kim (Samsung): Advanced Technology in LCD Display – New Driving Scheme and Advanced Super PVA Technology
Technology in LCD displays is changing very rapidly. Ultra-High Definition (UD) 120Hz-driving and Full-High Definition (FHD) 240Hz-driving LCD TV sets have already been introduced at the CES and IFA shows this year. Overcoming tough technical challenges has made this innovation possible. The presentation will give an account of the basic technologies in LCD displays. Tough technical issues for next-generation LCDs such as UD 120Hz LCDs and FHD 240Hz LCDs will be explained for audience members with further interest.
Contents:
I. Key Factors in Performance
- Performance Innovation
- Viewing Performance
- Higher Resolution & Larger Displays
- Display Resolution & Picture Quality
- Motion Blur Performance Comparison
- High Speed Driving Concept
- Toward Perfect Moving Picture Images
II. Advancements of S-PVA Technology
- Basic Principle of S-PVA Technology
- Two-Transistor S-PVA
- Limitations of Large Size UD LCDs
- 1G-2D vs. hG-2D S-PVA Cell
III. Conclusion
Jonghwa Kim (Samsung): Flash Memory for Packaged Media : What it can do and where it fits
The development of Flash technology has been phenomenal. Thanks to the rapid drop in bit cost and the wide popularity of Flash cards, Flash is now being considered as packaged media for high-capacity content. But it also has limitations compared with optical media. This presentation will discuss the pros and cons of Flash memory for next generation packaged media. Key technical aspects such as write throughput and bit cost projections of Flash memory, as well as the right-fit application scenarios and surrounding requirements, will also be discussed.
Euee S. Jang (Hanyang University): Reconfigurable Video Coding – A Building Block for Future MPEG Coding Standards
In this presentation, we highlight what has been developed through MPEG RVC standardization. RVC is a framework that allows the modular design of a codec using a decoder description and tools (FUs). Observing the trend in the development of MPEG coding standards, we anticipate that future video coding standards will be based on the ‘framing’ of existing standards: intra/inter prediction, transformation, and entropy coding. It would not be a surprise if many existing tools of the current standards were reused, modified, and/or extended. From the standard development point of view, the development of a new coding standard takes many years to complete. Core experiments to evaluate the proposed tools and to verify the effectiveness of adopted tools with an existing model (e.g., JM or VM) have been among the most critical and time-consuming processes. The RVC framework will substantially reduce the time and energy needed to test new tools and maintain the reference model, thanks to its inherently modular approach. In this presentation, we briefly describe the design process of a new codec for two cases: 1) simple modification of an existing tool and 2) replacement of an existing tool with a new tool. Throughout the study, the simplicity and clarity of the RVC framework as a basic building block of new codec design are shown.
Kim Kyunghoon, Kim Nacwoo, Kim Sangkyune, Son Seungchul and Lee Byungtak (ETRI): The necessity of a New MPEG Standard Supporting Real-time Distributed IPTV Environment
We, at ETRI, are designing and developing a distributed media platform that uses overlay routing to efficiently provide program providers, groups and individuals with MPEG2 or AVC compressed streaming based on the MPEG2-TS format.
Existing IPTV businesses are most likely to evolve from the initial service scheme, which offers media service based on a closed network and multicast resources, into an open service scheme that includes personal broadcasting.
Distributed media transmission platforms based on P2P and overlay multicast technologies are likely to remain a focus of attention, because multicast resources are restricted and the associated difficulties have persisted. A new media standard must therefore be considered that copes with these technologies (P2P and overlay multicast). The considerations include the following.
- Considerations on relative delay incurred from media relay in overlay multicast based on unicast
- Considerations on media level routing information in media oriented network
- Efficient MPEG2-TS system format for MPEG2/AVC and distributed relay environment
Serving high quality media (SD at 8Mbps, HD at 20Mbps, UD at over 100Mbps) over the Internet is a new challenge, and we expect new media technologies to overcome the performance limitations that cannot be solved by distribution systems alone.
To contribute to this, we introduce a distributed real-time media platform and describe the requirements of the next generation MPEG standard by raising issues of media bandwidth occupation, transmission delay incurred from relay, and solutions and problems related to MPEG2-TS multiplexing.
Geert Van der Auwera and Yeong Taeg Kim (Samsung Information Systems): Triangular Sub-Macroblock Partitioning for Motion Compensated Prediction
H.264/MPEG-4 Part 10 has improved on the rate-distortion efficiency of earlier video standards such as the pervasive MPEG-2 standard. One of the encoding tools at the base of this improvement is the variable block sizes tool for motion compensated prediction. With this tool inter-coded macroblocks can be partitioned into rectangular shapes of sizes 16×8, 8×16, or 8×8 pixels. In addition, the 8×8 sub-macroblock may be further partitioned into 8×4, 4×8 or 4×4 blocks. This partitioning is primarily used when an object’s edge is present in the interior of the 16×16 macroblock and the object has a motion that is substantially different from the background object. The result is the motion based segmentation of the macroblock. Partitioning may also occur when the object deforms or when its motion is not well represented by the translational motion model.
Video encoding experiments demonstrate that sub-macroblock partitioning is mainly used when the bit rate, or equivalently the video quality, is high. Therefore, we explore additional sub-macroblock partitions that may improve the rate-distortion efficiency of H.264 further for high video quality applications. Rectangular partitions are sufficient for motion segmentation when the object’s edge is approximately horizontal or vertical. When the edge is closer to diagonal, the rectangular partitioning tends to oversegment the macroblock to accommodate the diagonal edge. Therefore, we propose additional triangular sub-macroblock partitions.
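The idea of a triangular partition can be illustrated with a small sketch. It splits a square block along one of its diagonals and forms the prediction by taking each pixel from the motion-compensated candidate that belongs to its partition. The function names, the 8×8 default size, and the rule that assigns pixels on the diagonal itself to partition 0 are assumptions made for illustration; the abstract does not specify the exact partition shapes used by the authors.

```python
def triangular_mask(size=8, anti_diagonal=False):
    """Return a size x size grid of 0/1 labels splitting the block along a diagonal."""
    if anti_diagonal:
        # Split along the top-right to bottom-left diagonal.
        return [[1 if x + y >= size else 0 for x in range(size)]
                for y in range(size)]
    # Split along the top-left to bottom-right diagonal;
    # pixels on the diagonal itself go to partition 0 (an arbitrary choice).
    return [[1 if x > y else 0 for x in range(size)]
            for y in range(size)]


def predict_block(candidate_0, candidate_1, mask):
    """Combine two motion-compensated candidate blocks using the mask.

    candidate_0 / candidate_1 are the blocks fetched from the reference
    picture with the motion vector of partition 0 / partition 1.
    """
    return [[candidate_1[y][x] if mask[y][x] else candidate_0[y][x]
             for x in range(len(mask[y]))]
            for y in range(len(mask))]


if __name__ == "__main__":
    mask = triangular_mask(8)
    for row in mask:
        print("".join(str(v) for v in row))
    background = [[10] * 8 for _ in range(8)]
    moving_object = [[200] * 8 for _ in range(8)]
    prediction = predict_block(background, moving_object, mask)
    print(prediction[0])
```

A diagonal object edge is covered here by two partitions and two motion vectors, whereas a rectangular scheme would need several smaller blocks to follow the same edge.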
Encoding results obtained with standard CIF resolution sequences indicate bit rate savings of between 0.42% and 0.82% compared to standard H.264 with only rectangular partitioning. These results are promising and research is ongoing to increase the bit rate savings further. One observation is that the total number of sub-macroblock partitions is larger than for standard H.264, which results in a larger number of bits spent on representing motion information; however, far fewer bits are spent on representing the prediction error (up to 4% bit rate reduction).
High quality video encoding is important because of the availability of very high quality displays and the increasing familiarity of the public with high definition video. Algorithms such as triangular sub-macroblock partitioning can achieve higher video quality at bit rates that are presently offered by H.264.
Munchurl Kim (ICU), Changseob Park (KBS): Beyond Macroblock based Predictive Coding
Video coding standards have been based on macroblock (MB) predictive coding for more than two decades.
The fixed MB size of 16x16 has served video coding well for video sequences of small and medium sizes, up to HDTV applications. 8x8 DCT and even 4x4 DCT are used for transform coding. In this sense, current video coding technologies tend to be optimized for video of smaller sizes. However, it is known that the fixed MB size limits coding efficiency, especially for high spatial resolution video, although it has been justified as a compromise between coding efficiency and computational complexity.
Recently there has been an increasing need for Ultra High Definition TV applications, which look towards home D-cinema and realistic video content in the future. 4K cameras have come onto the market, and even 8K video cameras are expected to reach the market in the near future. Such ultra HD video can be accommodated in broadband networks and storage media, and 2-D display devices that support ultra HD video can be manufactured with current technologies.
In this regard, it is time to reconsider the necessity of enlarged-MB based predictive coding for beyond-HD video applications. In this talk, we present and discuss the performance of H.264|MPEG-4 AVC based on an extended JM reference software with enlarged-MB based predictive coding at sizes of 32x32 and 64x64. The performance will be shown in terms of PSNR values and mode selection statistics for different video sizes.
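A quick calculation shows why the fixed 16x16 size becomes a burden at high resolution: the number of macroblocks per picture, and with it the per-MB signalling overhead, grows with the picture area. The sketch below counts macroblocks for several picture sizes. The helper name and the pixel dimensions chosen for "4K" and "8K" (3840x2160 and 7680x4320) are assumptions, since the abstract does not state them.

```python
def macroblock_count(width, height, mb_size):
    """Number of mb_size x mb_size macroblocks needed to cover a picture.

    Partial blocks at the right and bottom edges count as full blocks,
    so both dimensions are rounded up.
    """
    cols = -(-width // mb_size)   # ceiling division
    rows = -(-height // mb_size)
    return cols * rows


if __name__ == "__main__":
    # Picture sizes are assumptions for illustration.
    pictures = {"HD": (1920, 1080), "4K": (3840, 2160), "8K": (7680, 4320)}
    for name, (w, h) in pictures.items():
        counts = [macroblock_count(w, h, s) for s in (16, 32, 64)]
        print(name, counts)
```

For example, a 1920x1080 picture needs 8160 macroblocks at 16x16 but only 510 at 64x64, and an 8K picture coded with 64x64 blocks has the same block count as HD coded with 16x16 blocks.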
Kyohyuk Lee, Elena Alshina, Jeonghoon Park, Woojin Han and Junghye Min (Samsung): Technical considerations on new challenges in video coding standardization
Over the last 30 years, video coding technology has achieved noticeable improvements in coding efficiency along with the releases of international video coding standards such as MPEG-1, MPEG-2 and MPEG-4 AVC. Consequently, a tremendous number of applications have been created in the media industry based on those international video coding standards: high definition TV, digital video broadcasting, digital video disc, Blu-ray Disc, video streaming, etc.
Recently, many video coding experts in MPEG/VCEG have been trying to advance video coding technology to further improve coding efficiency and to enable new applications. In this contribution, we consider recent challenges in video coding in MPEG/VCEG, with particular attention to Samsung’s investigation into future video coding.
Contents:
I. Conventional video coding standards
II. Recent challenges in video coding technology
III. Samsung’s investigation on future video coding
IV. Conclusion
Johannes Ballé, Steffen Kamp, Aleksandar Stojanovic, Mathias Wien and Jens-Rainer Ohm (RWTH Aachen University): Tools for Improving Texture and Motion Compression
This contribution reports on several new tools that are suitable for improving compression efficiency in video coding, targeting different components for which data rate savings could be valuable:
– Displacement intra prediction (DIP) and Markovian texture prediction (MTP) for improved intra coding;
– Decoder-side motion vector derivation (DMVD) based on template prediction for improved inter coding;
– Dynamic texture synthesis (DTS) for high-efficiency compression of moving textures with high amount of detail and irregular motion.
All of these methods have so far been implemented standalone (no combination) within the AVC reference software. Improved compression (e.g. for DMVD, 9% on average over a whole set of sequences and rates, and up to 30% in some cases) was found. Compression gain is achieved over the entire range of resolutions from CIF to HD, and over a wide range of data rates.
Even though the original approaches as listed above would increase the decoder complexity considerably, it has been found that low-complexity algorithms can be designed that provide similar performance.
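The principle behind template-based DMVD can be sketched in a few lines. The decoder compares an L-shaped template of already reconstructed pixels above and to the left of the current block with displaced templates in the reference picture, and takes the displacement with the lowest sum of absolute differences (SAD) as the motion vector, so that no vector needs to be transmitted. The function names, block size, template thickness, search range, and full-pel exhaustive search are all assumptions made for illustration and do not describe the authors' implementation.

```python
def template_positions(bx, by, size, thickness):
    """Pixel positions of the L-shaped template around the block at (bx, by)."""
    positions = []
    # Rows above the block, including the top-left corner area.
    for y in range(by - thickness, by):
        for x in range(bx - thickness, bx + size):
            positions.append((x, y))
    # Columns to the left of the block.
    for y in range(by, by + size):
        for x in range(bx - thickness, bx):
            positions.append((x, y))
    return positions


def derive_mv(cur, ref, bx, by, size=4, thickness=2, search=2):
    """Return (dx, dy) such that the template at (x, y) in cur best
    matches (x + dx, y + dy) in ref, using full-pel exhaustive search."""
    if bx < thickness or by < thickness:
        raise ValueError("template would fall outside the current picture")
    height, width = len(ref), len(ref[0])
    template = template_positions(bx, by, size, thickness)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose template or block leaves the reference picture.
            if not (0 <= bx - thickness + dx and bx + size + dx <= width and
                    0 <= by - thickness + dy and by + size + dy <= height):
                continue
            sad = sum(abs(cur[y][x] - ref[y + dy][x + dx]) for x, y in template)
            if best is None or sad < best[0]:
                best = (sad, dx, dy)
    if best is None:
        return (0, 0)
    return (best[1], best[2])


if __name__ == "__main__":
    # Synthetic test pattern: the current picture is the reference shifted by (2, 1).
    def pixel(x, y):
        return (7 * x + 13 * y) % 251

    ref = [[pixel(x, y) for x in range(16)] for y in range(16)]
    cur = [[pixel(x + 2, y + 1) for x in range(16)] for y in range(16)]
    print(derive_mv(cur, ref, 6, 6))
```

Because the encoder and decoder run the same search on the same reconstructed pixels, they arrive at the same vector; the cost is the extra search at the decoder, which is the complexity concern raised in the closing paragraph above.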