PARIS, France
Palais des Arts et des Congrès d'Issy
25 Avenue Victor CRESSON
92130 ISSY les MOULINEAUX
AE (PACI) Site Coordinator: Mr. Patrick Foucteau
13:00 - 18:00, Saturday, October 28th, 2000
Presentations: MOLIERE ROOM
Exhibition: FOYER DEBUSSY
Chair: Neil Day, Digital Garage, Inc., Japan, neil@garage.co.jp
Co-Chair: Eric Rehm, Singingfish.com, USA
http://www.cselt.it/mpeg/events/mpeg-7ae/
The XML Journal Issues sponsored by SYS-CON Media, Inc
http://www.sys-con.com/xml/
1:00-1:05  Welcome
Multimedia
1:15-1:25  00M7M002: MPEG-7 Visual Annotation Tool
1:25-1:45  00M7V008: Hierarchical Summary Browser
           00M7V009: Table of Contents (ToC) Browser
           00M7V010: SmartEye
1:45-2:00  00M7M004: Internet Streaming Media Metadata Interchange using MPEG-7
2:00-2:15  00M7M005: The MPEG-7 Experimental Model (XM)
2:15-2:20  Additional Q/A
Audio
2:20-2:35  00M7A001: Spoken Content
2:35-2:50  00M7A002: CUIDADO
2:50-3:05  00M7A003: Music Retrieval by Melodic Query
3:05-3:10  Additional Q/A
3:10-3:25  Break
Visual
3:25-3:40  00M7V001: Search Engine Tool
3:40-3:55  00M7V002: Video Editing DS
3:55-4:10  00M7V003: MPEG-7 Video Browser and Highlight Generation Tool
4:10-4:25  00M7V004: Video-over-IP (VIP)
4:25-4:40  00M7V005: Edge histogram descriptor
4:40-4:55  00M7V006: Video Annotation and Summaries
4:55-5:10  00M7V007: MPEG-7 Video Object Segmentation and Retrieval
5:10-5:25  00M7M003: Wireless Images Retrieval using Speech Dialogue Agent
5:25-5:40  00M7V011: MPEG-7 Camera
5:40-5:45  Additional Q/A
Panel Session
5:45-5:50  00MP7IFG001: MPEG-7 in the 21st Century - Short Introduction
5:50-6:15  Discussion
6:15-      Thanks and Close
Session Chair: Mr. John Smith, IBM, USA.
The MPEG-7 Conceptual Model provides a
model of the audio-visual domain at the
conceptual level, which is independent of the design and implementation
of the MPEG-7 Description Schemes and Descriptors. The MPEG-7 Conceptual Model
defines each principal concept in words and employs modeling
constructs of entities, relationships, and attributes in modeling the
audio-visual content description concepts.
In this talk, we describe the role of the MPEG-7 Conceptual Model in
creating the MPEG-7 Standard and examine its
potential use in tools for creating MPEG-7 Descriptions.
The MPEG-7 Visual Annotation tool
enables users to interactively create MPEG-7
descriptions using MPEG-7 Description Schemes and Descriptors. The tool takes as
input an MPEG-7 Schema definition file and an MPEG-7 package description file.
The MPEG-7 Schema defines the structure of
the MPEG-7 description components using the MPEG-7 Description Definition
Language (DDL). The Package
description organizes the MPEG-7 description components in order to improve the
ease of navigation in the MPEG-7 Visual
Annotation Tool. The tool provides
utilities for drag-and-drop
copying and re-using of description elements and allows the output
of the descriptions in XML to files. The initial implementation centers
on manual entry of description data; in future work we plan
to explore the integration of automatic and semi-automatic feature extraction
methods with the goal of providing a complete system for MPEG-7 multimedia
content annotation and query building.
Contact: John Smith
Email: jrsmith@watson.ibm.com
Manager, Pervasive Media Management
IBM T. J. Watson Research Center, 30 Saw Mill River Road, Hawthorne, NY 10532
(914) 784-7320
The agent in the client terminal recognizes the user's utterances in English or Japanese, restricted to a set of dedicated sentences, and sends a query profile to the server over a wireless transceiver channel (32 kbps). The server retrieves the requested images and delivers the compressed video bitstream (H.263) to the client. The client agent then replies with a synthesized voice and displays the images. At present the metadata is kept in its original format, but the MPEG-7 format will be used in the near future for all the clients, servers, and channels.
Contact: Mikio Sasaki
Email: msasaki@rlab.denso.co.jp
Research Laboratories, DENSO CORPORATION
500-1 Minamiyama, Komenoki-cho, Nisshin-shi, Aichi-ken, 470-0111 Japan
Singingfish.com uses MPEG-7 description schemes to model the Internet streaming media metadata. This presentation describes our use of MPEG-7 description schemes to define a schema for the XML interchange of Internet streaming media metadata with several of our commercial content partners.
The goal of such metadata interchange is to populate our search index with the
highest quality and most semantically rich metadata possible, ultimately
yielding superior relevance to the end user.
The presentation includes a short demonstration of the fidelity of a transformation
from MSNBC's "Partner XML Format" to an MPEG-7 XML description.
Contact: Eric Rehm
Email: rehm@singingfish.com
Singingfish.com / Thomson Multimedia, Seattle, WA, USA.
00M7M005: The MPEG-7 Experimental Model (XM)
This presentation covers:
1. The basic structure of the MPEG-7 XM Software
2. A Graphical User Interface for the MPEG-7 XM Software
3. Key Applications for the MPEG-7 XM
a) Search and retrieval
b) Transcoding
4. Combining visual low-level descriptors in a search application
Affiliation: Munich University of Technology, Munich, Germany
Session Chair: Mr. Vincent Puig, IRCAM, France
"At
the awareness event we will present the Spoken Content description scheme, along with a basic Web application to illustrate the concept and its applications."
Canon Research Centre Europe (CRE), along with our collaborators at IBM (Almaden), has proposed the MPEG-7 Spoken Content description scheme. Searching and indexing audio-visual data using the speech in the sound track is perhaps one of the most natural forms of metadata retrieval, and our metadata format is specifically designed to store the (sometimes erroneous) output of a speech recognition system in a manner suited to robust retrieval. We are researching a wide range of potential applications of such data using textual and/or verbal querying.
Email: wilsonc@cre.canon.co.uk
Email: philg@cre.canon.co.uk
Canon Research Centre Europe Ltd
Content-based retrieval of Music and Audio samples
Information overload, inability to quickly browse through audio, poor added value to music via Internet distribution, keyword dictatorship, inability to search for similarities among sounds: these are the music consumer complaints addressed by IRCAM’s CUIDADO project. It aims at developing content-based technologies that use and contribute to the MPEG-7 standard. Building reusable modules for audio feature extraction, music indexing, database management, networking and constraint-based navigation, CUIDADO targets two pilot applications:
1) The Music Browser features musical paths and automatic compilations according to the user’s tastes, search for music similarities, and learning systems based on user profiles. One version is tied to Web music monitoring and another to Web music sales and customized radios.
2) The Sound Palette involves musicians and studios in developing an authoring tool, both online and in an existing professional audio environment, that takes full advantage of the extracted audio features for innovative retrieval, editing and processing.
CUIDADO is expected to bring Studio Online to a mature stage based on the MPEG-7 standard. A high impact on music providers and labels involved in Web distribution is expected. Assuming that the value of music in itself is currently decreasing, this application should provide evidence that new services and interfaces for accessing music and sounds may bring more value than the music itself in the future context of Electronic Music Distribution (EMD). This project should also raise the awareness of copyright societies and music labels of their role in using new content-based tools for music promotion and music protection.
Contact: Vincent Puig (Managing Director)
Email: Vincent.Puig@ircam.fr
IRCAM, 1 place Igor Stravinsky, 75004 Paris.
00M7A003: Music Retrieval by Melodic Query
Identifying a musical work from a
melodic fragment is a task that most people are able to accomplish with relative
ease. For some time now researchers have worked to give computers this ability
as well, which has come to be known as the "query-by-humming" problem.
To accomplish this, it is reasonable to study how humans are able to perform
this task, and to assess what features we use to determine melodic similarity.
Research has shown that melodic contour is an important feature in determining
melodic similarity, but it is also clear that rhythmic information is important
as well. The system to be demonstrated uses our proposed MPEG-7 description
scheme for melody, which incorporates melodic contour and rhythmic information
as the primary representation for music search and retrieval.
Additional front-end processing (to
process queries), a medium-sized database of music, and a search engine (for
finding appropriate matches) have also been implemented to complete the full
query-by-humming system.
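As a rough illustration of contour-based matching, the sketch below reduces a pitch sequence to an up/down/same contour and ranks database entries by edit distance to the query contour. This is not the presenters' description scheme or search engine: the function names, the three-symbol contour alphabet, and the use of Levenshtein distance are all assumptions made for this example, and rhythmic information is ignored for brevity.

```python
def melodic_contour(pitches):
    """Map a pitch sequence (e.g. MIDI note numbers) to a contour string.

    Each interval becomes U (up), D (down) or S (same). This three-symbol
    alphabet is an assumption of the sketch, not the proposed MPEG-7 format.
    """
    steps = []
    for prev, cur in zip(pitches, pitches[1:]):
        if cur > prev:
            steps.append("U")
        elif cur < prev:
            steps.append("D")
        else:
            steps.append("S")
    return "".join(steps)


def edit_distance(a, b):
    """Levenshtein distance between two contour strings."""
    prev_row = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        row = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            row.append(min(prev_row[j] + 1, row[j - 1] + 1, prev_row[j - 1] + cost))
        prev_row = row
    return prev_row[-1]


def best_match(query_pitches, database):
    """Return the title in `database` whose contour is closest to the query.

    `database` is a hypothetical dict mapping titles to pitch sequences.
    """
    query = melodic_contour(query_pitches)
    return min(database, key=lambda title: edit_distance(query, melodic_contour(database[title])))
```

Because only the direction of each interval is kept, a hummed query in a different key from the stored melody still produces the same contour.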
Contact: Youngmoo Kim
Email: moo@media.mit.edu
Machine Listening Group, MIT Media Lab., Boston, USA
http://sound.media.mit.edu/~moo
VISUAL
Session Chair: Munchurl Kim, Ph.D., Electronics and Telecommunications Research Institute, Korea.
At the upcoming awareness meeting, an application will be presented that allows visualization of similarity-based retrieval results. This so-called Search Engine was used in the Core Experiments on visual descriptors. A graphical user interface provides a number of functions, e.g.
- Browsing of image databases
- Visualization of 3D data and image sequences
- Similarity search for a number of visual descriptors
The SearchEngine is a Java-based application that incorporates the underlying functionality of C- or C++-based extraction and similarity matching algorithms. For sequence playback, an MPEG player is included. A small 3D viewer was also added using Java3D technology. To obtain comparable results within the MPEG-7 Core Experiments for visual descriptors, a console application called MPEG-7 XM was used by the participants. This XM software is also integrated into the SearchEngine. The GUI analyzes the following basic image features for similarity-based retrieval:
- Texture
- Color
- Contour/Shape
- 3D geometry, by analyzing a number of 2D projections of a 3D object
- Different motion in sequences (e.g. background motion from left to right)
Contact: Karsten Müller
Email: kmueller@hhi.de
Heinrich Hertz Institute
Einsteinufer 37, 10587 Berlin, Germany
Tel: +49 30 31002 225, Fax: +49 30 392 7200
00M7V002: Video Editing DS
To highlight the basic elements of the Video Editing DS, two applications have been developed to edit and browse the description of the video temporal structure specified in the MPEG-7 format. This temporal structure describes various types of temporal units: shots, rushes and composition segments. The way these units are edited is also described in terms of transition or composition effects.
The browser offers navigation functions for quickly accessing specific parts
of a video document according to the way it has been built.
The editor allows the completion of a
partial description of a video structure that could have been provided by a
video-to-shot segmentation algorithm.
Contact: Rosa Ruiloba, Philippe Joly
Email: rosa.ruiloba@lip6.fr, Philippe.Joly@lip6.fr
Indexation Multimedia
Laboratoire d'informatique de Paris 6 - LIP-6/UPMC
Bureau C1219, tel: (33).(0)1.44.27.88.48
8, rue du Capitaine Scott, 75015 Paris
As in the case of abstracts describing
papers in the classical sense, a video summary is an ‘audiovisual’ abstract
of a video program, which allows for quick understanding of the underlying story
of the program. We can capture the whole story by glancing over the summary. The
structure of the summary description is hierarchical so that coarse-to-fine
navigation is possible in order to access more detailed information (contents).
Furthermore the MPEG-7 summary structure allows for an event-based summary with
which customized browsing and filtering is possible on the summary.
1. Video Summary Generator
A video summary generator creates video
summaries of highlights automatically and/or semi-automatically, using low level
audiovisual features and high level semantics, assisted by content analysis and
highlight detection rules, respectively. It outputs description data that
contain a set of highlights, composed of video summaries, that are derived from
the MPEG-7 Summarization DS (Description Scheme). The generated short video
highlight summaries can be used with an electronic program guide (EPG) or with a
video-browsing tool in personal storage devices. The Video Summary Generator
also generates a CC (closed-caption) text DB, which consists of keywords
extracted from CC text, using text analysis and time codes to indicate
‘keyword-synchronized’ video locations obtained by speech recognition in the
audio track, in order to support text-based retrieval of news video clips.
2. MPEG-7 Video Browser
The generated summary description data is fed into an MPEG-7 video browser. The MPEG-7 video browser allows for a quick overview using audiovisual highlights with different time durations, efficient browsing through non-linear navigation (based on multi-level hierarchical highlights and associated key-frames), and a ‘highlights-view’ and browser based on particular events. It also provides CC-text-based retrieval of news video clips. The video browser can be used as a video-browsing/-retrieval tool in personal storage devices in digital broadcasting and Internet environments.
Contact: Munchurl Kim
Email: mckim@etri.re.kr
Participants: Munchurl Kim, Hyun Sung Chang
Affiliation: Electronics and Telecommunications Research Institute
Country: Korea
Full streaming over the internet of
both content and MPEG-7 metadata.
The Video-over-IP project (VIP) is an integration project carried out in the
Netherlands. Various partners are involved, including the Telematica Instituut, NOB,
SurfNet, IBM, and TNO. In general, the purpose of the VIP project is to allow
for the production, storage, management, retrieval, and exploration of video
content for a specific set of users. Moreover, these services should be
interoperable on the Internet. The following general activities should be
possible:
· The production of digitised video material (media objects), ready for distribution over the Internet
· The production of content (video material plus metadata), including the management of this production process
· Digitising video and other material in various formats
· Extending the video material with additional descriptions (metadata) for disclosure, either (semi-) automatically, or manually. In order to search in the content, parts of the video should be properly described.
· Indexing and retrieval of content
· End users should be able to search in the content
· Search, retrieval, and browsing facilities, including a user interface
· Security against improper use (encryption and watermarking)
· Distribution of high-quality video to the end user over the IP network
· The realisation of a network architecture needed for offering these services with a high quality of service
· Charging the end users on the basis of the delivered content and services (content-based billing & accounting).
Contact: Erik Oltmans
Email: oltmans@telin.nl
Telematica Instituut (www.telin.nl), The Netherlands
Short Description: The edge histogram descriptor represents the local edge distribution on 4*4 sub-images. Five types of edges, namely four directional edges and one non-directional edge, are defined for each sub-image. So, there are a total of 16*5=80 histogram bins.
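The 16*5=80 bin layout can be sketched as follows. This is a simplified illustration, not the normative extraction procedure: the 2x2 filter coefficients, the threshold value, and the choice to classify raw 2x2 pixel blocks (instead of averaged image blocks) are assumptions of this example, and `edge_histogram` is a hypothetical name.

```python
import math

# 2x2 edge filters applied to a block (a b / c d), flattened as [a, b, c, d].
# Coefficients and threshold are illustrative assumptions for this sketch.
FILTERS = [
    ("vertical",        [1, -1, 1, -1]),
    ("horizontal",      [1, 1, -1, -1]),
    ("diagonal_45",     [math.sqrt(2), 0, 0, -math.sqrt(2)]),
    ("diagonal_135",    [0, math.sqrt(2), -math.sqrt(2), 0]),
    ("non_directional", [2, -2, -2, 2]),
]


def edge_histogram(image, threshold=10.0):
    """Return 80 bin counts: 16 sub-images x 5 edge types.

    `image` is a list of rows of grayscale values. Bin index is
    (sub_image_index * 5 + edge_type_index).
    """
    height, width = len(image), len(image[0])
    bins = [0] * (16 * len(FILTERS))
    for y in range(0, height - 1, 2):
        for x in range(0, width - 1, 2):
            block = [image[y][x], image[y][x + 1], image[y + 1][x], image[y + 1][x + 1]]
            strengths = [abs(sum(c * v for c, v in zip(coeffs, block))) for _, coeffs in FILTERS]
            strongest = max(range(len(FILTERS)), key=lambda k: strengths[k])
            if strengths[strongest] < threshold:
                continue  # no edge in this block
            # Locate the block in the 4x4 grid of sub-images.
            sub_row = min(4 * y // height, 3)
            sub_col = min(4 * x // width, 3)
            bins[(sub_row * 4 + sub_col) * len(FILTERS) + strongest] += 1
    return bins
```

Two images can then be compared by, for example, summing the absolute differences of their 80 bins after normalizing each sub-image's counts.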
Function (in one sentence): Image to image matching, especially for natural images with non-uniform edge distribution.
Benefit for Applications: Since the descriptor is based on the edge information in the image, it is good for natural image matching. Since edges play an important role for image perception, it can retrieve images with similar semantic meaning.
Potential Users:
- Image search (retrieval) by example or by sketch
- Scene change detection
- Key frame clustering
Contact: Soo-Jun Park
Email: psj@etri.re.kr
Senior Member of Engineering Staff
ETRI-CSTL
161 Kajong-dong, Yoosung, Taejon,
305-350, Korea
URL: http://sir.etri.re.kr/~soop
(phone) +82-42-860-6899, (fax) +82-42-860-4889
1. Video annotation editor
The system can automatically generate
video transcripts using speech recognition and make a correspondence between
video scenes and words. The system can also detect scene change boundaries. The
user of this system can modify automatically-generated transcripts and scene
boundaries. The user can also annotate some keywords and comments on objects in
video frames. The system generates XML-formatted annotation data that contains
all information created through user interaction.
2. Video player with summarization function
The system can generate summaries of video clips from annotation data and play them. The user can input any keyword to customize the video summaries. The player can also show transcript text synchronized with the video, like closed captions. The user can also select any scene from the scene index window.
Contact: Katashi Nagao, HASIDA Koiti
Email: KNAGAO@jp.ibm.com
IBM Tokyo Research Laboratory
Email: hasida@etl.go.jp
Director of Information Science Division,
Electrotechnical Laboratory (ETL), Ibaraki, Japan.
We will present a video object segmentation system, AMOS, and a video retrieval and visualization application.
Currently, fully automatic segmentation
of semantic objects is only successful in constrained visual domains. The AMOS
system takes a powerful approach in which automatic segmentation is
integrated with user input to track semantic objects in video sequences. For
general video sources, the system allows users to define an approximate object
boundary by using a tracing interface. Given the approximate object boundary,
the system automatically refines the boundary and tracks the movement of the
object in subsequent frames of the video. The system is robust enough to handle
many real world situations that are hard to model in existing approaches,
including complex objects, fast and intermittent motion, complicated
backgrounds, multiple moving objects, and partial occlusion. For each video
sequence, the description generated by this system is a set of semantic objects
with the associated regions and visual features that can be manually annotated
with text. Text annotations can also be assigned to the video sequence.
The video retrieval and visualization
application developed during a Core Experiment within MPEG-7 uses the
descriptions generated by AMOS to retrieve and visualize videos based on the
annotations and visual features. This application supports (1) query by example
based on any combination of visual features and text annotations (e.g., retrieve
video sequences with similar objects based on color and texture); (2) query by
keyword based on text annotations (e.g., retrieve video sequences with
“elephant”); and (3) advanced visualization of the retrieved results based
on panoramic views and segmented objects.
Contact: Ana Belen Benitez
Email: ana@ee.columbia.edu
Electrical Engineering Department
Columbia University, 1312 Mudd, #F6, 500 W. 120th St, MC 4712, New York, NY 10027
Voice: +1 212 854-7473, Fax: +1 212 932-9421
URL: http://www.ee.columbia.edu/~ana/
00M7V008: Hierarchical Summary Browser
Category: Application of the Summary DS.
Features:
Summary Theme based Audio-Visual Summary Selection
Presentation Time based Audio-Visual Summary Selection
Abstract: The Hierarchical Summary Browser is based on the Summary DS, which is in the category of navigation and access. The functionality of the proposed hierarchical summary browser includes dynamic audio-visual summary generation following the user’s selection of the summary theme and summary length in time. By allowing users to select a preferred summary length, the hierarchical level of the provided summary can be automatically selected so that the length of the summary is closest to the user’s request. By allowing users to select a preferred theme, audio-visual summaries of various lengths with the selected theme can be dynamically generated, so that the user can select the length. Combined selection of theme and length is also available. Such a hierarchical summary browser can also be used in accordance with the user preference, so that the preferred theme and length are selected automatically.
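The "closest length" selection described above can be sketched in a few lines. The record layout (dictionaries with `theme`, `level` and `duration` keys) and the function name `select_summary` are hypothetical choices for this example and are not taken from the Summary DS.

```python
def select_summary(summaries, requested_seconds, theme=None):
    """Pick the summary whose duration is closest to the requested length.

    `summaries` is a list of dicts with "theme", "level" and "duration"
    (seconds) keys; this layout is an assumption of the sketch. If `theme`
    is given, only summaries with that theme are considered.
    Returns None when no summary matches the theme.
    """
    candidates = [s for s in summaries if theme is None or s["theme"] == theme]
    if not candidates:
        return None
    return min(candidates, key=lambda s: abs(s["duration"] - requested_seconds))
```

Passing both `requested_seconds` and `theme` corresponds to the combined selection; a stored user preference could supply the two arguments instead of an interactive choice.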
Category: Application of the Segment DS and Graph DS.
Features:
ToC based Audio-Visual Content Navigation
Abstract/Detail* relation based Navigation
Cause/Effect** relation based Navigation
Abstract: The ToC browser is based on the Segment DS and the Graph DS. The ToC
browser interface provides a tree-structured view of the selected content so
that a user can select a segment of interest within the content. Each segment is
represented by a representative key frame, and the selected segment is
summarized by a list of key frames. Based on the abstract/detail and
cause/effect relationships defined using the Graph DS, a user can select
segments in the abstract/detail/cause/effect relation. The abstract/detail
relation provides two segments one of which is an abstract version of the other
and the latter is a detailed version of the former segment. The cause/effect
relation provides two events one of which causes the other and the latter is the
result of the former event.
*Abstract/detail are proposed normative types of relations.
**The effect relation is equivalent to the result relation, which is a proposed normative relation type, and the cause relation can be considered as the inverse relation of the result.
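Relation-based navigation of this kind can be sketched with a plain list of triples. The triple layout `(source, type, target)`, the reading that the target of a "result" triple is the result of its source, and the function name `related_segments` are assumptions made for this example; they are not the Graph DS syntax.

```python
def related_segments(relations, segment, relation_type):
    """Return the segments linked to `segment` by `relation_type`.

    `relations` is a list of (source, type, target) triples. Only "result"
    triples are stored for cause/effect; a "cause" lookup is answered by
    reading those triples in the inverse direction.
    """
    if relation_type == "cause":
        return [src for src, rel, dst in relations if rel == "result" and dst == segment]
    return [dst for src, rel, dst in relations if rel == relation_type and src == segment]
```

Storing one direction and deriving the other keeps the description free of redundant, possibly inconsistent, inverse entries.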
00M7V010: SmartEye
Image Retrieval System with Relevance-Feedback based Image Characterization
Category: Application of the MatchingHint DS.
Features:
Image retrieval using multiple descriptors with different weights.
Automatic learning of MatchingHints from user feedback
Abstract: Generally, relevance feedback has been used only to refine the query conditions in image retrieval. In our application, however, the use of relevance feedback is extended to categorizing the image database, so that it also benefits user-independent image retrieval. In our approach, to guarantee satisfactory performance for the user, descriptors and the elements of the descriptors corresponding to features of each image are weighted using the relevance feedback. We use the MatchingHint DS to weight descriptors and the elements of each descriptor, based on color and texture descriptors. In addition, our system uses a learning method based on a reliability scheme that prevents wrong learning from wrong feedback.
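A minimal sketch of descriptor weighting driven by relevance feedback is shown below. It is not the SmartEye algorithm: the L1 distance, the update rule, the `rate` parameter and both function names are assumptions chosen for illustration, and the reliability scheme mentioned above is not modelled.

```python
def weighted_distance(query, candidate, weights):
    """Combine per-descriptor L1 distances using per-descriptor weights.

    `query` and `candidate` map descriptor names (e.g. "color", "texture")
    to equal-length lists of numbers; `weights` maps the same names to floats.
    """
    total = 0.0
    for name, weight in weights.items():
        distance = sum(abs(q - c) for q, c in zip(query[name], candidate[name]))
        total += weight * distance
    return total


def update_weights(weights, query, relevant, rate=0.5):
    """Shift weight toward descriptors on which a relevant image is close.

    `relevant` is an image the user marked as relevant to `query`. The update
    rule and `rate` are illustrative assumptions; weights that sum to 1 on
    input still sum to 1 on output.
    """
    closeness = {}
    for name in weights:
        distance = sum(abs(q - r) for q, r in zip(query[name], relevant[name]))
        closeness[name] = 1.0 / (1.0 + distance)
    norm = sum(closeness.values())
    return {
        name: (1 - rate) * weights[name] + rate * closeness[name] / norm
        for name in weights
    }
```

After one round of feedback, descriptors that agree with the user's judgement contribute more to `weighted_distance`, so later queries rank similar images higher.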
Contacts: Heon Jun Kim, Ph.D.
Email: hjk@lge.co.kr
Senior MTS
Also: Kyoungro Yoon, Jin-Soo Lee, Jung-Min Song
MI Group, Information Technology Lab.
LG Corporate Institute of Technology, 16 Woomyeon-Dong, Seocho-Gu, Seoul, Korea 137-724
TEL: +82 526 4132, FAX: +82 526 4852
00M7V011: MPEG-7 Camera
In collaboration with the EPFL, FASTCOM Technology S.A. has developed, around its smart camera product, an MPEG-7 "standard" communication layer that makes the content of its video output searchable. We will present our ongoing development around MPEG-7 and the automatic metadata creation tools.
Contact: Nicolas Pican
Email: pican@fastcom-technology.com
FASTCOM Technology S.A.
MPEG-7 in the 21st Century
00MP7IFG001: Short Introduction
A panel discussion, reflecting on the day’s presentations and demos, will address immediate issues concerning MPEG-7 applications for the marketplace, the MPEG-7 Industry Focus Group and plans for future activities.
Contact: Neil Day and Witold Reichhart
Email: neil@garage.co.jp
Digital Garage Inc.
Manager, Strategic Research & Development Department,
Yamazaki Bldg. 5F, 2-43-15 Tomigaya, Shibuya-ku, Tokyo 151-0063, Japan.
Tel: +81-3-5454-7213, Fax: +81-3-5454-7218
Digital Garage: http://www.garage.co.jp,
WebNation: http://www.webnation.co.jp
Witold Reichhart
Email: witold@starlab.net
Starlab Research Laboratories
Boulevard St-Michel 47, 1040 Brussels - Belgium
Tel: +32 2 7400 740
WWW: http://www.starlab.net
Panel Discussion
Panel Host: Neil Day
Panel Members: Philippe Salembier, Rob Koenen, Eric Rehm, Vincent Puig, Munchurl Kim, Witold Reichhart.
First MPEG-7 Awareness Event
Profiles of Session Chairs and Presenters
Multimedia Session
John Smith:
Session Chair
00M7M001
00M7M002
John R. Smith is currently Manager of the Pervasive Media Management Group at IBM T. J. Watson Research Center. His research interests include multimedia and multi-dimensional data management, compression, access and retrieval, and content-based query systems. Dr. Smith is an active participant in the MPEG-7 Multimedia Description Schemes Group and is chairing the development of the MPEG-7 Conceptual Model. He received his M.Phil. and Ph.D. degrees in Electrical Engineering from Columbia University in 1994 and 1997, respectively. At Columbia, he developed several image and video search and retrieval systems, including the WebSEEk image and video search engine and the VisualSEEk content-based image retrieval system. At IBM, he has developed a progressive video retrieval system called VideoZoom, and a new framework for adaptive compression, access and retrieval of large images, high-resolution documents and maps. Dr. Smith received the Eliahu I. Jury award from Columbia University for outstanding achievement as a graduate student in the areas of systems communication or signal processing. Dr. Smith is an Adjunct Professor at Columbia University and a member of IEEE.
Mikio Sasaki
00M7M003
Mikio Sasaki is a Project Leader in the Research Laboratories of DENSO CORPORATION, the largest automotive parts company in Japan. Mr. Sasaki is now in charge of R&D on media processing technologies used in IT equipment such as car navigation systems and mobile phones. He is especially engaged in the development of human-machine interfaces, media communication-based speech dialogue agents, and their related data expressions such as MPEG-7. He is also very interested in image understanding and holds two US patents, on 3D recognition for robotics and on image coding.
He received a BS in Electronics from Kyoto University and an MS in Electronics from the University of Tokyo. Until October 1991, he worked for YAMAHA, the well-known Japanese maker of musical instruments, where he was engaged in R&D at EMI on related digital circuits and MPEG-1 related image technologies.
Eric Rehm
Co-Chair of MPEG-7AE
Panel Member
00M7M004
Eric Rehm is cofounder and Chief Technical Officer of Singingfish.com. Singingfish.com has developed streaming media search services that are marketed and licensed to a broad range of high-traffic web portals, search and directory sites, broadband service providers, content aggregators, news organizations, entertainment networks and others. Eric is responsible for creating the company's technical vision and putting in place the system architecture that enables Singingfish.com search technology to function across PC, wireless, television and other computer and entertainment platforms.
Prior to founding Singingfish.com, Eric architected and implemented system software for Equator Technologies' MAP1000 multimedia processor. Before joining Equator, Eric served at Digital Equipment Corp. for nine years, where he worked as a principal engineer on the initial implementation of Windows NT on Digital's Alpha platforms.
Eric holds an MS in Computer, Information and Control Engineering from the University of Michigan, and a BS in Electrical Engineering from Purdue University. He has completed graduate work in Computer Science at the University of Washington.
Stephan Herrmann
00M7M005
Affiliation: Munich University of Technology,
Institute for Integrated Circuits
Resume:
Stephan received a Diploma degree in Electrical Engineering from the Berlin University of Technology in 1994.
In 1995 he joined the Heinrich Hertz Institute (HHI) in Berlin, and since 1996 he has been with the Institute for Integrated Circuits at the Munich University of Technology as a Research Assistant. His major interest is in algorithms and hardware architectures for image analysis and image segmentation. Since July 1999 Stephan has been Chairman of the MPEG-7 AHG for XM Development.
Audio Session
Vincent Puig
Audio Session Chair
Panel Member
00M7A002
Marketing Director, IRCAM - Centre Pompidou
With an initial background in Business Administration, he specialized in Technology Transfer and spent two years in New York as Commercial Attaché at the French Embassy, in charge of electronics and software. In 1988, he became a consultant at Innovation 128, a company specializing in technology monitoring and technology transfer. In 1993, he started a new activity at IRCAM-Centre Pompidou as Marketing Director. At that time he created the Forum Ircam, a software user group intended for musicians, institutions and home studios fond of computer music. It currently gathers more than 1300 users worldwide. In 1995 he set up a telematic project for professional studios named Studio On Line, which was selected in the first « Information Highway » call for proposals of the French Ministry of Industry and completed with Java/Corba technologies in December 1998. It offers on the Web a large database of more than 120,000 instrument samples together with unique online processing functions. In June 1998, for the first edition of the Ircam festival, he set up a collaboration with CICV and ENSAM for the presentation of two virtual reality installations: Icare (Yvan Chabanaud, Roland Cahen) and the Cistercian Model (Catherine Ikam and Louis Fléri). In the same context he then presented Coney Island (Robin Bargar, NCSA) in 1999 and “Elle et la voix” (Catherine Ikam, Louis Fléri, Pierre Charvet) in 2000. From 1998 to 2000 he was coordinator of a European working group on content processing of music in relation to MPEG-7 (CUIDAD - Esprit program) and of a European group studying user needs and interfaces in musical libraries, in collaboration with BNF (Harmonica - Telematics program). Within CUIDAD he managed a team working on the instrument timbre Core Experiment. In 1999, he presented a new project on content-based audio and music retrieval called CUIDADO, which was recently selected in the IST European call. This project aims at developing a Music Browser with Sony France and a Sound Editing environment with CreamWare (D), both using automatically extracted audio descriptors. The Music Browser project has been supported by the most important copyright societies in Europe, since it should provide tools for automatic music recognition and monitoring of copyrighted recordings available on the Web.
Philip N. Garner, Wilson S.C. Chiu
00M7A001
Philip N. Garner is a researcher at Canon Research Centre Europe Ltd in the UK. His research interests include speech recognition, pattern recognition and statistics. Before joining CRE, Philip was at the Defence Evaluation and Research Agency in Malvern, UK, and studied Electronic Engineering at Southampton University. He is a Chartered Engineer.
Wilson S.C. Chiu is the Chief Software Architect at Canon Research Centre Europe Ltd in the UK. His role involves devising application and system architectures, and the design and implementation of prototype systems. Before joining CRE, Wilson worked as a software engineer at Vickers Medelec Ltd on the development of neurodiagnostic instruments. Wilson studied Electrical Engineering and received a PhD in visualisation of speech production at Southampton University. He did postdoctoral work in 3D Ultrasound Imaging at St. Thomas' Hospital in London.
Youngmoo Kim 00M7A003
Youngmoo Kim is a PhD candidate in the Machine Listening Group of the MIT Media Lab, where his research activities have focused on audio signal processing and digital audio coding. His primary research involves novel techniques for coding and synthesizing the singing voice. He is an active participant in the MPEG audio standards technical committee, having contributed to MPEG-4 audio and now MPEG-7. Youngmoo received an MS in Electrical Engineering and an MA in Music, both from Stanford University, and a BS in Engineering and a BA in Music, both from Swarthmore College.
Visual Session
Munchurl Kim
Session Chair
Panel Member
00M7V003
Munchurl Kim received the B.E. degree in Electronics from Kyungpook National University, Korea, in 1989, and M.E. and Ph.D. degrees in Electrical and Computer Engineering from the University of Florida, Gainesville, USA, in 1992 and 1996, respectively.
Since 1997 he has been with the
Electronics and Telecommunications Research Institute (ETRI), Korea, where he is
currently in charge of developing data broadcasting technology with MPEG-4/7
applications. His research areas include visual information processing, data
broadcasting, and multimedia communications.
Karsten Muller
00M7V001
Karsten Muller (also spelled Mueller; IEEE M'98) received the Dipl.-Ing. degree from the Technical University of Berlin, Germany, in 1997. He has been with the Heinrich-Hertz-Institute, Berlin, since 1996, where he works on projects focused on motion and disparity estimation, representation of 2-D and 3-D shapes, and viewpoint synthesis. He has been involved in MPEG activities, creating, testing and cross-checking MPEG-7 visual descriptors. Currently he develops algorithms that combine 2D-image and 3D-object representation.
Rosa Ruiloba 00M7V002
Rosa Ruiloba is preparing a Ph.D. at the Laboratoire d'Informatique de Paris 6. She obtained the diploma of Electronic Engineer from the Escuela Tecnica Superior de Ingenieros de Telecomunicaciones of Valladolid University (Spain) and, in 1997, a Post-Graduate Degree in Image and Artificial Intelligence from the Ecole Nationale Superieure des Telecommunications de Bretagne. Currently, she works on video-to-shot segmentation algorithms and their evaluation and comparison, and contributes to MPEG-7 on the Video Editing DS.
Philippe Joly 00M7V002
Philippe Joly is an assistant professor at the Laboratoire d'Informatique de Paris 6, where he heads a research group on multimedia indexing. He works mainly on video feature extraction and on audiovisual content description. He obtained a Ph.D. in Computer Science in 1996 at the University of Toulouse III.
Erik Oltmans
00M7V004
Erik Oltmans works with the Telematica Institute in Enschede, the Netherlands. His work is concerned with applied research on metadata issues, interoperability, streaming technologies and content engineering. He chairs the MPEG-7 Ad Hoc Group on Metadata Integration, and he is work package manager within the Dutch Video-over-IP project.
Paul Porskamp
00M7V004
Paul Porskamp is Senior Application Developer at the Telematica Instituut in Enschede, the Netherlands. His work is concerned with content management, with a special interest in content value chains and their relation to web-enabled systems and architectures. He is responsible for all architecture issues in the Dutch Video-over-IP project, in which distributed content production environments and distributed deployment environments play a central role.
Soojun Park
00M7V005
Affiliation:
ETRI (1994 ~ present)
Position: Senior Member of Engineering Staff
Education:
M.S. degree in Computer Science at Lehigh University, U.S.A.
B.S. degree in Biochemistry at the University of Iowa, U.S.A.
Areas of interest:
MPEG-7, Content-based Information Retrieval, Natural Language Processing, Bio-Informatics.
Koiti HASIDA
00M7V006
Koiti HASIDA was born in 1958. He received his B.S. (1981), M.S. (1983), and D.S. (1986) degrees from the University of Tokyo. He joined ETL (Electrotechnical Laboratory) in 1986 and has been the director of its Information Science Division since 1999. He was also affiliated with ICOT (Institute for New Generation Computer Technology) from 1988 to 1992. His current research commitments include intelligent content and constraint-based natural language processing.
Katashi NAGAO 00M7V006
Katashi NAGAO was born in 1962. He received his B.E.
(1985), M.E. (1987), and D.E. (1994) degrees from Tokyo Institute of Technology.
He joined IBM Tokyo Research Laboratory in 1987 and Sony Computer Science
Laboratory in 1991. Currently, he is a senior researcher at IBM Tokyo Research
Laboratory and is conducting a research project on Semantic Transcoding of online
content and Semantic Discovery from semantically-annotated data. His major
interests include natural language processing, human-computer interaction, and
intelligent agents and robots.
Ana B. Benitez
00M7V007
Ana B. Benitez has been a Ph.D. candidate in the Department of Electrical Engineering at Columbia University, New York, USA, since 1996. She received her Telecommunications Engineer degree from the Polytechnic University of Catalonia (UPC) in Barcelona, Spain, in 1996. She was awarded a full scholarship for graduate studies in the United States by the Spanish financial institution "la Caixa". She received her M.Phil. degree from Columbia University in 1996. Her current research interests include integration of large distributed multimedia information retrieval systems and multimedia content representation. She is an active participant in the MPEG-7 standard, where she is the Chair of the AHG on MPEG-7 MDS Core Experiments and an Editor of the MDS Experimental Model and Working Draft documents. She is also a student member of IEEE and ACM.
Heon Jun Kim, Ph.D 00M7V008 00M7V009 00M7V010
Heon Jun Kim received his B.E. degree in Metallurgical Engineering from Yonsei University in 1988, and M.S. and Ph.D. degrees in Computer Science from Stevens Institute of Technology in 1996. He was previously a research engineer at the Media Communication Lab. of LG Electronics, where he developed a face recognition system and an image retrieval engine. Currently, he works for LG Electronics Institute of Technology as a Senior Member of the Technical Staff. He is in charge of the Content-Based Multimedia Information Processing project and participates in MPEG-7 standardization.
Kyoungro Yoon, Ph.D 00M7V008 00M7V009 00M7V010
Kyoungro Yoon has been a Member of Technical Staff at LG Electronics Institute of Technology (LG Elite) since 1999. He received his Ph.D. degree in Computer Science from Syracuse University, Syracuse, NY, in 1999. He received his M.S.E. in Electrical Engineering from the University of Michigan, Ann Arbor, MI, in 1989 and his B.S. in Electronic and Computer Engineering from Yonsei University, Seoul, Korea, in 1987.
From 1989 to 1993, he worked for major Korean companies as an IT consultant. Since joining LG Elite in 1999, he has been actively participating in the MPEG-7 standard and related projects such as the TV Anytime Forum and audio/video content analysis projects. His main research interests include audio/video content analysis and multimedia/hypermedia databases. He is also a member of IEEE and ACM.
Jin-Soo Lee
00M7V008
00M7V009
00M7V010
Jin-Soo Lee received his B.E. degree in Computer Engineering from Dongguk University in 1995, and his M.S. degree in Computer Science from Pohang University of Science and Technology (POSTECH) in 1997. He was previously a research engineer at the Media Communication Lab. of LG Electronics, where he developed a face detection system and an image retrieval engine. Currently, he works for LG Electronics Institute of Technology as a Member of the Technical Staff. He is working on the Content-Based Multimedia Processing project and participating in MPEG-7 standardization.
Jungmin Song
00M7V008
00M7V009
00M7V010
Jungmin Song received the M.S. degree in electronic engineering from Yonsei University in Seoul, Korea, in 1997. He received a scholarship from LG Electronics for his graduate and undergraduate studies. He has been a Member of Technical Staff at LG Electronics Institute of Technology in Seoul, Korea, since 1997. He is working on the Content-Based Multimedia Processing project and participating in MPEG-7 standardization.
Dr. Nicolas Pican
00M7V011
Born in 1965 in France, Nicolas Pican received a BS Eng. and M.Sc. in Computer Science from the Compiegne Technology University (UTC) in 1990. He received his Ph.D. in Computer Science from the Nancy I University in 1995 for his contribution in the field of Artificial Neural Networks. Two years at INRIA (Nancy, France) and one year at Heriot-Watt University (Edinburgh, Scotland) were followed by a position as Image Processing specialist in a Machine Vision company in Cambridge, UK. He joined FASTCOM Technology S.A. in July 2000 as Multimedia Project Manager.
MPEG-7 in the 21st Century
MPEG-7 AE Chair and Panel Host:
Neil Day
Neil Day recently moved to a pioneering Web solutions
company in Tokyo called Digital Garage (http://www.garage.co.jp/el/index_html),
where he heads the R&D department. He's currently exploring the uses of
MPEG-7 and XML-related technologies in Internet entertainment applications for
the Japanese and international markets. Neil holds an engineering degree from
Trinity College, Dublin. He can be contacted at: neil@garage.co.jp
Panel Members:
Philippe Salembier
Philippe Salembier received a degree from the Ecole Polytechnique, Paris, France, in 1983 and a degree from the Ecole Nationale Superieure des Telecommunications, Paris, France, in 1985. He received the Ph.D. from the Swiss Federal Institute of Technology (EPFL) in 1991. He was a Postdoctoral Fellow at the Harvard Robotics Laboratory, Cambridge, MA, in 1991. At the end of 1991, after his stay at the Harvard Robotics Lab., he joined the Polytechnic University of Catalonia, Barcelona, Spain, where he is currently associate professor. He lectures in the area of digital signal and image processing. In terms of standardization activities, he is involved in the definition of the MPEG-7 standard ("Multimedia Content Description Interface") as chair of the "Multimedia Description Scheme" group. He has also co-edited a special issue of Signal Processing: Image Communication on MPEG-7 proposals (2000). Currently, he is deputy editor of Signal Processing.
Finally, he is a member of the Image and Multidimensional Signal Processing Technical Committee of the IEEE Signal Processing Society.
Rob Koenen
Rob Koenen received his 'Ingenieur' (MSEE) degree from Delft University of Technology, the Netherlands, in 1989. He studied electrical engineering, specialising in information theory.
In 1990, he joined KPN Research, where he has researched various aspects of audiovisual communication, working as a project manager and later as the coordinator of the Video Group within KPN Research. His projects have addressed image coding research, audiovisual communication for people with special needs, interactive broadband multimedia for residential users, mobile multimedia, the strategic deployment of new multimedia services, audiovisual quality assessment and multimedia standardisation.
As an MPEG delegate, he has played a key role in the development of the MPEG-4 standard since its inception in 1993, and in defining the upcoming MPEG-7 standard since the start, in 1995.
Since October 2000, ir. Koenen has worked as Director of Consumer Appliance Technology for InterTrust Technologies International. He is the chairman of the MPEG Requirements subgroup, and President of the MPEG-4 Industry Forum.
He is also associate editor of IEEE Transactions on Circuits and Systems for Video Technology.
Witold Reichhart
Throughout his life Witold Reichhart has been active in Music and Science. He has a background in mathematics, astrophysics and computer science. He has also had an extensive career as a concert pianist. He studied music and musicology in Krakow at the State Academy of Music and in London at the Royal College of Music and the Royal Holloway College. At Starlab, he is working in the following fields: digital media, human/machine interaction, bio-sensing, bioaesthetics, and art. He is also involved in the creation of two consortia: one concerned with research into emotions and the other devoted to the future of the media. Witold has also been responsible for the creation of the MPEG-7 website, www.mpeg-7.com, a platform for promoting the exchange of ideas, information and application requirements between MPEG-7 developers and industry participants. He is a prime force in the movement to build bridges and make connections between Art and Science.