MPEG plans to standardize technologies that will enable efficient and interoperable design of visual search applications. In particular, we are seeking technologies for visual content matching in images or video. Visual content matching includes matching of views of objects, landmarks, and printed documents in a way that is robust to partial occlusions as well as changes in vantage point, camera parameters, and lighting conditions.
There are a number of component technologies that are useful for visual search, including the format of visual descriptors, the descriptor extraction process, and indexing and matching algorithms. At a minimum, the format of descriptors and parts of their extraction process should be defined to ensure interoperability.
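As an illustration of how these components relate, the following minimal sketch assumes a purely hypothetical descriptor format (a fixed-length binary string) and a linear scan in place of a real index. The names `hamming_distance` and `best_match`, the descriptor length, and the example values are assumptions made for illustration only; they are not part of any MPEG specification or proposal.

```python
from typing import List, Tuple


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count differing bits between two equal-length binary descriptors."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def best_match(query: bytes, database: List[bytes]) -> Tuple[int, int]:
    """Return (index, distance) of the database descriptor closest to the query.

    A linear scan stands in for the indexing component here; a real system
    would use a dedicated index structure.
    """
    if not database:
        raise ValueError("database is empty")
    distances = [hamming_distance(query, d) for d in database]
    best = min(range(len(distances)), key=distances.__getitem__)
    return best, distances[best]


# Hypothetical 16-bit descriptors, as if produced by some extraction process.
database = [
    bytes([0b11110000, 0b00001111]),
    bytes([0b10101010, 0b01010101]),
]
query = bytes([0b11110001, 0b00001111])

print(best_match(query, database))
```

In this sketch, the matching step only works because the query and the database descriptors share the same byte layout, which is the interoperability concern that motivates defining the descriptor format.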
It is envisioned that a standard for compact descriptors will:
- ensure interoperability of visual search applications and databases,
- enable a high level of performance of implementations conformant to the standard,
- simplify design of descriptor extraction and matching for visual search applications,
- enable hardware support for descriptor extraction and matching in mobile devices,
- reduce load on wireless networks carrying visual search-related information.
It is envisioned that such a standard will provide a complementary tool to the suite of existing MPEG standards, such as MPEG-7 Visual Descriptors. To build a full visual search application, this standard may be used jointly with other existing standards, such as MPEG Query Format, HTTP, XML, JPEG, JPSec, and JPSearch.
The requirements for the technology that MPEG intends to standardize are listed in Appendix A of this document.