Intel Video Metadata Framework SDK

Updated on 23-Apr-2015

1. Introduction

The Intel® Video Metadata Framework SDK (VMF) is a cross-platform SDK for C++ programmers. The Intel VMF SDK provides a set of classes that simplify the tasks of managing metadata in video-related applications, including metadata definition, creation, querying, saving, and loading.

The VMF SDK currently supports multiple operating systems and processor architectures:

  • Windows* on x86 and x64
  • Android* on ARM* and x86
  • iOS*
  • Linux* on x86, x64, MIPS, and ARM

2. Features

The VMF SDK provides a software framework for digital storytelling applications to manage metadata within video files. The features provided by the SDK include:

  • Metadata definition and creation
  • Metadata saving and loading within video files
  • In-memory metadata querying
  • Metadata shifting and merging

2.1 Metadata Definition and Creation

Metadata is data about data. Metadata can take several forms: a single integer value, an array of strings, or a complex structure containing multiple fields, each with its own name and value type. The VMF SDK allows developers to define metadata in any of these forms and to manage the different types of metadata through a uniform metadata interface.

The VMF SDK also provides a mechanism that allows metadata to be defined before use. A metadata schema declares metadata formats by defining a set of metadata descriptors. A metadata descriptor defines the name of a metadata type and the names and value types of all the fields within the metadata. Each instance of metadata is then created based on the associated metadata descriptor, using the descriptor as a template to create the internal fields.

The metadata schema mechanism enables sharing of metadata between multiple applications when the same set of metadata schemas is used. It also allows metadata defined by one ISV to be used by applications created by another ISV.
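
For illustration, the sketch below defines a simple schema with one descriptor and creates a metadata item from it. The class and method names used here (vmf::MetadataSchema, vmf::MetadataDesc, vmf::FieldDesc, vmf::Metadata, findMetadataDesc, setFieldValue) follow typical VMF usage but should be treated as assumptions; consult the SDK headers and samples for the exact API.

    // Sketch only: identifiers follow common VMF usage and may differ
    // slightly from the exact SDK API; check the VMF headers/samples.
    #include <memory>
    #include <vector>
    #include "vmf/vmf.hpp"   // assumed umbrella header

    using namespace vmf;

    std::shared_ptr<MetadataSchema> buildSensorSchema()
    {
        // A schema groups related metadata descriptors under one name.
        auto schema = std::make_shared<MetadataSchema>("sensor_data");

        // A descriptor names a metadata type and the name/value type
        // of each of its fields.
        std::vector<FieldDesc> fields;
        fields.push_back(FieldDesc("latitude",  Variant::type_real));
        fields.push_back(FieldDesc("longitude", Variant::type_real));
        schema->add(std::make_shared<MetadataDesc>("gps", fields));
        return schema;
    }

    std::shared_ptr<Metadata> createGpsItem(const std::shared_ptr<MetadataSchema>& schema)
    {
        // Each metadata item is created from its descriptor, which acts
        // as a template for the item's internal fields.
        auto gpsDesc = schema->findMetadataDesc("gps");
        auto gpsItem = std::make_shared<Metadata>(gpsDesc);
        gpsItem->setFieldValue("latitude",  45.53);
        gpsItem->setFieldValue("longitude", -122.67);
        return gpsItem;
    }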

2.2 Metadata Saving and Loading

Once metadata is created for a video sequence or video file, the VMF SDK allows the metadata to be saved within the video by embedding it as part of the video file data. The metadata is carried along with the video when the video is transferred from one device to another, or from a device to the Internet. Since the metadata is embedded within the video file, no auxiliary metadata file is needed.

Please note that VMF does not define a new video format for storing the metadata and video, nor does it break existing video format standards. Video files containing VMF metadata remain standard video files that can be opened by any video player. VMF supports the most commonly used container formats: AVI, MP4, and WAV.

When loading metadata from a video file, VMF allows either all metadata to be loaded at once or only specified metadata to be loaded.
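
As a minimal sketch of the save/load flow, the code below assumes a vmf::MetadataStream class that opens an existing video file in update or read-only mode. The open-mode constants (MetadataStream::Update, MetadataStream::ReadOnly) and the load-by-schema-name overload are assumptions based on typical VMF samples; verify them against the SDK documentation.

    // Sketch only: method names and open-mode flags are assumptions
    // based on typical VMF usage; verify against the SDK documentation.
    #include <memory>
    #include <string>
    #include "vmf/vmf.hpp"   // assumed umbrella header

    using namespace vmf;

    void saveAndReload(const std::string& videoPath,
                       const std::shared_ptr<MetadataSchema>& schema,
                       const std::shared_ptr<Metadata>& item)
    {
        // Embed metadata directly into the existing, standard video file.
        MetadataStream out;
        out.open(videoPath, MetadataStream::Update);
        out.addSchema(schema);
        out.add(item);
        out.save();      // metadata is written into the video container
        out.close();

        // Later, load everything back -- or only a selected schema.
        MetadataStream in;
        in.open(videoPath, MetadataStream::ReadOnly);
        in.load();                    // load all metadata at once
        // in.load("sensor_data");    // or load just one schema by name
        in.close();
    }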

2.3 In-Memory Metadata Querying

The VMF SDK provides multiple querying functions that allow developers to build complex queries. Some functions query by metadata type or value; others query the relationships between metadata items. The generic query function provided by VMF lets developers define their own query logic.

The VMF SDK also allows multiple querying functions to be chained, with the results of an earlier query feeding later queries to progressively refine the result set.
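
As an illustration of such chaining, the sketch below first queries items by metadata type name and then refines the result with a custom predicate. The method names queryByName, query, getFieldValue, and get_real are assumptions modeled on typical VMF samples and may differ in the actual SDK.

    // Sketch only: query method names are assumptions modeled on VMF
    // samples; the exact API may differ.
    #include <memory>
    #include "vmf/vmf.hpp"   // assumed umbrella header

    using namespace vmf;

    MetadataSet findNorthernGpsPoints(MetadataStream& stream)
    {
        // First query: select all items of the "gps" metadata type.
        MetadataSet gpsItems = stream.queryByName("gps");

        // Second query refines the first with custom logic: keep only
        // points in the northern hemisphere.
        return gpsItems.query([](const std::shared_ptr<Metadata>& item)
        {
            return item->getFieldValue("latitude").get_real() > 0.0;
        });
    }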

2.4 Metadata Shifting and Merging

Digital storytelling applications, like other video editing applications, often need to trim video sequences by cutting frames from the original video files to remove unnecessary segments. Video summarization applications need to extract segments from raw video sequences and form a new video.

These video editing operations (cutting, shifting, merging) typically affect the metadata embedded in the original videos. Some metadata must be re-associated with new frame indices because the frame numbering changes. Other metadata may become invalid because the video frames it was associated with have been removed.

The VMF SDK provides functions that allow developers to shift metadata within a video sequence after trimming a video. The SDK also allows metadata from multiple videos to be merged into a new metadata stream after merging multiple video files into a new video.
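
The SDK exposes dedicated helpers for these operations; since their exact signatures vary between releases, the sketch below only illustrates the underlying idea of re-basing frame indices after frames are cut from the front of a video. The getAll, getFrameIndex, setFrameIndex, and remove calls are assumptions and stand in for whatever shift/import functions your VMF version provides.

    // Conceptual sketch only: real VMF releases provide dedicated
    // shift/import helpers; this loop just illustrates re-basing frame
    // indices after 'removedFrames' frames are cut from the front.
    #include "vmf/vmf.hpp"   // assumed umbrella header

    using namespace vmf;

    void shiftAfterTrim(MetadataStream& stream, long long removedFrames)
    {
        MetadataSet all = stream.getAll();
        for (auto& item : all)
        {
            long long oldIndex = item->getFrameIndex();
            if (oldIndex >= removedFrames)
            {
                // Re-associate the item with the trimmed video's numbering.
                item->setFrameIndex(oldIndex - removedFrames);
            }
            else
            {
                // The associated frames were removed, so this metadata
                // item is no longer valid and should be dropped.
                stream.remove(item);
            }
        }
    }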

3. Applications

This section describes typical ways of using VMF in digital storytelling (DST) applications.

3.1 Single Application

Most video editing or digital storytelling applications can benefit from using the VMF SDK. The SDK:

  • Provides storage of metadata and I/O of metadata to/from video files.
  • Allows DST applications to update metadata based on changes made to video frames. The changes can be trimming or shifting the video.
  • Allows DST applications to import metadata from one video to another video or merge metadata from multiple videos to a new video.
  • Allows DST applications to sort and query metadata.

The following diagram illustrates these benefits in a typical stand-alone digital storytelling application. The diagram is split into three layers: the top layer consists of the DST application logic or algorithms; the middle layer is the VMF library; the bottom layer represents the media files or in-memory media objects.

3.2 Creator-Consumer Model

The creator-consumer model is a more common architecture for digital storytelling applications that use metadata. In this model, two applications are involved: a metadata creator application and a metadata consumer, the digital storytelling application.

The metadata creator is an application running on a mobile device, such as a smartphone, digital camera, or digital camcorder. It captures video from the camera and embeds metadata collected from sensors on the device. The metadata can be raw sensor information, such as GPS location, temperature, and speed. It can also be information the application detects from the video or from other metadata, such as human faces or scene types.

The digital storytelling application is the metadata consumer: it imports videos with embedded metadata from the Internet and creates digital stories by performing additional operations, such as detecting new metadata, editing, and querying. The metadata consumer application usually runs on computers with higher computational power and better user input devices, such as tablet PCs, laptops, or desktop computers.
It is important that the creator and consumer applications share the same metadata schema when creating or loading metadata.
The following diagram illustrates the architecture of the creator-consumer model:

3.3 Metadata Ecosystem

In a more general form, multiple metadata creators and multiple metadata consumers are involved in exchanging information through metadata. Applications can consume metadata created by other generators or add new metadata to videos that already contain metadata from other applications. Schemas are defined and shared to ensure that the metadata generators and metadata consumers understand each other. The application vendors, schemas, and metadata together form the metadata ecosystem (Figure 3).

In a healthy metadata ecosystem, schemas defined by popular applications become de facto standards that other application developers want to implement. Thus more and more applications can share metadata freely.

A metadata ecosystem also brings new user experiences to applications. For example, a social app can let users search for people in videos based on metadata detected by other digital storytelling applications.

Source: https://software.intel.com/en-us/articles/intel-video-metadata-framework-sdk
