Laboratory of Media Dynamics
Graduate School of Information Science and Technology
Hokkaido University

Image Understanding

Video Scene Analysis Technology

Segmenting a DVD video into chapters lets viewers navigate it efficiently. To realize this, technology that automatically segments a video into scenes, each containing the same content or conversation, is extremely important. This technology is known as “scene segmentation”. Our methods focus on the image and audio features of a video to detect segments that depict the same event or object. We also use a large collection of videos to extract features that commonly change between scenes, and train machine learning classifiers on those features.
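
As a rough illustration of feature-based segmentation, the sketch below marks a boundary wherever the per-frame feature vector changes sharply from one frame to the next. It is a deliberately simple heuristic and not the laboratory's method: the feature vectors (toy color histograms), the L1 distance, and the `threshold` value are all assumptions made for this example.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def segment_boundaries(features: List[Sequence[float]], threshold: float) -> List[int]:
    """Return every frame index i at which a new segment is assumed to start.

    A boundary is reported when the distance between frame i-1 and frame i
    exceeds the (hypothetical) threshold.
    """
    return [
        i
        for i in range(1, len(features))
        if l1_distance(features[i - 1], features[i]) > threshold
    ]


# Toy 3-bin histograms: the first two frames are similar, as are the last two.
frames = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.1, 0.8],
    [0.0, 0.2, 0.8],
]
boundaries = segment_boundaries(frames, threshold=0.5)
print(boundaries)
```

A learned classifier, as described above, would replace the fixed threshold with a decision trained on many labelled videos.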

Sport Video Analysis Technology

Sports such as soccer and baseball are watched by many people worldwide. Recent research focuses on technology that improves the viewing experience by estimating data, evaluations and strategies for a match and displaying this information to viewers. More specifically, taking the characteristics and rules of each sport into account, we work on detecting the most important player for a team based on the detected player and ball positions, determining potential pass trajectories, and estimating the team’s overall strategy.
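
To make the idea of a potential pass trajectory concrete, the following sketch treats a pass as a straight line between two players and considers it open when no opponent stands within a given clearance of that line. This geometric rule, the function names, and the `clearance` parameter are assumptions for illustration only; the source does not describe how pass trajectories are actually determined.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Shortest distance from point p to the line segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def open_passes(passer: Point, teammates: Dict[str, Point],
                opponents: List[Point], clearance: float) -> List[str]:
    """Names of teammates reachable without an opponent near the pass line."""
    return [
        name
        for name, pos in teammates.items()
        if all(point_segment_distance(o, passer, pos) > clearance for o in opponents)
    ]


# Hypothetical positions on the pitch, in meters.
teammates = {"t1": (10.0, 0.0), "t2": (0.0, 10.0)}
opponents = [(5.0, 0.5)]
print(open_passes((0.0, 0.0), teammates, opponents, clearance=1.0))
```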

Acoustic Chord Estimation Technology

A chord, which consists of several differently pitched notes sounded together, is one of the important elements of music understanding, so it is important to be able to estimate chords effectively from an audio signal. One possible approach is to estimate the individual notes first and derive the chord from the result. However, when the chord consists of many notes, this approach does not detect chords effectively enough. Therefore, we are currently researching effective methods of chord estimation that classify an audio signal based on the rules of chord composition.
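
The note-based route mentioned above can be sketched as template matching: given a set of detected notes, pick the chord whose composition rule explains them best. The sketch assumes the notes are already available as MIDI numbers and covers only major and minor triads; the scoring rule and the `C:maj` label format are choices made for this example, not the laboratory's method.

```python
from typing import Iterable

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Chord composition rules as semitone offsets from the root.
CHORD_INTERVALS = {"maj": (0, 4, 7), "min": (0, 3, 7)}


def estimate_chord(notes: Iterable[int]) -> str:
    """Return the triad label that best matches the detected notes."""
    observed = {n % 12 for n in notes}  # fold octaves into pitch classes
    best_label, best_score = "", float("-inf")
    for root in range(12):
        for quality, intervals in CHORD_INTERVALS.items():
            template = {(root + i) % 12 for i in intervals}
            # Reward notes the template explains, penalize notes it does not.
            score = len(observed & template) - len(observed - template)
            if score > best_score:
                best_label, best_score = f"{NOTE_NAMES[root]}:{quality}", score
    return best_label


print(estimate_chord([60, 64, 67]))  # C, E, G
print(estimate_chord([57, 60, 64]))  # A, C, E
```

Errors in the detected notes propagate directly into the chord label in this scheme, which is one way to see why estimating chords from individual notes becomes unreliable as the number of notes grows.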

Technology for Extracting Objects from Images

We are researching methods for analyzing the meaning of images in order to perform image search that corresponds to the properties of the human visual system. For example, by extracting the object depicted in an image, it is possible to sort image search results based on the objects they depict. Our methods enable object extraction by gathering a large collection of images whose contained objects are known, and training a classifier on their common features.

Technology for Evaluating the Visualization of Image Search Results

We propose arranging similar images close to each other when displaying search results to the user. This allows the user to efficiently find the desired image in a large image collection. With that in mind, it is important to know which display method yields the highest search efficiency, and for which kinds of images. We are proposing a criterion for evaluating search result display methods based on the characteristics of the human visual system that are relevant during image searching.

Web Video Analysis (Hierarchy Extraction) Technology

Due to video sharing services such as YouTube, there is an enormous number of videos on the Web. New methods are required to find desired videos efficiently. As part of our effort in this new direction, we are researching video search methods that extract communities of similar videos and their hierarchies.
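
One simple way to picture communities and their hierarchy is to link videos whose similarity exceeds a threshold and take the connected groups; lowering the threshold merges small communities into larger ones. The sketch below does this with a union-find structure. The similarity matrix, the thresholds, and the use of connected components are assumptions for illustration, since the source does not specify how communities are extracted.

```python
from typing import Dict, List


def communities(similarity: List[List[float]], threshold: float) -> List[List[int]]:
    """Group video indices that are linked by similarity >= threshold."""
    n = len(similarity)
    parent = list(range(n))

    def find(i: int) -> int:
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] >= threshold:
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical pairwise similarities between four videos.
similarity = [
    [1.0, 0.9, 0.5, 0.1],
    [0.9, 1.0, 0.4, 0.1],
    [0.5, 0.4, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
]
fine = communities(similarity, threshold=0.8)     # tight communities
coarse = communities(similarity, threshold=0.45)  # one level up the hierarchy
print(fine, coarse)
```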

Technology for Searching for the Most Truthful Version of a Video

We are surrounded by video, whether on TV or on the Web. Unfortunately, it isn’t always easy to obtain accurate information from such video. We are researching methods for identifying the most truthful version of a video, that is, the version that is as close as possible to the unedited original. This will enable more effective video search methods in the future.

Music Recommendation Technology

Finding a song that you like in a large database such as the iTunes Store requires a lot of time and effort. We are investigating methods for recommending songs based on their audio signals. By analyzing a set of songs that the user has labelled in advance as ones they like, and looking for common audio features such as tone and sound composition, it is possible to identify new songs that the user is likely to enjoy.
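
The idea of recommending from common audio features can be sketched as follows: average the feature vectors of the liked songs into a taste profile, then rank unseen songs by their cosine similarity to that profile. The two-dimensional feature vectors, the song names, and the choice of a centroid with cosine similarity are assumptions for this example rather than the laboratory's method.

```python
import math
from typing import Dict, List, Sequence


def centroid(vectors: List[Sequence[float]]) -> List[float]:
    """Component-wise mean of the liked songs' feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(liked: List[Sequence[float]],
              candidates: Dict[str, Sequence[float]],
              top_k: int = 1) -> List[str]:
    """Return the top_k candidate names most similar to the taste profile."""
    profile = centroid(liked)
    ranked = sorted(candidates,
                    key=lambda name: cosine(profile, candidates[name]),
                    reverse=True)
    return ranked[:top_k]


# Hypothetical audio features extracted from each song's signal.
liked = [[0.9, 0.1], [0.8, 0.2]]
candidates = {"song_a": [0.85, 0.15], "song_b": [0.1, 0.9]}
print(recommend(liked, candidates))
```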