
Laboratory of Media Dynamics
Graduate School of Information Science and Technology
Hokkaido University

About the Laboratory of Media Dynamics


Search for Knowledge Discovery Processes through Full Use of Multimedia Technology

Graduate School of Information Science and Technology
(School of Engineering, Department of Electronics and Information Engineering)
Professor Miki Haseyama

Research areas:
Image & video processing, audio processing, music processing, knowledge creation
Research keywords:
Image restoration, image recognition, video semantic understanding, image & video search, music search
Educational background: Ph.D., Hokkaido University

What are our goals?

We aim to realize next-generation multimedia systems that can understand images and video as well as humans can. Due to the expansion of high-speed Internet access and large-capacity storage media, we are now surrounded by multimedia such as images and video. We expect people around the world to keep contributing to this data, and that by 2020 approximately 35 zettabytes (a zettabyte is 10^21 bytes) of data will have been created. To understand how much that is, imagine saving all that data to DVDs and then stacking them one on top of another. The stack would reach halfway to Mars! Furthermore, we know that the total amount of data produced by everyone on Earth has already exceeded the total data storage capacity (including devices such as hard disks and USB memory). This means that important information contained in a large volume of data is often discarded without being stored. Therefore, to save the information that is valuable to us, we need technology that can automatically search for it. Our goal, “the realization of next-generation multimedia systems that can understand images and video as well as humans can,” provides just that kind of technology.

What sort of things are we creating?

In order to realize a system that can understand images and video as well as humans can, we need to create various methods, for example: recognition methods based on human visual and auditory systems; music and audio signal analysis methods; restoration methods; and next-generation coding methods (see Figs. 1 and 2 for examples). Additionally, in parallel with creating these methods, we need to understand how to obtain knowledge from the images, videos and audio that we create. We contribute to this by investigating human knowledge discovery processes from a multimedia signal processing perspective.

Figure 1: An example of image recognition (face recognition).

Figure 2: An example of image restoration (the regions occupied by the birds have been estimated).

For example, the image search systems developed in the last 5 years (see Figs. 3 and 4) all consider how exactly the user is going to find their desired image. This is significantly different from conventional image search systems, which ask users to enter a keyword and then perform the search using that keyword. Newer systems examine the features of each image in advance and calculate a numerical tag for it. When the user begins their search, it is almost as if the images begin “communicating” with each other. Images with similar numerical tags move closer together, and images with different numerical tags move further apart. While looking at the large volume of moving images, the user summons their own personal knowledge and memories, realizes previously unknown image relationships and finds the image that they really want.
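The "similar tags move closer, different tags move apart" behavior can be illustrated with a minimal sketch. This is not the actual Image Vortex algorithm, which the text does not describe. It is a generic spring-style layout, assuming each image's numerical tag is a small feature vector and that the desired on-screen distance between two images equals the distance between their tags. The names tag_distance and layout_step, and the rate parameter, are hypothetical.

```python
import math


def tag_distance(a, b):
    """Euclidean distance between two numerical tags (feature vectors)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def layout_step(positions, tags, rate=0.05):
    """Move every image one small step on a 2D screen.

    Assumption: the target screen distance between two images is the
    distance between their tags. Pairs that are further apart on screen
    than their tags suggest are pulled together; pairs that are too
    close are pushed apart.
    """
    new_positions = []
    for i, (xi, yi) in enumerate(positions):
        dx = dy = 0.0
        for j, (xj, yj) in enumerate(positions):
            if i == j:
                continue
            # Guard against division by zero when two images overlap.
            screen = math.hypot(xj - xi, yj - yi) or 1e-9
            target = tag_distance(tags[i], tags[j])
            # Positive pull attracts image i toward j, negative repels it.
            pull = (screen - target) / screen
            dx += pull * (xj - xi)
            dy += pull * (yj - yi)
        new_positions.append((xi + rate * dx, yi + rate * dy))
    return new_positions


# Hypothetical data: images 0 and 1 have similar tags, as do images 2 and 3.
tags = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
positions = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
for _ in range(300):
    positions = layout_step(positions, tags)
```

After repeated steps, images 0 and 1 should sit close together on screen and far from images 2 and 3. A real system would derive the tags from image features and would need a more scalable update than this all-pairs loop.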

Figure 3: An associative image search system (Image Vortex).

Figure 4: A “bird’s-eye view” image search system.

We have implemented the Image Vortex image search system, and it is now part of the Image Cruiser product (http://imagecruiser.jp/). Furthermore, we are not focusing only on images: we have also developed an intuitive associative video search engine that doesn’t rely on keywords (Fig. 5), and a cross-media search engine that allows users to freely obtain music, still images and video based on each user’s individual tastes (Fig. 6). This allows users to search for music using an image, or vice versa, something that was previously impossible. It also allows users to receive recommendations based on networks of other users with similar or radically different tastes. People feel great emotion and joy when they discover new things. In the near future, we expect the birth of a new search system, one that takes us from all the existing image, video and music information and leads us to new and exciting discoveries, enriching the quality of our lives.

Figure 5: An associative video search engine (Video Vortex).

Figure 6: A cross-media search engine (Tri-Media Vortex).

What is the joy of being a researcher?

My research, the analysis of knowledge discovery processes, involves searching for new answers to the question: “How can we lead people to new discoveries and motivations?” I expect that if the knowledge we obtain can support many people, it will help improve productivity in our society. By searching intelligently, we contribute not only to academic discoveries, but also to revitalizing people’s lifestyles and industry, and to society as a whole. This is a key motivating factor for researchers. Knowing that your own research achievements are making life better for everyone else: that is the joy of being a researcher.