Convex optimization for cosegmentation.

Authors
  • JOULIN Armand
  • BACH Francis
  • JORDAN Michael Irwin
  • PONCE Jean
  • SCHMID Cordelia
  • GRAUMAN Kristen
  • SCHUURMANS Dale
Publication date
2012
Publication type
Thesis
Summary
The apparent ease with which humans perceive their surroundings suggests that the process involved is partly mechanical and therefore does not require a high degree of reflection. This observation suggests that our visual perception of the world can be simulated on a computer. Computer vision is the field of research devoted to creating a form of visual perception for computers. In the 1950s, computers lacked the computational power to process and analyze the visual data needed to create a virtual visual perception. Only recently have computing power and storage capacity allowed this field to truly emerge. In two decades, computer vision has made it possible to address practical and industrial problems such as detecting faces, people behaving suspiciously in a crowd, or manufacturing defects on production lines. However, little progress has been made toward virtual visual perception that is not task-specific, and the community still faces fundamental problems. One of these problems is to segment an optical stimulus or an image into meaningful regions, objects or actions. Scene segmentation is natural for humans and essential to fully understanding one's environment. Unfortunately, it is also extremely difficult to reproduce on a computer because there is no clear definition of a "meaningful" region. Indeed, depending on the scene or the situation, a region can have different interpretations. In a street scene, distinguishing a pedestrian is important, whereas the pedestrian's clothes do not necessarily seem so. In a fashion show, by contrast, a piece of clothing becomes an important element and thus a meaningful region. Here, we focus on this segmentation problem and approach it from a particular angle to avoid this fundamental difficulty.
We treat segmentation as a weakly supervised learning problem: instead of segmenting images according to a predefined notion of "significant" regions, we develop methods that simultaneously segment a set of images into regions that appear regularly. We define a statistically significant region as one that appears regularly across the set of images. To this end, we design models whose scope goes beyond applications to vision. Our approach has its roots in statistical learning, whose goal is to design efficient methods to extract and/or learn recurrent patterns in data sets. This field has recently become very popular owing to the growth in the number and size of available databases. We focus here on methods designed to discover "hidden" information in a database from incomplete or non-existent annotations. Finally, our work is rooted in numerical optimization, in order to develop efficient algorithms adapted to our problems. In particular, we use and adapt recently developed tools to relax complex combinatorial problems into convex problems, for which a globally optimal solution is guaranteed. We also illustrate the quality of our formulations and algorithms on problems from domains other than computer vision. In particular, we show that our work can be used in text classification and in cell biology.