A learning framework for higher-order consistency models in multi-class pixel labeling problems

by Kyoungup Park

Institution: Australian National University
Year: 2014
Keywords: Markov random field; Structured Support Vector Machine; dual-decomposition; semantic segmentation; pixel labeling problems; inference; learning
Record ID: 1039662
Full text PDF: http://hdl.handle.net/1885/12686


Recently, higher-order Markov random field (MRF) models have been successfully applied to problems in computer vision, especially scene understanding. One successful higher-order MRF model for scene understanding is the consistency model [Kohli and Kumar, 2010; Kohli et al., 2009], along with earlier work by Ladicky et al. [2009, 2013]; these models contain higher-order potentials composed of lower linear envelope functions. In semantic image segmentation, which seeks to assign each pixel of an image one of a pre-defined set of object and background labels, this model encourages consistent label assignments over segmented regions of the image. However, solving this MRF problem exactly is generally NP-hard, so efficient approximate inference algorithms are used instead. Furthermore, the lower linear envelope functions involve a number of parameters that must be learned, and the cross-validation typically used for pairwise MRF models is not practical for estimating such a large number of parameters. Nevertheless, few works have proposed efficient learning methods that deal with the large number of parameters in these consistency models.
In this thesis, we propose a unified inference and learning framework for the consistency model. We investigate various issues and present solutions for inference and learning with this higher-order MRF model as follows. First, we derive two variants of the consistency model for multi-class pixel labeling tasks. Our model defines an energy function that scores any given label assignment over an image. To perform maximum a posteriori (MAP) inference in this model, we minimize the energy function using move-making algorithms in which the higher-order problems are transformed into tractable pairwise problems. We then employ a max-margin framework for learning optimal parameters. This learning framework provides a generalized approach for searching the large parameter space.
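To illustrate what a potential built from lower linear envelope functions looks like, the following Python sketch evaluates the minimum over a set of linear functions of the fraction of pixels in a region that take a given label. The function name, the uniform pixel weighting, and the slope and intercept values are assumptions chosen for illustration; they are not taken from the thesis.

```python
import numpy as np

def lower_linear_envelope(labels, target, slopes, intercepts):
    """Evaluate min_k (a_k * W + b_k) for one region and one label.

    W is the fraction of pixels in the region labelled `target`
    (uniform pixel weights are an assumption of this sketch).
    """
    labels = np.asarray(labels)
    w = np.mean(labels == target)
    values = np.asarray(slopes) * w + np.asarray(intercepts)
    return float(np.min(values))

# Hypothetical parameters: three lines whose lower envelope is concave,
# zero when the region is label-pure and positive when labels are mixed.
slopes = [2.0, 0.0, -2.0]
intercepts = [0.0, 0.5, 2.0]

pure_cost = lower_linear_envelope([1, 1, 1, 1], 1, slopes, intercepts)
mixed_cost = lower_linear_envelope([1, 1, 0, 0], 1, slopes, intercepts)
```

With these illustrative parameters a region whose pixels all agree incurs no cost, while a mixed region is penalised, which is the sense in which such a potential encourages consistent labels. Each line contributes a slope and an intercept, so the parameter count grows with the number of lines and the number of higher-order terms.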
Second, we propose a novel use of the Gaussian mixture model (GMM) for encoding consistency constraints over a large set of pixels. Here, we use various over-segmentation methods to define coherent regions for the consistency potentials. In general, mean shift (MS) produces locally coherent regions, whereas GMM provides globally coherent regions, which do not need to be contiguous. Our model exploits local and global information together and improves the labeling accuracy on real data sets. Accordingly, we use multiple higher-order terms associated with each over-segmentation method. Our learning framework allows us to deal with the large number of parameters involved with multiple higher-order terms.
Next, we explore a dual decomposition (DD) method for our multi-class consistency model. The dual decomposition MRF (DD-MRF) is an alternative method for optimizing the energy function. In dual decomposition, a complex MRF problem is decomposed into many easy subproblems, and we optimize the relaxed dual problem using a projected subgradient method. At convergence, we expect a global optimum in the dual space because it is…
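The dual decomposition idea can be sketched on a toy problem. The Python example below splits a three-variable pairwise energy into two subproblems, solves each by brute force, and updates the dual variables with a projected subgradient step. All names, the toy potentials, the 1/t step size, and the brute-force subproblem solver are assumptions for illustration; practical decompositions use subproblems with efficient exact solvers, and this is not the thesis's implementation.

```python
import itertools
import numpy as np

def solve_slave(energy_fn, lam, n_vars, n_labels):
    """Brute-force minimiser of energy_fn(x) + sum_i lam[i, x_i] (toy sizes only)."""
    best_x, best_val = None, np.inf
    for x in itertools.product(range(n_labels), repeat=n_vars):
        val = energy_fn(x) + sum(lam[i, xi] for i, xi in enumerate(x))
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

def dual_decomposition(slaves, n_vars, n_labels, iters=100):
    """Return the best dual value found by projected subgradient ascent."""
    lam = np.zeros((len(slaves), n_vars, n_labels))  # sums to zero over slaves
    best_dual = -np.inf
    for t in range(1, iters + 1):
        indicators = np.zeros_like(lam)
        dual = 0.0
        for s, fn in enumerate(slaves):
            x, val = solve_slave(fn, lam[s], n_vars, n_labels)
            dual += val
            indicators[s, np.arange(n_vars), np.array(x)] = 1.0
        best_dual = max(best_dual, dual)
        # Subtracting the mean over slaves projects the subgradient onto the
        # constraint that the dual variables sum to zero for every variable.
        lam += (1.0 / t) * (indicators - indicators.mean(axis=0, keepdims=True))
    return best_dual

# Hypothetical toy energy: three binary variables on a cycle with Potts edges.
unary = np.array([[0.0, 1.0], [1.0, 0.0], [0.2, 0.8]])

def potts(a, b, weight=0.6):
    return 0.0 if a == b else weight

def slave_a(x):  # unaries plus the chain 0-1-2
    unaries = sum(unary[i, xi] for i, xi in enumerate(x))
    return unaries + potts(x[0], x[1]) + potts(x[1], x[2])

def slave_b(x):  # the remaining edge 0-2
    return potts(x[0], x[2])

primal_min = min(slave_a(x) + slave_b(x)
                 for x in itertools.product(range(2), repeat=3))
dual_bound = dual_decomposition([slave_a, slave_b], n_vars=3, n_labels=2)
```

By weak duality, `dual_bound` never exceeds `primal_min`; the gap between the two indicates how far the relaxed dual solution is from the exact minimum energy.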