Half-quadratic Inference and Learning for Natural Images.
Technische Universität, Darmstadt
Ph.D. Thesis (2017)
Available under CC-BY-NC-ND 4.0 International - Creative Commons Attribution Non-commercial No-derivatives 4.0.
|Item Type:||Ph.D. Thesis|
|Title:||Half-quadratic Inference and Learning for Natural Images|
Many problems in computer vision are ill-posed in the sense that there is no unique solution without imposing additional regularization or prior knowledge about the desired result. In this dissertation, we are particularly interested in the restoration of natural images, which aims at recovering a clean image from a corrupted observation, such as an image afflicted by noise or blur.
In a generative approach, it is common to separate modeling of the image prior (regularization term) and the likelihood (data term), where the latter describes the mathematical relationship between the true image and its corrupted observation. By using Bayes' rule, prior and likelihood give rise to the posterior distribution of the restored image, which can then be used to infer the restored image. Alternatively, since prior and likelihood themselves are not actually needed to infer the restored image, the posterior can also be directly modeled in a discriminative approach.
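As a brief illustration of the relationship described above, Bayes' rule can be written as follows. The symbols are assumptions introduced here for illustration and are not taken from the thesis: x denotes the true image and y its corrupted observation. The Gaussian noise likelihood in the second line is only one example of a data term, with an assumed noise standard deviation \sigma.

```latex
\begin{align}
  % posterior = likelihood (data term) times prior (regularization term), up to normalization
  p(x \mid y) &= \frac{p(y \mid x)\, p(x)}{p(y)} \;\propto\; p(y \mid x)\, p(x), \\
  % example likelihood for additive Gaussian noise (an assumption for illustration)
  p(y \mid x) &= \mathcal{N}\!\left(y;\, x,\, \sigma^2 I\right).
\end{align}
```

A discriminative approach models p(x | y) directly instead of specifying p(y | x) and p(x) separately.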
The problem of inference is then to predict a restored image based on the posterior, where it is most common to seek the image with the highest posterior probability. Inference typically involves solving an optimization problem, which can be difficult or slow, especially for the non-convex problems that often arise when trying to model image restoration accurately. To alleviate this issue, a particular optimization strategy known as half-quadratic (HQ) inference by Geman et al. has proven to be very useful. The model is first augmented with auxiliary variables; inference then alternates between updating the restored image and the auxiliary variables, where both of these steps are relatively simple. Half-quadratic inference is a key component of all the contributions put forward in this dissertation. Therefore, the first contribution is to provide a comprehensive review of HQ inference.
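The alternation described above can be sketched for a toy problem. The following is a minimal illustration, not the method of the thesis: it assumes a 1D signal, a Gaussian noise likelihood, pairwise finite differences, and the robust penalty rho(u) = log(1 + (u/c)^2), and it uses the multiplicative form of half-quadratic inference, which coincides with iteratively reweighted least squares. The function names `hq_denoise_1d` and `hq_energy` and all parameter values are hypothetical.

```python
import numpy as np


def hq_energy(x, y, sigma=0.1, lam=1.0, c=0.1):
    """Negative log-posterior (up to a constant) of the assumed toy model."""
    u = np.diff(x)  # finite differences x[i+1] - x[i]
    data_term = np.sum((x - y) ** 2) / (2.0 * sigma ** 2)
    prior_term = lam * np.sum(np.log1p((u / c) ** 2))
    return data_term + prior_term


def hq_denoise_1d(y, sigma=0.1, lam=1.0, c=0.1, n_iters=20):
    """Toy half-quadratic (multiplicative form) denoising of a 1D signal."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Forward-difference operator D of shape (n-1, n): (D @ x)[i] = x[i+1] - x[i]
    D = np.diff(np.eye(n), axis=0)
    x = y.copy()  # initialize with the noisy observation
    for _ in range(n_iters):
        # Step 1: update auxiliary variables (one weight per difference),
        # in closed form given the current signal: w = rho'(u) / u.
        u = D @ x
        w = 2.0 / (c ** 2 + u ** 2)
        # Step 2: update the signal given the weights. The energy is now
        # quadratic in x, so the minimizer solves a linear system.
        A = np.eye(n) / sigma ** 2 + lam * (D.T @ (w[:, None] * D))
        x = np.linalg.solve(A, y / sigma ** 2)
    return x
```

Both steps are simple: the auxiliary update is an elementwise formula and the signal update is a linear solve. For this penalty, each iteration cannot increase the energy computed by `hq_energy`, although it may converge to a local minimum because the problem is non-convex. A call such as `hq_denoise_1d(y)` on a noisy step signal `y` returns an array of the same shape.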
Our second contribution pertains to the issue that the likelihood often hinges on a few parameters (e.g., the strength of assumed image noise), which are specific to the images at hand in a given application. Since these parameters are important but mostly unknown in practice, we address this (often ignored) issue by proposing a sampling-based inference method that makes it possible to estimate such parameters alongside the restored image. Half-quadratic inference plays an important role in making our approach practical.
Devising good image priors is often difficult, especially because natural images (and related scene types) have a complex structure. We address this throughout this thesis by using flexible image models based on Markov random fields (MRFs) and (parameter) learning based on example data. However, instead of hoping to learn a model that (approximately) adheres to some known regularities of the data, it is sometimes desirable to explicitly incorporate domain knowledge into the model. As our third contribution, we address this issue by enforcing invariance to linear transformations in a commonly used class of models. With a focus on rotations, we propose transformation-aware feature learning and demonstrate our learned models in two applications. First, we learn an image prior that enables translation- and rotation-equivariant image denoising. Second, we devise rotation-equi-/invariant image descriptors based on learned rotation-aware features that perform well for rotation-invariant object recognition and detection.
In the following, we revisit and analyze HQ inference and propose an effective discriminative generalization based on a cascade of Gaussian conditional random fields (CRFs). By learning the model and its associated inference algorithm as a single unit, we show that using only a few cascade stages yields excellent results in image denoising and deblurring. In particular, we propose the first discriminative non-blind deblurring approach that works for arbitrary images and blurs.
Finally, we address the issue that many low-level vision algorithms cannot be applied to megapixel-sized images. Based on our discriminative generalization of HQ inference, our final contribution is to learn a particularly efficient model and inference combination that can be applied to large images in a very reasonable amount of time, without compromising on the quality of the restored images.
|Place of Publication:||Darmstadt|
|Classification DDC:||000 Allgemeines, Informatik, Informationswissenschaft > 004 Informatik|
|Divisions:||20 Department of Computer Science > Visual Inference|
|Date Deposited:||17 Mar 2017 14:35|
|Last Modified:||17 Mar 2017 14:35|
|Referees:||Roth, Prof. Dr. Stefan and Favaro, Prof. Dr. Paolo|
|Refereed:||16 December 2016|