Super Resolution Image Enhancement

UW Madison CS766 - Computer Vision, Spring 2020

Prev: Main

Introduction

Generating super-resolution images from low-resolution images has been used in medical imaging [Pham, 2019], [Georgescu, 2020], astronomical imaging [Zhang, 2019] and security imaging [Yang, 2019]. Where small or blurry objects need to be identified, a higher-resolution image may improve the performance of existing object recognition algorithms. If super-resolution techniques deliver on their promises efficiently, lower-resolution or more highly compressed images could be transparently substituted for costly high-resolution images, with benefits for image storage, network transmission and video encoding/decoding. Super-resolution is an interesting task in and of itself, but we propose this project as a way to better understand and explore the potential for super-resolution networks to occupy one step in a pipeline for object detection or semantic segmentation. The ability to locate objects in an image, potentially with sub-pixel precision, has interesting future applications in photogrammetry and metrology. To this end, our proposal centers on the implementation and application of super-resolution.

State of the Art

A recent and comprehensive overview of state-of-the-art super-resolution algorithms and network structures can be found in [Yang, 2019]. Here we give a short discussion of the key points as they relate to the project proposal: generative adversarial networks (GANs) versus supervised learning for super-resolution, and ResNet [He, 2016], a well-studied network structure that has been shown to work well in image enhancement architectures.

The core difficulty of super-resolution (SR) is the non-uniqueness of the high-resolution (HR) image generated from a low-resolution (LR) image: for any LR image there are multiple plausible HR images that the LR image would faithfully represent. One way to generate training data for SR is to downsample an HR image. Unfortunately, generating training samples by downsampling can lead to small artifacts in the network's output when the network is trained to directly undo the downsampling algorithm [Yang, 2019]. One way of circumventing this is to use an unsupervised learning approach, which does not compute a mapping between paired input-output samples but instead computes a mapping from a distribution of inputs to a distribution of outputs. Generative adversarial networks accomplish exactly this and have been shown to work well for SR [Ledig, 2017]. While GANs may improve SR on real-world low-resolution images, the supervised approach gives us more control over the datasets and training, and so lets us explore SR more efficiently. State-of-the-art results from a supervised method are reported in [Lim, 2017].
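The non-uniqueness described above can be illustrated with a small sketch. It is not the project's data pipeline: the function name `downsample_box`, the 2x2 block-averaging downsampler, and the 8x8 random test image are all assumptions chosen for illustration. The sketch builds two different HR images that downsample to exactly the same LR image, which is why an LR input alone cannot determine a single HR output.

```python
import numpy as np

def downsample_box(hr, factor=2):
    """Downsample a 2-D image by averaging non-overlapping factor x factor blocks.

    Hypothetical helper: block averaging is only one of many possible
    downsampling algorithms (bicubic is another common choice).
    """
    h, w = hr.shape
    if h % factor or w % factor:
        raise ValueError("image size must be divisible by the factor")
    # Split each axis into (block index, offset inside block), then average the offsets.
    return hr.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

rng = np.random.default_rng(0)
hr_a = rng.random((8, 8))  # stand-in for a real HR image

# A checkerboard perturbation that sums to zero inside every 2x2 block,
# so block averaging cannot see it.
pattern = np.array([[1.0, -1.0], [-1.0, 1.0]])
hr_b = hr_a + 0.05 * np.tile(pattern, (4, 4))

lr_a = downsample_box(hr_a)
lr_b = downsample_box(hr_b)

print("HR images differ:", not np.allclose(hr_a, hr_b))
print("LR images match: ", np.allclose(lr_a, lr_b))
```

In a supervised setting, pairs such as `(lr_a, hr_a)` would serve as input and target. The sketch also hints at why a network trained this way may end up learning to invert one specific downsampling algorithm.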

In super-resolution, the network architecture plays an important role in the accuracy and efficiency of the network. Many image processing network architectures (including those for SR) are built on a well-studied convolutional network called ResNet. This network uses a residual structure to learn small changes to the image rather than the full mapping. It is made up of residual blocks in which the input to a block is added directly to the output of a later layer. These connections are known as skip connections and are a fundamental component of the ResNet architecture [He, 2016]; they make deep networks easier to optimize, which is often described as simplifying the loss surface and reducing convergence to poor local minima. These residual blocks, along with up-scaling via deconvolution, form the basis of many SR network architectures [Yang, 2019], and a residual-block-based implementation is detailed in [Lim, 2017].
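The skip connection can be sketched in a few lines of NumPy. This is a minimal single-channel illustration, not the architecture of [He, 2016] or [Lim, 2017]: the helper names `conv3x3_same` and `residual_block`, the conv-ReLU-conv layout, and the `res_scale` parameter are assumptions made for the example. Real networks use many channels, learned weights and a deep learning framework.

```python
import numpy as np

def conv3x3_same(x, kernel):
    """3x3 cross-correlation of a 2-D array with zero padding (same output shape)."""
    padded = np.pad(x, 1)
    out = np.zeros(x.shape, dtype=float)
    for di in range(3):
        for dj in range(3):
            # Shifted view of the padded image, weighted by one kernel tap.
            out += kernel[di, dj] * padded[di:di + x.shape[0], dj:dj + x.shape[1]]
    return out

def residual_block(x, k1, k2, res_scale=1.0):
    """conv -> ReLU -> conv, then add the unmodified input (the skip connection).

    res_scale is a hypothetical knob that scales the learned residual
    before it is added back to the input.
    """
    residual = conv3x3_same(x, k1)
    residual = np.maximum(residual, 0.0)  # ReLU
    residual = conv3x3_same(residual, k2)
    return x + res_scale * residual

rng = np.random.default_rng(0)
image = rng.random((6, 6))

# With all-zero weights the block is exactly the identity mapping.
zeros = np.zeros((3, 3))
identity_out = residual_block(image, zeros, zeros)

# With small random weights the output is the input plus a small correction.
k1 = 0.1 * rng.standard_normal((3, 3))
k2 = 0.1 * rng.standard_normal((3, 3))
out = residual_block(image, k1, k2)

print("zero weights give identity:", np.allclose(identity_out, image))
print("largest change from input: ", np.abs(out - image).max())
```

Because the input passes through unchanged, the convolutional layers only need to model the difference between input and desired output, which is the "small changes" behavior described above.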

Asher Elmquist (amelmquist@wisc.edu), Eric Brandt (elbrandt@wisc.edu) 2020

CS766_Project