GANs in computer vision - Introduction to generative learning

GANs in computer vision: Introduction to generative learning (part 1)

In this review article series, we will focus on a plethora of GANs for computer vision applications. Specifically, we will slowly build upon the ideas and principles that led to the evolution of generative adversarial networks (GANs). Along the way, we will encounter different tasks such as conditional image generation, 3D object generation, and video synthesis.


1. Adversarial Learning

2. Vanilla GAN (Generative Adversarial Networks)

3. Conditional GAN (Conditional Generative Adversarial Nets)

4. DCGAN (Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks)

5. InfoGAN (Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets)

6. Improved Techniques for Training GANs

In general, data generation methods appear in a wide variety of modern deep learning applications, from computer vision to natural language processing. At this point, we are able to produce generated data that is nearly indistinguishable from real data to the human eye. Generative learning can be broadly divided into two main categories: a) variational autoencoders (VAEs) and b) generative adversarial networks (GANs).

Why not just autoencoders?

A lot of people still wonder why researchers make things so difficult with GANs. Why not just use an autoencoder and minimize the mean squared error between the target image and the predicted one? The reason is that such models produce poor results for image generation. First, minimizing a pixel-wise distance alone produces blurry predictions because of averaging. Remember that the L1 or L2 loss is a scalar, averaged over all the pixels. When several sharp outputs are equally plausible, the prediction that minimizes an L2 loss is their average, which is intuitively similar to applying a smoothing filter to the image. Second, a model trained this way cannot produce diverse outputs (as variational autoencoders can). That’s why the research community turned its attention to GAN models.
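The averaging argument can be checked on a toy case. The sketch below is only an illustration, not code from the article: it assumes a single "pixel" whose plausible target values are either black (0.0) or white (1.0), and the names `targets`, `mse`, `candidates`, and `best` are made up for this example. It scans constant predictions and reports the one with the lowest mean squared error.

```python
# Toy illustration (hypothetical data, not from the article):
# one "pixel" whose plausible sharp values are 0.0 (black) or 1.0 (white).
targets = [0.0, 1.0, 0.0, 1.0, 1.0, 0.0]


def mse(prediction, targets):
    """Mean squared error of one constant prediction against all targets."""
    return sum((prediction - t) ** 2 for t in targets) / len(targets)


# Scan candidate predictions in [0, 1] in steps of 0.01 and keep the
# one with the lowest mean squared error.
candidates = [i / 100 for i in range(101)]
best = min(candidates, key=lambda p: mse(p, targets))

print("MSE-optimal prediction:", best)
print("loss of the optimal prediction:", mse(best, targets))
print("loss of a sharp black prediction:", mse(0.0, targets))
print("loss of a sharp white prediction:", mse(1.0, targets))
```

In this setup the lowest loss is reached by a mid-gray value that never occurs in the data, while both sharp predictions are penalized more heavily. Applied to every pixel of an image, the same effect shows up as blur.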

So let’s start our GAN review journey! But let’s start from the very beginning: what is adversarial learning anyway?

Adversarial Learning

It has been experimentally validated that deep learning models are highly vulnerable to attacks based on small modifications of the model’s input at test time. Suppose you have a trained classifier that correctly recognizes an object in an image.

It’s possible to construct an adversarial example: an image that is visually indistinguishable from the original, yet is classified incorrectly. Such adversarial images can be constructed by adding a small noise perturbation to the input. To address this problem, a common approach is to inject adversarial examples into the training set (adversarial training), which can increase the neural network’s robustness. This type of example can be...