Link to original paper: https://arxiv.org/abs/1406.2661

What are Generative Adversarial Networks (GANs)?

Deep learning has resulted in the development of discriminative models capable of mapping high-dimensional, rich sensory inputs to a class label. For example, discriminative models can learn to differentiate and label images of animals. These successes can be attributed to algorithms like backpropagation and dropout, often utilising piecewise linear units with well-behaved gradients, such as ReLU.

In contrast to discriminative models, deep generative models have had less impact. Generative models aim to model the underlying probability distribution of the data, allowing for the generation of new samples that resemble the original dataset. However, deep generative models face several challenges:

  1. Complexity in Probabilistic Computations: Probabilistic computations, such as those required for maximum likelihood estimation, can become intractable (computationally infeasible to carry out exactly) as the complexity of the model increases (see the sketch after this list).
  2. Difficulty Leveraging Piecewise Linear Units: It has been difficult to bring piecewise linear units like ReLU, the same units behind the success of discriminative models, into the generative setting.
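
To make the first challenge concrete: maximum likelihood chooses the parameters that make the training data most probable, and for many deep generative models (for example, Boltzmann machines) the model density hides a normalising constant whose sum ranges over exponentially many configurations. A generic sketch of this, with an unnormalised density as an illustrative stand-in for any such model:

```latex
% Maximum likelihood: pick the parameters that make the data most probable.
\theta^{*} = \arg\max_{\theta} \sum_{i=1}^{n} \log p_{\theta}\!\left(x^{(i)}\right)

% For an unnormalised model density \tilde{p}_{\theta}, each term hides a
% normalising constant Z(\theta) that is intractable for rich models.
p_{\theta}(x) = \frac{\tilde{p}_{\theta}(x)}{Z(\theta)},
\qquad
Z(\theta) = \sum_{x} \tilde{p}_{\theta}(x)
```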

Image from Al Gharakhanian

In this paper, the authors introduce the concept of adversarial networks, where a generative model is pitted against an adversary: a discriminative model that learns to determine whether a sample was produced by the generative model or drawn from the actual data distribution. The analogy used is that the generative model represents a team of counterfeiters trying to produce fake currency and use it without detection, while the discriminative model is like the police, trying to detect the counterfeit currency. This competition drives both teams to improve their methods until the generated samples are indistinguishable from genuine data samples.

This competition between the two networks is the sole training criterion. GANs are framed as a minimax game: one player’s gain is the other player’s loss, and each player aims to minimise its maximum possible loss. The goal of training is to reach the equilibrium of this game, a saddle point of the shared value function.
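
In the paper’s formulation, the discriminator D and the generator G play the following two-player minimax game over the value function V(D, G), where p_data is the data distribution and p_z is the prior on the input noise:

```latex
\min_{G} \max_{D} V(D, G) =
    \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

Here D(x) is the probability the discriminator assigns to x having come from the data rather than from G, so D tries to maximise V while G tries to minimise it.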

A GAN generates samples by passing random noise through the multilayer perceptron of the generative model; the noise acts as the model’s latent variables and is the source of variation in the generated samples. Both models are trained using only the backpropagation and dropout algorithms, and sampling from the generative model requires only a forward pass, with no need for approximate inference (approximating intractable probability distributions with techniques such as Markov chain Monte Carlo). The absence of feedback loops (recurrent connections) makes GANs better at leveraging piecewise linear units.
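
To make the training loop concrete, here is a minimal PyTorch-style sketch of one adversarial training step. The network sizes, the Adam optimiser, and the single discriminator update per generator update are illustrative assumptions, not details fixed by the paper (which alternates k discriminator steps with one generator step and trains with minibatch stochastic gradient descent):

```python
import torch
import torch.nn as nn

# Hypothetical dimensions chosen for illustration only.
noise_dim, data_dim, hidden = 100, 784, 256

# Generator: maps random noise z to a sample G(z) via an MLP.
G = nn.Sequential(
    nn.Linear(noise_dim, hidden), nn.ReLU(),
    nn.Linear(hidden, data_dim), nn.Tanh(),
)

# Discriminator: outputs the probability that its input came
# from the data distribution rather than from G.
D = nn.Sequential(
    nn.Linear(data_dim, hidden), nn.ReLU(),
    nn.Linear(hidden, 1), nn.Sigmoid(),
)

opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_batch: torch.Tensor) -> None:
    batch_size = real_batch.size(0)
    ones = torch.ones(batch_size, 1)
    zeros = torch.zeros(batch_size, 1)

    # Discriminator step: maximise log D(x) + log(1 - D(G(z))).
    z = torch.randn(batch_size, noise_dim)
    fake = G(z).detach()  # do not backprop into G on this step
    loss_D = bce(D(real_batch), ones) + bce(D(fake), zeros)
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator step: non-saturating heuristic, maximise log D(G(z)).
    z = torch.randn(batch_size, noise_dim)
    loss_G = bce(D(G(z)), ones)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
```

The generator step uses the non-saturating heuristic recommended in the paper: maximising log D(G(z)) rather than minimising log(1 − D(G(z))), which provides stronger gradients early in training when the discriminator can reject samples with high confidence.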

Notation

To minimise confusion, here is the notation used throughout this page: