Chapter 4: Project Face Generation with GANs
4.2 Model Creation
Creating a GAN model involves designing both the generator and the discriminator. The generator's role is to produce realistic images from random noise, while the discriminator's role is to distinguish between real images from the dataset and fake images generated by the generator. By training these two networks adversarially, we aim to produce a generator capable of creating highly realistic facial images.
4.2.1 Building the Generator
The generator is a neural network that takes random noise as input and transforms it into a realistic image. For our face generation project, we'll use a deep convolutional generator. The architecture will include several layers of transposed convolutions, batch normalization, and activation functions to progressively upsample the input noise into a full-sized image.
Key Components:
- Dense Layer: The first layer will be a dense layer that takes the input noise and projects it into a higher-dimensional space.
- Reshape Layer: This layer reshapes the output of the dense layer into a 3D tensor suitable for convolutional operations.
- Transposed Convolutional Layers: These layers (also known as deconvolutional layers) will upsample the tensor to the desired image size.
- Batch Normalization: Batch normalization will be applied after each hidden transposed convolution (but not after the output layer) to stabilize and accelerate the training process.
- Activation Functions: We'll use LeakyReLU activations in hidden layers and Tanh activation in the output layer to ensure the pixel values are in the range [-1, 1].
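Because the Tanh output lies in [-1, 1], the real training images are usually rescaled to the same range so that both kinds of input look alike to the discriminator. The sketch below assumes 8-bit pixel values in [0, 255]; the helper names to_tanh_range and from_tanh_range are hypothetical and not part of any library.

```python
def to_tanh_range(pixel):
    """Map an 8-bit pixel value in [0, 255] to the range [-1.0, 1.0]."""
    return pixel / 127.5 - 1.0


def from_tanh_range(value):
    """Map a generator output in [-1.0, 1.0] back to an integer in [0, 255]."""
    return int(round((value + 1.0) * 127.5))


# The darkest and brightest pixels land on the ends of the tanh range.
scaled = [to_tanh_range(p) for p in (0, 255)]
print(scaled)

# Converting generated values back gives displayable 8-bit pixels.
restored = [from_tanh_range(v) for v in (-1.0, 1.0)]
print(restored)
```

The second helper is the one you would apply to generated images before saving or plotting them.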
Example: Generator Code
import tensorflow as tf
from tensorflow.keras.layers import Dense, Reshape, BatchNormalization, LeakyReLU, Conv2DTranspose
from tensorflow.keras.models import Sequential

def build_generator(latent_dim):
    model = Sequential()
    # Dense layer: project the noise vector, then reshape it to an 8x8x256 tensor
    model.add(Dense(256 * 8 * 8, activation="relu", input_dim=latent_dim))
    model.add(Reshape((8, 8, 256)))
    # Transposed convolutional layers: 8x8 -> 16x16 -> 32x32 -> 64x64
    model.add(Conv2DTranspose(128, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2DTranspose(64, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2DTranspose(32, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    # Output layer: strides=1 keeps the 64x64 size, so the output is 64x64x3
    # and matches the image shape the discriminator expects
    model.add(Conv2DTranspose(3, kernel_size=4, strides=1, padding='same', activation='tanh'))
    return model

# Define the latent dimension (size of the random noise vector)
latent_dim = 100
generator = build_generator(latent_dim)
generator.summary()
The model is built using the Sequential API, starting with a dense layer that takes a noise vector of size latent_dim as input. The output is reshaped into an 8x8x256 tensor.
It then applies multiple layers of transposed convolution (Conv2DTranspose), each followed by batch normalization and a LeakyReLU activation. These layers progressively upsample the tensor to larger spatial dimensions.
The final Conv2DTranspose layer produces the output image: a tensor with 3 color channels (RGB) and a tanh activation, so every pixel value falls in the range [-1, 1].
At the end, the generator's structure is printed using the summary() method.
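The generator's output height and width must match the image shape the discriminator expects, so it is worth checking the upsampling arithmetic by hand. The sketch below relies on the fact that a Conv2DTranspose layer with padding='same' multiplies the spatial size by its stride; the function name transposed_conv_output_size is hypothetical.

```python
def transposed_conv_output_size(size, strides):
    """Spatial size after a stack of 'same'-padded transposed convolutions.

    With padding='same', each layer multiplies height and width by its stride.
    """
    for stride in strides:
        size *= stride
    return size


start = 8  # spatial size of the tensor after the Reshape layer

# Three stride-2 layers take an 8x8 tensor to 64x64.
print(transposed_conv_output_size(start, [2, 2, 2]))     # 64

# A fourth stride-2 layer would double it again to 128x128.
print(transposed_conv_output_size(start, [2, 2, 2, 2]))  # 128
```

Comparing this number with the Output Shape column of generator.summary() is a quick way to catch a size mismatch before the two networks are connected.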
4.2.2 Building the Discriminator
The discriminator is a neural network that takes an image as input and outputs a probability indicating whether the image is real (from the dataset) or fake (generated by the generator). For our project, we'll use a deep convolutional discriminator with several convolutional layers, batch normalization, and activation functions.
Key Components:
- Convolutional Layers: These layers will downsample the input image, extracting hierarchical features at different levels of abstraction.
- Batch Normalization: Applied after every convolutional layer except the first to stabilize and accelerate training.
- Activation Functions: We'll use LeakyReLU activations in hidden layers, which keep a small, non-zero gradient for negative inputs, and a sigmoid activation in the output layer to produce a probability.
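To make the LeakyReLU behavior concrete, here is a minimal pure-Python sketch of the function and its gradient, using the slope of 0.2 that appears in this chapter's models. The function names are hypothetical and this is an illustration, not the Keras implementation.

```python
def leaky_relu(x, alpha=0.2):
    """Pass positive inputs through unchanged; scale negative inputs by alpha."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x, alpha=0.2):
    """Gradient is 1 for positive inputs and alpha (not 0) for negative ones."""
    return 1.0 if x > 0 else alpha


for value in (3.0, -5.0):
    print(value, leaky_relu(value), leaky_relu_grad(value))
```

A plain ReLU would return a gradient of 0 for the negative input, so units that receive negative values would stop learning; the small slope avoids that.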
Example: Discriminator Code
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Flatten, Dense, LeakyReLU, BatchNormalization
from tensorflow.keras.models import Sequential

def build_discriminator(img_shape):
    model = Sequential()
    # Convolutional layers: each stride-2 convolution halves the height and width
    model.add(Conv2D(64, kernel_size=4, strides=2, padding='same', input_shape=img_shape))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2D(128, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2D(256, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2D(512, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    # Output layer: flatten the feature maps and predict a single probability
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    return model

# Define the image shape (e.g., 64x64 RGB images)
img_shape = (64, 64, 3)
discriminator = build_discriminator(img_shape)
discriminator.summary()
This example defines a function that builds the discriminator using TensorFlow and Keras. The model is a stack of four stride-2 convolutional layers, each followed by a LeakyReLU activation; every convolution except the first is also followed by batch normalization.
These layers extract features from the input image while reducing its spatial size. The output layer is a dense layer with a single unit and a sigmoid activation, which outputs the probability that the input image is real. Finally, the model is built for the specified image shape and its structure is printed with summary().
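The downsampling can be checked with the same kind of arithmetic used for the generator. The sketch below relies on the fact that a Conv2D layer with padding='same' produces an output of size ceil(input / stride); the function name conv_output_size is hypothetical.

```python
import math


def conv_output_size(size, strides):
    """Spatial size after a stack of 'same'-padded strided convolutions."""
    for stride in strides:
        size = math.ceil(size / stride)
    return size


# Four stride-2 convolutions on a 64x64 image: 64 -> 32 -> 16 -> 8 -> 4
side = conv_output_size(64, [2, 2, 2, 2])

# The last convolution has 512 filters, so Flatten yields side * side * 512 values.
flattened = side * side * 512

print(side, flattened)  # 4 8192
```

These are the numbers to look for in the Output Shape column of discriminator.summary().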
4.2.3 Compiling the Models
Before training the GAN, we need to compile the discriminator and the combined GAN model. The discriminator will be compiled separately with a binary cross-entropy loss function and an optimizer. The combined GAN model, which includes the generator and the discriminator, will also be compiled with a binary cross-entropy loss function and an optimizer.
Compiling the Discriminator:
# Compile the discriminator
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
The discriminator is trained to minimize binary cross-entropy, the standard loss for binary classification problems, and its accuracy is tracked as a metric during training.
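As a rough illustration of what this loss measures, the following pure-Python sketch computes binary cross-entropy for a small batch. The label convention (1 for real, 0 for fake), the clipping constant eps, and the sample predictions are assumptions made for the example; Keras uses its own implementation.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 1, 0, 0]  # assumed convention: 1 = real image, 0 = generated image

confident = binary_cross_entropy(labels, [0.9, 0.8, 0.1, 0.2])
confused = binary_cross_entropy(labels, [0.4, 0.3, 0.6, 0.7])

print(confident < confused)  # True: better predictions give a lower loss
```

A discriminator that outputs 0.5 for every image scores log(2), roughly 0.693, which is a useful reference value when reading training logs.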
Compiling the Combined GAN Model:
For the combined GAN model, we first need to freeze the discriminator's weights to ensure only the generator is trained during the combined model's training phase. The combined model takes noise as input, generates an image, and then evaluates the generated image using the discriminator.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
# Freeze the discriminator's weights during the combined model training
discriminator.trainable = False
# Create the combined GAN model
gan_input = Input(shape=(latent_dim,))
generated_img = generator(gan_input)
gan_output = discriminator(generated_img)
gan = Model(gan_input, gan_output)
gan.compile(optimizer='adam', loss='binary_crossentropy')
# Summary of the combined GAN model
gan.summary()
Here, the discriminator's weights are frozen so that they are not updated when the combined model is trained. The generator takes a noise vector of size latent_dim as input and generates an image. This generated image is then passed to the discriminator, which classifies it as real or fake.
The combined GAN model is compiled with the 'adam' optimizer and 'binary_crossentropy' loss function, which is suitable for a binary classification problem.
Finally, a summary of the combined GAN model is displayed, offering an overview of the model's architecture and parameters.
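To see why binary cross-entropy on the combined model trains the generator, assume (as is common, though not specified here) that generated images are labeled 1, meaning "real", when the combined model is trained. For a batch of N noise vectors z_i, with G the generator and D the frozen discriminator, the loss then reduces to:

```latex
L_G = -\frac{1}{N} \sum_{i=1}^{N} \Big[ 1 \cdot \log D(G(z_i)) + (1 - 1) \cdot \log\big(1 - D(G(z_i))\big) \Big]
    = -\frac{1}{N} \sum_{i=1}^{N} \log D(G(z_i))
```

Minimizing this quantity pushes D(G(z_i)) toward 1, which means the generator is rewarded for producing images that the discriminator scores as real.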
4.2.4 Visualizing the Model Architectures
Visualizing the architectures of both the generator and discriminator can provide insights into their structures and help identify any potential issues.
Visualizing the Generator:
from tensorflow.keras.utils import plot_model
plot_model(generator, to_file='generator_model.png', show_shapes=True, show_layer_names=True)
The plot_model function renders the generator's architecture as a diagram and saves it as 'generator_model.png'. Setting show_shapes=True displays the input and output dimensions of each layer, and show_layer_names=True displays each layer's name. Note that plot_model requires the pydot package and Graphviz to be installed.
Visualizing the Discriminator:
plot_model(discriminator, to_file='discriminator_model.png', show_shapes=True, show_layer_names=True)
This call uses the same plot_model function to visualize the structure of the discriminator and saves the result as 'discriminator_model.png'. As before, show_shapes=True and show_layer_names=True add the layer shapes and layer names to the diagram.
By successfully creating and compiling the generator, discriminator, and the combined GAN model, we have laid the groundwork for training our GAN. The next steps involve training the GAN on the CelebA dataset, monitoring its performance, and evaluating the quality of the generated images.
4.2 Model Creation
Creating a GAN model involves designing both the generator and the discriminator. The generator's role is to produce realistic images from random noise, while the discriminator's role is to distinguish between real images from the dataset and fake images generated by the generator. By training these two networks adversarially, we aim to produce a generator capable of creating highly realistic facial images.
4.2.1 Building the Generator
The generator is a neural network that takes random noise as input and transforms it into a realistic image. For our face generation project, we'll use a deep convolutional generator. The architecture will include several layers of transposed convolutions, batch normalization, and activation functions to progressively upsample the input noise into a full-sized image.
Key Components:
- Dense Layer: The first layer will be a dense layer that takes the input noise and projects it into a higher-dimensional space.
- Reshape Layer: This layer reshapes the output of the dense layer into a 3D tensor suitable for convolutional operations.
- Transposed Convolutional Layers: These layers (also known as deconvolutional layers) will upsample the tensor to the desired image size.
- Batch Normalization: Batch normalization will be applied after each transposed convolution to stabilize and accelerate the training process.
- Activation Functions: We'll use LeakyReLU activations in hidden layers and Tanh activation in the output layer to ensure the pixel values are in the range [-1, 1].
Example: Generator Code
import tensorflow as tf
from tensorflow.keras.layers import Dense, Reshape, BatchNormalization, LeakyReLU, Conv2DTranspose
from tensorflow.keras.models import Sequential
def build_generator(latent_dim):
model = Sequential()
# Dense layer
model.add(Dense(256 * 8 * 8, activation="relu", input_dim=latent_dim))
model.add(Reshape((8, 8, 256)))
# Transposed convolutional layers
model.add(Conv2DTranspose(128, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2DTranspose(64, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2DTranspose(32, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
# Output layer
model.add(Conv2DTranspose(3, kernel_size=4, strides=2, padding='same', activation='tanh'))
return model
# Define the latent dimension (size of the random noise vector)
latent_dim = 100
generator = build_generator(latent_dim)
generator.summary()
The model is built using the Sequential API, starting with a dense layer that takes the latent_dim (size of the random noise vector) as input. The output is reshaped into a 8x8x256 tensor.
It then applies multiple layers of transposed convolution (Conv2DTranspose), each followed by batch normalization and a LeakyReLU activation. These layers progressively upsample the tensor to larger spatial dimensions.
The final Conv2DTranspose layer outputs a tensor with shape corresponding to an image, with 3 color channels (RGB), and tanh activation.
At the end, the generator's structure is printed using the summary() method.
4.2.2 Building the Discriminator
The discriminator is a neural network that takes an image as input and outputs a probability indicating whether the image is real (from the dataset) or fake (generated by the generator). For our project, we'll use a deep convolutional discriminator with several convolutional layers, batch normalization, and activation functions.
Key Components:
- Convolutional Layers: These layers will downsample the input image, extracting hierarchical features at different levels of abstraction.
- Batch Normalization: Applied after each convolutional layer to stabilize and accelerate training.
- Activation Functions: We'll use LeakyReLU activations in hidden layers to allow small negative gradients and a sigmoid activation in the output layer to produce a probability.
Example: Discriminator Code
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Flatten, Dense, LeakyReLU, BatchNormalization
from tensorflow.keras.models import Sequential
def build_discriminator(img_shape):
model = Sequential()
# Convolutional layers
model.add(Conv2D(64, kernel_size=4, strides=2, padding='same', input_shape=img_shape))
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2D(128, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2D(256, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2D(512, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
# Output layer
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
return model
# Define the image shape (e.g., 64x64 RGB images)
img_shape = (64, 64, 3)
discriminator = build_discriminator(img_shape)
discriminator.summary()
This example code defines a function that builds a discriminator model for a Generative Adversarial Network (GAN) using the TensorFlow and Keras libraries. The model has a series of convolutional layers, each followed by a LeakyReLU activation function and some include Batch Normalization.
These layers are used to extract features from the input images. The output layer is a dense layer with a single unit and a sigmoid activation function, which will output the probability of the input image being real. The model is then built with a specified image shape and the structure of the model is displayed.
4.2.3 Compiling the Models
Before training the GAN, we need to compile the discriminator and the combined GAN model. The discriminator will be compiled separately with a binary cross-entropy loss function and an optimizer. The combined GAN model, which includes the generator and the discriminator, will also be compiled with a binary cross-entropy loss function and an optimizer.
Compiling the Discriminator:
# Compile the discriminator
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
The model will be trained to minimize a function called 'binary cross entropy' (commonly used in binary classification problems) and the accuracy of the model will be tracked as a metric during the training process.
Compiling the Combined GAN Model:
For the combined GAN model, we first need to freeze the discriminator's weights to ensure only the generator is trained during the combined model's training phase. The combined model takes noise as input, generates an image, and then evaluates the generated image using the discriminator.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
# Freeze the discriminator's weights during the combined model training
discriminator.trainable = False
# Create the combined GAN model
gan_input = Input(shape=(latent_dim,))
generated_img = generator(gan_input)
gan_output = discriminator(generated_img)
gan = Model(gan_input, gan_output)
gan.compile(optimizer='adam', loss='binary_crossentropy')
# Summary of the combined GAN model
gan.summary()
Here, the Discriminator's weights are frozen to prevent it from training during the combined model's training. The Generator takes an input (the latent dimension) and generates an image. This generated image is then passed to the Discriminator which classifies it as real or fake.
The combined GAN model is compiled with the 'adam' optimizer and 'binary_crossentropy' loss function, which is suitable for a binary classification problem.
Finally, a summary of the combined GAN model is displayed, offering an overview of the model's architecture and parameters.
4.2.4 Visualizing the Model Architectures
Visualizing the architectures of both the generator and discriminator can provide insights into their structures and help identify any potential issues.
Visualizing the Generator:
from tensorflow.keras.utils import plot_model
plot_model(generator, to_file='generator_model.png', show_shapes=True, show_layer_names=True)
The 'plot_model' function is used to create this visualization, which will be saved as 'generator_model.png'. The 'show_shapes' parameter is set to 'True' to display the dimensions of the model's layers, and 'show_layer_names' is set to 'True' to display the names of each layer in the model.
Visualizing the Discriminator:
plot_model(discriminator, to_file='discriminator_model.png', show_shapes=True, show_layer_names=True)
This line of code uses the plot_model
function from the Keras library to create a visualization of the structure of the 'discriminator' model. It saves this visualization as a .png file named 'discriminator_model.png'. The parameters 'show_shapes=True' and 'show_layer_names=True' indicate that the visualization should include the shapes of the model's layers and the names of each layer.
By successfully creating and compiling the generator, discriminator, and the combined GAN model, we have laid the groundwork for training our GAN. The next steps involve training the GAN on the CelebA dataset, monitoring its performance, and evaluating the quality of the generated images.
4.2 Model Creation
Creating a GAN model involves designing both the generator and the discriminator. The generator's role is to produce realistic images from random noise, while the discriminator's role is to distinguish between real images from the dataset and fake images generated by the generator. By training these two networks adversarially, we aim to produce a generator capable of creating highly realistic facial images.
4.2.1 Building the Generator
The generator is a neural network that takes random noise as input and transforms it into a realistic image. For our face generation project, we'll use a deep convolutional generator. The architecture will include several layers of transposed convolutions, batch normalization, and activation functions to progressively upsample the input noise into a full-sized image.
Key Components:
- Dense Layer: The first layer will be a dense layer that takes the input noise and projects it into a higher-dimensional space.
- Reshape Layer: This layer reshapes the output of the dense layer into a 3D tensor suitable for convolutional operations.
- Transposed Convolutional Layers: These layers (also known as deconvolutional layers) will upsample the tensor to the desired image size.
- Batch Normalization: Batch normalization will be applied after each transposed convolution to stabilize and accelerate the training process.
- Activation Functions: We'll use LeakyReLU activations in hidden layers and Tanh activation in the output layer to ensure the pixel values are in the range [-1, 1].
Example: Generator Code
import tensorflow as tf
from tensorflow.keras.layers import Dense, Reshape, BatchNormalization, LeakyReLU, Conv2DTranspose
from tensorflow.keras.models import Sequential
def build_generator(latent_dim):
model = Sequential()
# Dense layer
model.add(Dense(256 * 8 * 8, activation="relu", input_dim=latent_dim))
model.add(Reshape((8, 8, 256)))
# Transposed convolutional layers
model.add(Conv2DTranspose(128, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2DTranspose(64, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2DTranspose(32, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
# Output layer
model.add(Conv2DTranspose(3, kernel_size=4, strides=2, padding='same', activation='tanh'))
return model
# Define the latent dimension (size of the random noise vector)
latent_dim = 100
generator = build_generator(latent_dim)
generator.summary()
The model is built using the Sequential API, starting with a dense layer that takes the latent_dim (size of the random noise vector) as input. The output is reshaped into a 8x8x256 tensor.
It then applies multiple layers of transposed convolution (Conv2DTranspose), each followed by batch normalization and a LeakyReLU activation. These layers progressively upsample the tensor to larger spatial dimensions.
The final Conv2DTranspose layer outputs a tensor with shape corresponding to an image, with 3 color channels (RGB), and tanh activation.
At the end, the generator's structure is printed using the summary() method.
4.2.2 Building the Discriminator
The discriminator is a neural network that takes an image as input and outputs a probability indicating whether the image is real (from the dataset) or fake (generated by the generator). For our project, we'll use a deep convolutional discriminator with several convolutional layers, batch normalization, and activation functions.
Key Components:
- Convolutional Layers: These layers will downsample the input image, extracting hierarchical features at different levels of abstraction.
- Batch Normalization: Applied after each convolutional layer to stabilize and accelerate training.
- Activation Functions: We'll use LeakyReLU activations in hidden layers to allow small negative gradients and a sigmoid activation in the output layer to produce a probability.
Example: Discriminator Code
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Flatten, Dense, LeakyReLU, BatchNormalization
from tensorflow.keras.models import Sequential
def build_discriminator(img_shape):
model = Sequential()
# Convolutional layers
model.add(Conv2D(64, kernel_size=4, strides=2, padding='same', input_shape=img_shape))
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2D(128, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2D(256, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2D(512, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
# Output layer
model.add(Flatten())
model.add(Dense(1, activation='sigmoid'))
return model
# Define the image shape (e.g., 64x64 RGB images)
img_shape = (64, 64, 3)
discriminator = build_discriminator(img_shape)
discriminator.summary()
This example code defines a function that builds a discriminator model for a Generative Adversarial Network (GAN) using the TensorFlow and Keras libraries. The model has a series of convolutional layers, each followed by a LeakyReLU activation function and some include Batch Normalization.
These layers are used to extract features from the input images. The output layer is a dense layer with a single unit and a sigmoid activation function, which will output the probability of the input image being real. The model is then built with a specified image shape and the structure of the model is displayed.
4.2.3 Compiling the Models
Before training the GAN, we need to compile the discriminator and the combined GAN model. The discriminator will be compiled separately with a binary cross-entropy loss function and an optimizer. The combined GAN model, which includes the generator and the discriminator, will also be compiled with a binary cross-entropy loss function and an optimizer.
Compiling the Discriminator:
# Compile the discriminator
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
The model will be trained to minimize a function called 'binary cross entropy' (commonly used in binary classification problems) and the accuracy of the model will be tracked as a metric during the training process.
Compiling the Combined GAN Model:
For the combined GAN model, we first need to freeze the discriminator's weights to ensure only the generator is trained during the combined model's training phase. The combined model takes noise as input, generates an image, and then evaluates the generated image using the discriminator.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
# Freeze the discriminator's weights during the combined model training
discriminator.trainable = False
# Create the combined GAN model
gan_input = Input(shape=(latent_dim,))
generated_img = generator(gan_input)
gan_output = discriminator(generated_img)
gan = Model(gan_input, gan_output)
gan.compile(optimizer='adam', loss='binary_crossentropy')
# Summary of the combined GAN model
gan.summary()
Here, the Discriminator's weights are frozen to prevent it from training during the combined model's training. The Generator takes an input (the latent dimension) and generates an image. This generated image is then passed to the Discriminator which classifies it as real or fake.
The combined GAN model is compiled with the 'adam' optimizer and 'binary_crossentropy' loss function, which is suitable for a binary classification problem.
Finally, a summary of the combined GAN model is displayed, offering an overview of the model's architecture and parameters.
4.2.4 Visualizing the Model Architectures
Visualizing the architectures of both the generator and discriminator can provide insights into their structures and help identify any potential issues.
Visualizing the Generator:
from tensorflow.keras.utils import plot_model
plot_model(generator, to_file='generator_model.png', show_shapes=True, show_layer_names=True)
The 'plot_model' function is used to create this visualization, which will be saved as 'generator_model.png'. The 'show_shapes' parameter is set to 'True' to display the dimensions of the model's layers, and 'show_layer_names' is set to 'True' to display the names of each layer in the model.
Visualizing the Discriminator:
plot_model(discriminator, to_file='discriminator_model.png', show_shapes=True, show_layer_names=True)
This line of code uses the plot_model
function from the Keras library to create a visualization of the structure of the 'discriminator' model. It saves this visualization as a .png file named 'discriminator_model.png'. The parameters 'show_shapes=True' and 'show_layer_names=True' indicate that the visualization should include the shapes of the model's layers and the names of each layer.
By successfully creating and compiling the generator, discriminator, and the combined GAN model, we have laid the groundwork for training our GAN. The next steps involve training the GAN on the CelebA dataset, monitoring its performance, and evaluating the quality of the generated images.
4.2 Model Creation
Creating a GAN model involves designing both the generator and the discriminator. The generator's role is to produce realistic images from random noise, while the discriminator's role is to distinguish between real images from the dataset and fake images generated by the generator. By training these two networks adversarially, we aim to produce a generator capable of creating highly realistic facial images.
4.2.1 Building the Generator
The generator is a neural network that takes random noise as input and transforms it into a realistic image. For our face generation project, we'll use a deep convolutional generator. The architecture will include several layers of transposed convolutions, batch normalization, and activation functions to progressively upsample the input noise into a full-sized image.
Key Components:
- Dense Layer: The first layer will be a dense layer that takes the input noise and projects it into a higher-dimensional space.
- Reshape Layer: This layer reshapes the output of the dense layer into a 3D tensor suitable for convolutional operations.
- Transposed Convolutional Layers: These layers (also known as deconvolutional layers) will upsample the tensor to the desired image size.
- Batch Normalization: Batch normalization will be applied after each transposed convolution to stabilize and accelerate the training process.
- Activation Functions: We'll use LeakyReLU activations in hidden layers and Tanh activation in the output layer to ensure the pixel values are in the range [-1, 1].
Example: Generator Code
import tensorflow as tf
from tensorflow.keras.layers import Dense, Reshape, BatchNormalization, LeakyReLU, Conv2DTranspose
from tensorflow.keras.models import Sequential
def build_generator(latent_dim):
model = Sequential()
# Dense layer
model.add(Dense(256 * 8 * 8, activation="relu", input_dim=latent_dim))
model.add(Reshape((8, 8, 256)))
# Transposed convolutional layers
model.add(Conv2DTranspose(128, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2DTranspose(64, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
model.add(Conv2DTranspose(32, kernel_size=4, strides=2, padding='same'))
model.add(BatchNormalization())
model.add(LeakyReLU(alpha=0.2))
# Output layer
model.add(Conv2DTranspose(3, kernel_size=4, strides=2, padding='same', activation='tanh'))
return model
# Define the latent dimension (size of the random noise vector)
latent_dim = 100
generator = build_generator(latent_dim)
generator.summary()
The model is built using the Sequential API, starting with a dense layer that takes the latent_dim (size of the random noise vector) as input. The output is reshaped into a 8x8x256 tensor.
It then applies multiple layers of transposed convolution (Conv2DTranspose), each followed by batch normalization and a LeakyReLU activation. These layers progressively upsample the tensor to larger spatial dimensions.
The final Conv2DTranspose layer outputs a tensor with shape corresponding to an image, with 3 color channels (RGB), and tanh activation.
At the end, the generator's structure is printed using the summary() method.
4.2.2 Building the Discriminator
The discriminator is a neural network that takes an image as input and outputs a probability indicating whether the image is real (from the dataset) or fake (generated by the generator). For our project, we'll use a deep convolutional discriminator with several convolutional layers, batch normalization, and activation functions.
Key Components:
- Convolutional Layers: These layers will downsample the input image, extracting hierarchical features at different levels of abstraction.
- Batch Normalization: Applied after each convolutional layer to stabilize and accelerate training.
- Activation Functions: We'll use LeakyReLU activations in hidden layers to allow small negative gradients and a sigmoid activation in the output layer to produce a probability.
Example: Discriminator Code
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, Flatten, Dense, LeakyReLU, BatchNormalization
from tensorflow.keras.models import Sequential
def build_discriminator(img_shape):
    model = Sequential()
    # Convolutional layers: each stride-2 layer halves the height and width
    # First block omits batch normalization
    model.add(Conv2D(64, kernel_size=4, strides=2, padding='same', input_shape=img_shape))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2D(128, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2D(256, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    model.add(Conv2D(512, kernel_size=4, strides=2, padding='same'))
    model.add(BatchNormalization())
    model.add(LeakyReLU(alpha=0.2))
    # Output layer: flatten the feature maps and predict P(image is real)
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    return model
# Define the image shape (e.g., 64x64 RGB images); it must match the generator's output shape
img_shape = (64, 64, 3)
discriminator = build_discriminator(img_shape)
discriminator.summary()
This code defines a function that builds the discriminator using TensorFlow and Keras. The model stacks four convolutional layers, each followed by a LeakyReLU activation; all but the first also apply batch normalization before the activation.
These layers extract features from the input image while halving its height and width at each step. The output layer is a dense layer with a single unit and a sigmoid activation, which outputs the probability that the input image is real. Finally, the model is built for the specified image shape and its structure is printed with summary().
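The downsampling can be traced by hand in the same spirit. The sketch below follows the spatial size through the four stride-2 convolutions of build_discriminator for a 64x64 input and computes how many values reach the final dense layer. The helper name trace_downsampling is hypothetical.

```python
import math


def trace_downsampling(start_size, strides):
    """Return the spatial size after each 'same'-padded strided convolution."""
    sizes = [start_size]
    for stride in strides:
        # With padding='same', Conv2D produces ceil(input / stride) along each axis.
        sizes.append(math.ceil(sizes[-1] / stride))
    return sizes


sizes = trace_downsampling(64, [2, 2, 2, 2])
final_filters = 512  # filters in the last Conv2D layer of build_discriminator
flattened_units = sizes[-1] * sizes[-1] * final_filters

print(sizes)            # [64, 32, 16, 8, 4]
print(flattened_units)  # 8192
```

These numbers should match the output shapes reported by discriminator.summary(): a 4x4x512 tensor before Flatten, and 8192 inputs to the single-unit dense layer.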
4.2.3 Compiling the Models
Before training the GAN, we need to compile the discriminator and the combined GAN model. The discriminator is compiled on its own; the combined GAN model, which chains the generator and the discriminator, is compiled separately. Both use a binary cross-entropy loss function and the Adam optimizer.
Compiling the Discriminator:
# Compile the discriminator
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
The discriminator is trained to minimize binary cross-entropy, the standard loss for binary classification: it penalizes the model more heavily the more confidently it assigns the wrong label. Accuracy is also tracked as a metric during training.
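To show what this loss measures, here is a minimal pure-Python sketch of binary cross-entropy. It is an illustration of the formula, not the Keras implementation; the function name, the clipping constant eps, and the sample predictions are all assumptions made for the example.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over paired labels and predicted probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 1, 0, 0]                # 1 = real image, 0 = generated image
confident = [0.9, 0.8, 0.1, 0.2]     # mostly correct, confident predictions
unsure = [0.5, 0.5, 0.5, 0.5]        # a discriminator that cannot tell

print(binary_cross_entropy(labels, confident))
print(binary_cross_entropy(labels, unsure))
```

The confident, mostly correct predictions give the lower loss. A discriminator that outputs 0.5 for everything scores ln 2 (about 0.693), a useful reference value when reading training logs.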
Compiling the Combined GAN Model:
For the combined GAN model, we first need to freeze the discriminator's weights to ensure only the generator is trained during the combined model's training phase. The combined model takes noise as input, generates an image, and then evaluates the generated image using the discriminator.
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input
# Freeze the discriminator's weights during the combined model training
discriminator.trainable = False
# Create the combined GAN model
gan_input = Input(shape=(latent_dim,))
generated_img = generator(gan_input)
gan_output = discriminator(generated_img)
gan = Model(gan_input, gan_output)
gan.compile(optimizer='adam', loss='binary_crossentropy')
# Summary of the combined GAN model
gan.summary()
Here, the discriminator's weights are frozen so that it is not updated when the combined model is trained. The generator takes a noise vector of length latent_dim as input and generates an image, which is then passed to the discriminator to be classified as real or fake.
The combined GAN model is compiled with the 'adam' optimizer and 'binary_crossentropy' loss function, which is suitable for a binary classification problem.
Finally, a summary of the combined GAN model is displayed, offering an overview of the model's architecture and parameters.
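It may help to see why binary cross-entropy on the combined model trains the generator. In the usual setup (an assumption here, since the training loop is covered later), generated images are fed through the combined model with the target label 1, meaning "real". For a single sample the loss then reduces to -log(D(G(z))), which the sketch below evaluates for a few hypothetical discriminator outputs. The function name generator_loss is illustrative.

```python
import math


def generator_loss(d_output, eps=1e-7):
    """Binary cross-entropy of one discriminator output against the target label 1."""
    p = min(max(d_output, eps), 1.0 - eps)  # clip to avoid log(0)
    return -math.log(p)


# Hypothetical discriminator outputs for three generated images.
for d_output in (0.1, 0.5, 0.9):
    print(d_output, round(generator_loss(d_output), 4))
```

The loss falls as the discriminator's output for a generated image approaches 1. Because the discriminator is frozen inside the combined model, the only way to lower this loss is to update the generator so that its images look more real.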
4.2.4 Visualizing the Model Architectures
Visualizing the architectures of both the generator and discriminator can provide insights into their structures and help identify any potential issues.
Visualizing the Generator:
from tensorflow.keras.utils import plot_model
plot_model(generator, to_file='generator_model.png', show_shapes=True, show_layer_names=True)
The 'plot_model' function renders the generator's layer graph and saves it as 'generator_model.png'. Setting 'show_shapes' to 'True' displays each layer's input and output dimensions, and setting 'show_layer_names' to 'True' displays the name of each layer. Note that 'plot_model' requires the pydot package and Graphviz to be installed.
Visualizing the Discriminator:
plot_model(discriminator, to_file='discriminator_model.png', show_shapes=True, show_layer_names=True)
The same plot_model call is applied to the discriminator, saving the diagram as 'discriminator_model.png' with the layer shapes and layer names included.
By successfully creating and compiling the generator, discriminator, and the combined GAN model, we have laid the groundwork for training our GAN. The next steps involve training the GAN on the CelebA dataset, monitoring its performance, and evaluating the quality of the generated images.