
Chapter 4: Project Face Generation with GANs

4.1 Data Collection and Preprocessing

In this chapter, we embark on a comprehensive project focused on generating realistic human faces using Generative Adversarial Networks (GANs). This project will guide you through each step of the process, from data collection and preprocessing to model creation, training, and evaluation. By the end of this chapter, you will have hands-on experience in implementing a GAN that can generate high-quality facial images.

Face generation is a compelling application of GANs, demonstrating the power of generative modeling in producing realistic and diverse outputs. This project not only reinforces the theoretical concepts covered in previous chapters but also provides practical insights into tackling real-world generative modeling tasks.

The first step in our face generation project involves collecting and preprocessing the data. High-quality and diverse datasets are crucial for training GANs, as they directly impact the quality and realism of the generated images. For this project, we will use the CelebA dataset, a large-scale face dataset widely used in the research community.

4.1.1 Downloading the CelebA Dataset

The CelebA dataset contains over 200,000 celebrity images with a wide range of facial attributes. It is available for download from various sources, including the official website and academic repositories. Here’s how you can download and prepare the dataset for our project:

  1. Download the Dataset:

    Visit the CelebA dataset page (http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html) and download the aligned and cropped version of the dataset.

  2. Extract the Images:

    Once downloaded, extract the images to a directory on your local machine.

  3. Directory Structure:

    Ensure the images are organized in a directory structure that can be easily accessed for loading and preprocessing; a quick sanity check is sketched after this list.
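
Before moving on, it is worth confirming that the extracted images are actually visible from Python. The short check below is a minimal sketch, assuming the aligned images were extracted into a folder such as 'path/to/celeba/dataset' (the path is illustrative; adjust it to your own layout). As an alternative to the manual download, TensorFlow Datasets also packages CelebA under the name 'celeb_a'.

import os

# Illustrative path to the extracted, aligned CelebA images -- adjust to your setup
dataset_dir = 'path/to/celeba/dataset'

# Count the image files and show a few filenames as a sanity check
image_files = [f for f in os.listdir(dataset_dir)
               if f.lower().endswith(('.jpg', '.jpeg', '.png'))]
print(f'Found {len(image_files)} images')   # the aligned CelebA set contains 202,599 images
print('Sample filenames:', image_files[:3])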

4.1.2 Preprocessing the Images

Preprocessing is a critical step in preparing the data for training. It involves resizing, normalizing, and augmenting the images to ensure they are suitable for input into the GAN. Let’s walk through the preprocessing steps:

  1. Resizing:

    Resize the images to a consistent size (e.g., 64x64 pixels) to standardize the input dimensions for the GAN.

  2. Normalization:

    Normalize the pixel values to the range [-1, 1]. This is standard practice for GAN training because the generator typically ends in a tanh activation, whose output lies in the same range.

  3. Augmentation (Optional):

    Apply data augmentation techniques such as horizontal flipping, rotation, and cropping to increase the diversity of the training data.

Example: Preprocessing Code

Here’s a sample code snippet to preprocess the CelebA dataset using TensorFlow:

import tensorflow as tf
import numpy as np
import os
from tensorflow.keras.preprocessing.image import img_to_array, load_img

# Define the path to the dataset directory
dataset_dir = 'path/to/celeba/dataset'

# Define image dimensions
img_height, img_width = 64, 64

# Function to load and preprocess images
def preprocess_image(img_path):
    img = load_img(img_path, target_size=(img_height, img_width))
    img_array = img_to_array(img)
    img_array = (img_array - 127.5) / 127.5  # Normalize to [-1, 1]
    return img_array

# Load and preprocess the dataset into a single NumPy array
def load_dataset(dataset_dir):
    # Keep only image files so stray non-image entries do not cause errors
    img_paths = [os.path.join(dataset_dir, fname)
                 for fname in os.listdir(dataset_dir)
                 if fname.lower().endswith(('.jpg', '.jpeg', '.png'))]
    dataset = np.array([preprocess_image(img_path) for img_path in img_paths])
    return dataset

# Load the dataset
celeba_dataset = load_dataset(dataset_dir)

# Verify the shape and range of the dataset
print(f'Dataset shape: {celeba_dataset.shape}')
print(f'Min pixel value: {celeba_dataset.min()}, Max pixel value: {celeba_dataset.max()}')

This example script uses the TensorFlow and NumPy libraries to load and preprocess a dataset of images. First, it defines the path to the dataset and the dimensions of the images. It then defines a function to preprocess each image: it resizes the image to the set dimensions, converts it to an array, and normalizes its pixel values to be between -1 and 1.

Another function is defined to load the dataset. This function gets the paths of all images in the dataset directory, preprocesses each image using the previously defined function, and stores all preprocessed images in a NumPy array.

Finally, the script calls load_dataset, then prints the shape of the resulting array (its dimensions) and the minimum and maximum pixel values. This last step verifies that the images have been loaded and normalized correctly.
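
One caveat worth flagging: materializing every image in a single NumPy array is simple but memory-hungry. At 64x64x3 in float32, the full CelebA dataset of roughly 202,000 images occupies on the order of 10 GB of RAM. If that does not fit on your machine, a common alternative is a lazy tf.data pipeline that decodes and preprocesses images on the fly. The following is a minimal sketch of that approach (assuming JPEG files and reusing dataset_dir from above); it is offered as an option, not as the pipeline used in the rest of this chapter.

# Lazy-loading sketch: decode and preprocess images on the fly with tf.data
def load_and_preprocess(path):
    image = tf.io.read_file(path)                              # read raw bytes from disk
    image = tf.image.decode_jpeg(image, channels=3)            # decode to a uint8 HxWx3 tensor
    image = tf.image.resize(image, [img_height, img_width])    # resize to 64x64
    image = (image - 127.5) / 127.5                            # normalize to [-1, 1]
    return image

img_paths = [os.path.join(dataset_dir, fname)
             for fname in os.listdir(dataset_dir)
             if fname.lower().endswith('.jpg')]
lazy_dataset = tf.data.Dataset.from_tensor_slices(img_paths)
lazy_dataset = lazy_dataset.map(load_and_preprocess, num_parallel_calls=tf.data.AUTOTUNE)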

4.1.3 Splitting the Dataset

For effective training and evaluation, it’s important to split the dataset into training and validation sets. This allows us to monitor the model’s performance on unseen data and prevent overfitting.

Example: Splitting Code

Here’s how you can split the CelebA dataset:

from sklearn.model_selection import train_test_split

# Split the dataset into training and validation sets
train_images, val_images = train_test_split(celeba_dataset, test_size=0.1, random_state=42)

# Verify the shapes of the splits
print(f'Training set shape: {train_images.shape}')
print(f'Validation set shape: {val_images.shape}')

This example code uses the train_test_split function from scikit-learn's model_selection module to divide the celeba_dataset array into two parts: a larger training set and a smaller validation set.

The split assigns 90% of the data to the training set and 10% to the validation set (test_size=0.1), and random_state=42 makes the split reproducible. After splitting the data, the code prints the shape of both sets (number of samples, image height, image width, and channels).

4.1.4 Data Loading and Batching

Efficient data loading and batching are essential for training GANs, especially when dealing with large datasets. TensorFlow’s data API provides convenient utilities for creating data pipelines that efficiently load and batch data during training.

Example: Data Loading and Batching Code

Here’s how you can create a data pipeline using TensorFlow’s data API:

# Create a TensorFlow dataset from the training images
train_dataset = tf.data.Dataset.from_tensor_slices(train_images)

# Define data augmentation function (optional)
def augment_image(image):
    image = tf.image.random_flip_left_right(image)
    return image

# Apply data augmentation and batch the dataset
batch_size = 64
train_dataset = train_dataset.map(augment_image, num_parallel_calls=tf.data.AUTOTUNE)
train_dataset = train_dataset.batch(batch_size)
train_dataset = train_dataset.prefetch(tf.data.AUTOTUNE)

# Verify the shape of a batch
for batch in train_dataset.take(1):
    print(f'Batch shape: {batch.shape}')

This example code uses TensorFlow's tf.data API to prepare the dataset for training. It first creates a dataset from the training images, then defines a function that augments each image by randomly flipping it horizontally. This technique artificially increases the diversity of the training data.

The code then applies this function to the dataset, batches the images in groups of 64 for efficient training, and prefetches upcoming batches so that data preparation overlaps with training rather than stalling it. Finally, it prints the shape of one batch to verify the pipeline.
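
The preprocessing checklist earlier in this section also mentioned rotation and cropping. For face datasets, a horizontal flip is often the only augmentation applied, but if you want to experiment with something slightly stronger, the hypothetical variant below adds a random crop after a small upscale (the 72-pixel intermediate size is illustrative, not taken from the text above):

# Hypothetical stronger augmentation: flip, then random-crop from a slightly enlarged image
def augment_image_extended(image):
    image = tf.image.random_flip_left_right(image)            # random horizontal flip
    image = tf.image.resize(image, [72, 72])                   # enlarge slightly...
    image = tf.image.random_crop(image, size=[64, 64, 3])      # ...then crop back to 64x64
    return image

# To try it, map this function in place of augment_image:
# train_dataset = train_dataset.map(augment_image_extended, num_parallel_calls=tf.data.AUTOTUNE)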

By following these steps, you will have a well-prepared dataset ready for training a GAN to generate realistic human faces. In the next sections, we will move on to creating and training the GAN model, evaluating its performance, and generating new faces.
