Chapter 8: Project: Text Generation with Autoregressive Models
8.7 Complete Code
Up until now, we have examined each part of the project individually, walking through every step of the process. This has allowed us to understand the nuances and technicalities of each stage of the pipeline, from data collection to evaluation. Once the individual components are understood, however, it is often helpful to see everything in one place: it clarifies the flow of the entire project and provides complete, runnable code that can be adapted to similar tasks.
In this section, we consolidate all the steps we've taken into a single script. This complete code provides an end-to-end view of generating text with an autoregressive model and can serve as a template for your own text generation work. Please note that while the script is written to run as is on TensorFlow 2.x, you may need to adjust parameters, API calls, or steps to suit your environment or to optimize performance.
Let's dive in!
import tensorflow as tf
import numpy as np
import os
# Load Dataset
path_to_file = tf.keras.utils.get_file('shakespeare.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt')
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')
# Unique Characters
vocab = sorted(set(text))
# Text Processing: map characters to integer IDs and back.
# StringLookup reserves index 0 for the [UNK] token, so the model vocabulary
# is one entry larger than len(vocab).
ids_from_chars = tf.keras.layers.StringLookup(vocabulary=list(vocab), mask_token=None)
chars_from_ids = tf.keras.layers.StringLookup(vocabulary=ids_from_chars.get_vocabulary(), invert=True, mask_token=None)
# Plain-Python lookup tables that mirror the StringLookup mapping; used later
# during generation so seed characters and sampled IDs line up with the model.
char2idx = {c: i for i, c in enumerate(ids_from_chars.get_vocabulary())}
idx2char = np.array(ids_from_chars.get_vocabulary())
all_ids = ids_from_chars(tf.strings.unicode_split(text, 'UTF-8'))
# Dataset Creation
ids_dataset = tf.data.Dataset.from_tensor_slices(all_ids)
seq_length = 100
sequences = ids_dataset.batch(seq_length+1, drop_remainder=True)
# Split Input and Target Text
def split_input_target(sequence):
    input_text = sequence[:-1]
    target_text = sequence[1:]
    return input_text, target_text
dataset = sequences.map(split_input_target)
BATCH_SIZE = 64
BUFFER_SIZE = 10000
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
# Building the Model
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    # The GRU is stateful, so a fixed batch size must be baked into the input shape.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units, return_sequences=True, stateful=True, recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model
vocab_size = len(ids_from_chars.get_vocabulary())  # includes the reserved [UNK] token
embedding_dim = 256
rnn_units = 1024
model = build_model(vocab_size, embedding_dim, rnn_units, BATCH_SIZE)
# Loss Function
def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
# Compile Model
model.compile(optimizer='adam', loss=loss)
# Configure Checkpoints
checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_prefix, save_weights_only=True)
# Train Model
EPOCHS = 10
history = model.fit(dataset, epochs=EPOCHS, callbacks=[checkpoint_callback])
# Rebuild the Model with Batch Size 1 for Generation and Load the Trained Weights
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))
# Text Generation Function
def generate_text(model, start_string):
    num_generate = 1000  # number of characters to generate
    # Encode the seed string as a batch containing one sequence of character IDs
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    text_generated = []
    # Temperature controls randomness: lower values give more predictable text,
    # higher values give more surprising text.
    temperature = 1.0
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)
        predictions = tf.squeeze(predictions, 0)
        predictions = predictions / temperature
        # Sample the next character ID from the distribution at the last time step
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        # Feed the predicted character back in as the next input
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(idx2char[predicted_id])
    return start_string + ''.join(text_generated)
# Generate Sample Text from a Seed String (a qualitative check of the model)
start_string = "To be or not to be"
print(generate_text(model, start_string=start_string))
With this complete script in hand, you can run the whole pipeline from beginning to end and modify individual stages to suit your dataset and requirements. Do remember to adjust the hyperparameters as necessary for your data and hardware.
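As a starting point for experimentation, it can help to gather the knobs you are most likely to tune in one place. The values below are illustrative assumptions, not tuned recommendations; smaller settings like these are convenient for a quick run on a CPU.
# Illustrative, untuned alternative settings for quick experiments
config = {
    'embedding_dim': 128,   # smaller embedding: faster, less expressive
    'rnn_units': 512,       # fewer GRU units: lower memory use
    'seq_length': 80,       # length of each training example in characters
    'batch_size': 32,
    'epochs': 5,
}
# Building a smaller model with these settings (training it would require
# re-batching the dataset with the matching batch size)
small_model = build_model(vocab_size, config['embedding_dim'], config['rnn_units'], config['batch_size'])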
The full project code demonstrates the power of autoregressive models for the task of text generation. Whether you're aiming to generate creative content like poetry, continue a piece of text in a specific style, or simply experiment with what's possible, this project should serve as a great foundation.
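For style and creativity experiments, the sampling temperature is usually the first thing to vary. The function below is a hypothetical variant of generate_text that exposes the temperature and output length as parameters; everything else follows the script above.
def generate_text_with_temperature(model, start_string, temperature=1.0, num_generate=300):
    # Same sampling loop as generate_text, with temperature as an argument
    input_eval = tf.expand_dims([char2idx[s] for s in start_string], 0)
    text_generated = []
    model.reset_states()
    for _ in range(num_generate):
        predictions = tf.squeeze(model(input_eval), 0) / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(idx2char[predicted_id])
    return start_string + ''.join(text_generated)

# Lower temperatures stay closer to the training style; higher ones take more risks.
for t in (0.5, 1.0, 1.2):
    print(f"--- temperature {t} ---")
    print(generate_text_with_temperature(model, "ROMEO: ", temperature=t))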
Chapter 8 Conclusion
In this chapter, we embarked on a journey into the world of autoregressive models for text generation. We started by gathering and preprocessing our data, ensuring that our model had the right 'fuel' to learn from. From there, we moved on to building our autoregressive model, which required us to consider the intricacies of such models, including the design of the architecture, built here with the Keras API as an embedding layer feeding a stateful GRU and a dense output layer.
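If you want to explore architectural variations, one natural next step is a deeper recurrent stack. The sketch below is illustrative only: it adds a second GRU layer and dropout to the build_model pattern used above, and is not the configuration trained in this chapter.
def build_deeper_model(vocab_size, embedding_dim, rnn_units, batch_size):
    # Illustrative two-layer GRU variant with dropout between the recurrent layers
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim, batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units, return_sequences=True, stateful=True, recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.GRU(rnn_units, return_sequences=True, stateful=True, recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])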
Next, we undertook the task of training the model, where we delved into the nuances of training an autoregressive model. Once our model was well-trained, we used it to generate new text, a fascinating process that truly showcased the power of these models. We evaluated the generated text, utilizing a variety of techniques to ensure that our model was performing well and as expected. We then explored some practical considerations for working with autoregressive models.
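If you want a quantitative check to complement reading the samples, character-level perplexity is a common one. The sketch below is a rough illustration under the assumptions of the script above: it rebuilds a batch-sized copy of the model (the generation model expects a batch size of 1), reloads the trained weights, and converts the mean cross-entropy into perplexity; a held-out split would be preferable to reusing the training batches.
# Approximate character-level perplexity on a sample of batches
eval_model = build_model(vocab_size, embedding_dim, rnn_units, BATCH_SIZE)
eval_model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
eval_model.compile(optimizer='adam', loss=loss)
mean_loss = eval_model.evaluate(dataset.take(20), verbose=0)  # mean cross-entropy in nats per character
print(f"Approximate perplexity: {np.exp(mean_loss):.2f}")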
To round off the chapter, we provided the full project code in a single script, bringing together all the steps involved in generating text with an autoregressive model. Consolidating the entire process in one place aids comprehension and provides a useful starting point for future work in this area.
Through this project, we have seen the immense power and potential of autoregressive models in the field of text generation. As with any machine learning model, understanding its strengths, limitations, and potential applications is crucial to effectively applying it. With the knowledge gained in this chapter, you're now well-equipped to utilize autoregressive models in your own projects and explorations. Happy coding!