Generative Deep Learning Updated Edition

Chapter 8: Project: Text Generation with Autoregressive Models

8.3 Generating Text with the Fine-Tuned Model

In this section, we will focus on generating text using the fine-tuned GPT-2 model. Text generation involves using the trained model to predict the next words in a sequence, creating coherent and contextually relevant text based on a given prompt. We will explore how to generate text with various parameters and evaluate the quality of the generated text.

8.3.1 Generating Text with a Prompt

The primary use of our fine-tuned GPT-2 model is to generate text based on an initial prompt. This involves providing the model with a starting sequence of words and letting it predict subsequent words to complete the text.

Example: Generating Text with a Prompt

# Load the fine-tuned model and tokenizer
# (the path is illustrative -- use wherever your fine-tuning run saved them)
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model = GPT2LMHeadModel.from_pretrained('./fine_tuned_gpt2')
tokenizer = GPT2Tokenizer.from_pretrained('./fine_tuned_gpt2')

# Define the prompt
prompt = "In the heart of the city, there was a secret garden where"

# Tokenize the prompt
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate text (greedy decoding by default)
output = model.generate(input_ids, max_length=100, num_return_sequences=1)

# Decode the generated text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

This example uses the fine-tuned language model to continue a given prompt. The prompt, "In the heart of the city, there was a secret garden where", is tokenized (converted into the token IDs the model understands) and passed to model.generate. The model then extends the sequence up to a maximum total length of 100 tokens; note that max_length counts the prompt tokens as well as the newly generated ones. Finally, the token IDs are decoded back into human-readable text (which includes the original prompt) and printed to the console.

8.3.2 Adjusting Generation Parameters

Text generation can be influenced by several parameters that control the quality, diversity, and length of the generated text. Key parameters include:

  • Max Length: The maximum number of tokens to generate.
  • Temperature: Controls the randomness of predictions by scaling the logits before applying softmax. Lower values (e.g., 0.7) make the output more deterministic, while higher values (e.g., 1.0) increase diversity.
  • Top-k Sampling: Limits the sampling pool to the top-k most probable next tokens.
  • Top-p (Nucleus) Sampling: Limits the sampling pool to the smallest set of tokens with a cumulative probability above a threshold (e.g., 0.9).
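
The effect of temperature can be seen without running the model at all: dividing the logits by the temperature before the softmax sharpens or flattens the resulting distribution. A minimal, self-contained sketch (plain Python, not part of the model code):

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Convert raw logits to probabilities, scaled by temperature."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running this shows that at temperature 0.5 most of the probability mass concentrates on the highest-logit token, while at 2.0 the distribution flattens, which is exactly why low temperatures feel deterministic and high temperatures feel diverse.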

Example: Adjusting Generation Parameters

# Generate text with different parameters
output = model.generate(
    input_ids,
    max_length=150,
    num_return_sequences=1,
    do_sample=True,   # required for temperature, top_k, and top_p to take effect
    temperature=0.7,
    top_k=50,
    top_p=0.9
)

# Decode and print the generated text
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)

The 'model.generate' function is called with several parameters that shape the output: 'input_ids' is the tokenized prompt; 'max_length' caps the total length of the generated sequence (prompt included); 'num_return_sequences' sets how many sequences to return; 'temperature' affects randomness (lower means more deterministic); 'top_k' restricts sampling to the k highest-probability tokens; and 'top_p' implements nucleus sampling, where the model selects only from the smallest set of tokens whose cumulative probability exceeds the specified value. Note that in the Hugging Face API these sampling parameters take effect only when 'do_sample=True' is also passed; without it, 'generate' falls back to greedy decoding. The generated output is then decoded and printed.
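
Top-k and top-p filtering are also easy to illustrate on a toy distribution. The helpers below are illustrative stand-ins for the idea, not the library's internals:

```python
def top_k_tokens(probs, k):
    """Indices of the k most probable tokens."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return set(ranked[:k])

def top_p_tokens(probs, p):
    """Smallest set of most-probable tokens whose cumulative probability reaches p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = set(), 0.0
    for i in ranked:
        kept.add(i)
        cumulative += probs[i]
        if cumulative >= p - 1e-12:  # tolerance for floating-point rounding
            break
    return kept

probs = [0.5, 0.3, 0.1, 0.07, 0.03]
print(top_k_tokens(probs, 2))    # the two most probable tokens
print(top_p_tokens(probs, 0.9))  # just enough tokens to cover 90% of the mass
```

With this distribution, top-k with k=2 always keeps exactly two tokens, while top-p with p=0.9 keeps three, because that is how many are needed to accumulate 90% of the probability; top-p adapts the pool size to the shape of the distribution, which is its main advantage over a fixed k.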

8.3.3 Generating Multiple Variations

One of the advantages of using a generative model like GPT-2 is the ability to generate multiple variations of text based on the same prompt. This can be useful for tasks that require creative outputs, such as story writing, dialogue generation, and content creation.

Example: Generating Multiple Variations

# Define the prompt
prompt = "The mysterious cave was hidden behind the waterfall,"

# Tokenize the prompt
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Generate multiple variations of text
# (do_sample=True is required both for temperature to take effect and to
#  return several distinct sequences from a single prompt)
outputs = model.generate(
    input_ids,
    max_length=100,
    num_return_sequences=3,
    do_sample=True,
    temperature=0.7
)

# Decode and print each generated variation
for i, output in enumerate(outputs):
    generated_text = tokenizer.decode(output, skip_special_tokens=True)
    print(f"Variation {i+1}:\n{generated_text}\n")

This Python code uses the fine-tuned model and tokenizer to generate text. The input to the model is the prompt "The mysterious cave was hidden behind the waterfall,". The model generates three different continuations of this prompt, each up to 100 tokens long; greedy decoding would return the same sequence every time, so sampling is what makes the variations possible. The 'temperature' parameter controls the randomness of the output: a lower value makes the output more deterministic, while a higher value makes it more varied. Each generated sequence is then decoded (converted from token IDs back to text) and printed.
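
Why sampling yields different variations while greedy decoding cannot is easy to see in miniature. With a fixed toy next-token distribution (no model involved), greedy decoding always picks the most probable index, whereas sampling visits every token roughly in proportion to its probability:

```python
import random

def sample_token(probs, rng):
    """Draw one token index according to its probability."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

probs = [0.5, 0.3, 0.2]   # toy next-token distribution
rng = random.Random(0)    # seeded so the run is reproducible

draws = [sample_token(probs, rng) for _ in range(1000)]

# Greedy decoding would pick index 0 every single time;
# sampling spreads the draws across all three tokens.
print({i: draws.count(i) for i in range(3)})
```

Repeated over a whole sequence, these per-step differences compound, which is how the same prompt can branch into entirely different stories.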

8.3.4 Handling Long-Form Text Generation

For tasks that require generating longer texts, such as articles or reports, it is important to manage the model's ability to maintain coherence and context over longer sequences. This can be achieved by generating text in chunks and feeding the generated text back into the model as a new prompt.

Example: Long-Form Text Generation

# Define the initial prompt
prompt = "In the beginning, the universe was a vast expanse of nothingness, until"

# Tokenize the prompt
input_ids = tokenizer.encode(prompt, return_tensors='pt')

# Initialize the generated text
generated_text = prompt

# Generate text in chunks, appending only the newly generated tokens each time
for _ in range(5):  # Generate 5 chunks of text
    output = model.generate(input_ids, max_new_tokens=100, num_return_sequences=1,
                            do_sample=True, temperature=0.7)
    # output[0] begins with the input tokens; slice them off to keep only the new chunk
    new_tokens = output[0][input_ids.shape[1]:]
    chunk = tokenizer.decode(new_tokens, skip_special_tokens=True)
    generated_text += chunk
    # Keep the most recent tokens as context for the next chunk
    input_ids = output[0][-200:].unsqueeze(0)

print(generated_text)

The initial prompt ("In the beginning, the universe...") is encoded into tokens by the tokenizer and passed to the model, which extends it by a chunk of new tokens. Because the model's raw output begins with the input tokens, only the newly generated portion is decoded and appended to the running text.

This process is repeated five times in a loop. Each iteration feeds the most recent tokens back in as context for the next chunk, which is what keeps the continuation coherent as the text grows beyond what fits in a single pass. The 'temperature' parameter in 'model.generate' again controls the randomness of the output: lower values make it more deterministic, while higher values add more variability.

Finally, the entire generated text is printed.
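
The chunk-and-feed-back bookkeeping is independent of the model itself. The sketch below uses a stand-in next_chunk function (hypothetical, returning canned text) purely to show the loop structure: keep a running text, re-prompt with the most recent context, and append only the new material:

```python
def generate_long_text(prompt, next_chunk, num_chunks=3, context_size=40):
    """Grow a text chunk by chunk, re-prompting with the most recent context."""
    text = prompt
    for _ in range(num_chunks):
        context = text[-context_size:]   # only the tail fits in the "context window"
        text += next_chunk(context)      # append just the newly generated material
    return text

# Stand-in for "model.generate + decode only the new tokens":
def fake_chunk(context):
    return " and the story continued"

print(generate_long_text("Once upon a time", fake_chunk, num_chunks=2))
```

Swapping fake_chunk for a real call to the model recovers the example above; the key invariants are the same in both: never re-append the prompt, and always re-prompt from the tail of the text so far.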
