Chapter 3: Training and Fine-Tuning Transformers
3.4 Practical Exercises
This section provides practical exercises to strengthen your understanding of training and fine-tuning transformer models. These exercises cover data preprocessing, fine-tuning techniques, and evaluation metrics. Each exercise includes a solution with detailed code examples to guide your implementation.
Exercise 1: Data Preprocessing for Classification
Task: Prepare text data for binary classification with the BERT tokenizer, applying padding and truncation to a fixed sequence length.
Instructions:
- Use the BERT tokenizer to tokenize a list of text samples.
- Ensure all sequences are padded and truncated to a fixed length.
- Output the tokenized input IDs and attention masks.
Solution:
from transformers import BertTokenizer
# Initialize the BERT tokenizer
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Sample text data
texts = ["Transformers are amazing!", "They are used in many NLP tasks."]
# Tokenize the text with padding and truncation
tokenized = tokenizer(texts, padding="max_length", truncation=True, max_length=10, return_tensors="pt")
# Display tokenized output
print("Input IDs:", tokenized["input_ids"])
print("Attention Masks:", tokenized["attention_mask"])
Expected Output:
Input IDs: tensor([[  101, 19081,  2024,  6429,   999,   102,     0,     0,     0,     0],
        [  101,  2027,  2024,  2109,  1999,  2116, 17953,  4703,  1012,   102]])
Attention Masks: tensor([[1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])
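As a quick sanity check, you can map the input IDs back to their subword tokens. The sketch below reuses the tokenizer and tokenized objects from the solution above; convert_ids_to_tokens is the tokenizer's standard inverse lookup.
# Map each row of input IDs back to subword strings to verify
# that padding and truncation behaved as expected
for row in tokenized["input_ids"]:
    print(tokenizer.convert_ids_to_tokens(row.tolist()))
# First row: ['[CLS]', 'transformers', 'are', 'amazing', '!', '[SEP]', '[PAD]', '[PAD]', '[PAD]', '[PAD]']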
Exercise 2: Fine-Tune a Model Using LoRA
Task: Use LoRA to fine-tune a BERT model for sentiment analysis on the IMDB dataset.
Instructions:
- Install the required libraries.
- Load and preprocess the IMDB dataset.
- Apply LoRA to the BERT model.
- Fine-tune the model for two epochs.
Solution:
# Install the required libraries first:
#   pip install transformers datasets peft accelerate
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification, TrainingArguments, Trainer
from peft import get_peft_model, LoraConfig, TaskType

# Load and preprocess the dataset
dataset = load_dataset("imdb")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def preprocess_function(examples):
    return tokenizer(examples["text"], truncation=True, padding="max_length", max_length=256)

tokenized_datasets = dataset.map(preprocess_function, batched=True)

# Apply LoRA to the model
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # sequence classification; keeps the classifier head trainable
    r=8,                         # rank of the low-rank update matrices
    lora_alpha=32,               # scaling factor applied to the LoRA update
    lora_dropout=0.1             # dropout on the LoRA layers
)
lora_model = get_peft_model(model, lora_config)
lora_model.print_trainable_parameters()  # only a small fraction of weights are trained

# Define training arguments
training_args = TrainingArguments(
    output_dir="./lora_results",
    evaluation_strategy="epoch",  # named eval_strategy in newer transformers releases
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=2
)

# Fine-tune on small subsets to keep the run short
trainer = Trainer(
    model=lora_model,
    args=training_args,
    train_dataset=tokenized_datasets["train"].shuffle(seed=42).select(range(2000)),
    eval_dataset=tokenized_datasets["test"].shuffle(seed=42).select(range(500))
)
trainer.train()
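As configured, the Trainer reports only the evaluation loss. To track accuracy as well, pass a compute_metrics callback; the following is a minimal sketch, assuming NumPy is installed and reusing the Trainer setup above.
import numpy as np

def compute_metrics(eval_pred):
    # The Trainer supplies (logits, labels) for the evaluation set
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": float((predictions == labels).mean())}

# Add compute_metrics=compute_metrics to the Trainer(...) call above
# to report accuracy at the end of each epoch.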
Exercise 3: Evaluate a Model Using BLEU
Task: Evaluate a machine translation model’s output using the BLEU metric.
Instructions:
- Define a reference translation and a candidate translation.
- Calculate the BLEU score using NLTK.
Solution:
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction
# Reference and candidate translations
reference = ["The cat is on the mat".split()]
candidate = "The cat is on the mat".split()
# Calculate BLEU score
bleu_score = sentence_bleu(reference, candidate, smoothing_function=SmoothingFunction().method1)
print(f"BLEU Score: {bleu_score:.2f}")
Expected Output:
BLEU Score: 1.00
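The score is a perfect 1.00 only because the candidate matches the reference word for word. To see how BLEU penalizes divergence, score a paraphrase against the same reference; this sketch reuses the imports and reference above, and the exact value depends on the smoothing method.
# A paraphrase shares most unigrams with the reference but fewer
# higher-order n-grams, so the score drops below 1.00
paraphrase = "The cat sat on the mat".split()
paraphrase_bleu = sentence_bleu(reference, paraphrase,
                                smoothing_function=SmoothingFunction().method1)
print(f"Paraphrase BLEU: {paraphrase_bleu:.2f}")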
Exercise 4: Evaluate a Summarization Model Using ROUGE
Task: Evaluate a summarization model’s output using the ROUGE metric.
Instructions:
- Define a reference summary and a candidate summary.
- Calculate ROUGE-1, ROUGE-2, and ROUGE-L scores.
Solution:
from rouge_score import rouge_scorer
# Reference and candidate summaries
reference = "The cat is on the mat."
candidate = "The cat lies on the mat."
# Initialize ROUGE scorer
scorer = rouge_scorer.RougeScorer(['rouge1', 'rouge2', 'rougeL'], use_stemmer=True)
# Calculate ROUGE scores
scores = scorer.score(reference, candidate)
# Display results
print("ROUGE Scores:")
for key, value in scores.items():
    print(f"{key}: Precision: {value.precision:.3f}, Recall: {value.recall:.3f}, F1: {value.fmeasure:.3f}")
Expected Output:
ROUGE Scores:
rouge1: Precision: 0.833, Recall: 0.833, F1: 0.833
rouge2: Precision: 0.600, Recall: 0.600, F1: 0.600
rougeL: Precision: 0.833, Recall: 0.833, F1: 0.833
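A real evaluation aggregates ROUGE over many summary pairs rather than one. The sketch below averages ROUGE-L F1 across a small batch, reusing the scorer from the solution; the pairs are illustrative placeholders for actual model outputs.
# Illustrative (reference, candidate) pairs; substitute real model outputs
pairs = [
    ("The cat is on the mat.", "The cat lies on the mat."),
    ("The dog sleeps by the door.", "A dog is sleeping near the door."),
]
f1_scores = [scorer.score(ref, cand)["rougeL"].fmeasure for ref, cand in pairs]
print(f"Mean ROUGE-L F1: {sum(f1_scores) / len(f1_scores):.3f}")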
Exercise 5: Evaluate Text Generation Using BERTScore
Task: Evaluate the semantic similarity between generated text and a reference using BERTScore.
Instructions:
- Define a reference and candidate text.
- Compute BERTScore using a pretrained BERT model.
Solution:
from bert_score import score
# Reference and candidate texts
references = ["The cat is on the mat."]
candidates = ["The cat lies on the mat."]
# Compute BERTScore
P, R, F1 = score(candidates, references, lang="en", model_type="bert-base-uncased")
# Display results
print(f"BERTScore Precision: {P.mean():.3f}")
print(f"BERTScore Recall: {R.mean():.3f}")
print(f"BERTScore F1: {F1.mean():.3f}")
Expected Output (exact values vary with the model and bert_score version):
BERTScore Precision: 0.987
BERTScore Recall: 0.992
BERTScore F1: 0.989
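Raw BERTScore values cluster near the top of the scale, which makes small differences hard to read. The library supports baseline rescaling to spread scores out; below is a minimal sketch reusing the texts above. With lang="en" and no model_type, bert_score falls back to its default English model and downloads the matching baseline automatically.
# Rescaled scores are centered so that unrelated sentence pairs land near 0,
# making system-to-system comparisons easier to interpret
P_r, R_r, F1_r = score(candidates, references, lang="en",
                       rescale_with_baseline=True)
print(f"Rescaled BERTScore F1: {F1_r.mean():.3f}")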
These exercises demonstrate the key steps in data preprocessing, fine-tuning using LoRA, and evaluating transformer models with BLEU, ROUGE, and BERTScore metrics. Completing these exercises will provide practical experience and deepen your understanding of training and evaluation techniques for transformer-based NLP models.