Project 4: Named Entity Recognition (NER) Pipeline with Custom Fine-Tuning
Step 4: Fine-Tune the Model
Fine-tune a pre-trained transformer model for the NER task. This crucial step takes a pre-trained language model (such as BERT) and adapts it specifically for named entity recognition: during fine-tuning, the model learns to identify and classify named entities in text by adjusting its parameters on labeled NER data.
This process preserves the model's general language understanding while optimizing its ability to recognize specific entity types (such as persons, organizations, and locations). Fine-tuning typically requires far fewer computational resources and much less training time than training a model from scratch, while still achieving high accuracy on the target NER task.
from transformers import AutoModelForTokenClassification, TrainingArguments, Trainer
from seqeval.metrics import classification_report

# Label names from the dataset (e.g. "O", "B-PER", "I-PER", ...)
label_list = dataset["train"].features["ner_tags"].feature.names

# Load model
model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    num_labels=len(label_list),
)

# Define training arguments
training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    save_steps=500,
    logging_dir="./logs",
)

# Define compute metrics function
def compute_metrics(p):
    predictions, labels = p
    predictions = predictions.argmax(axis=-1)
    # Keep only positions with a real label (drop the -100 used for padding and
    # non-initial subword tokens) and map label ids back to their string names,
    # since seqeval scores IOB tag strings rather than integer ids.
    true_labels = [
        [label_list[label] for label in label_row if label != -100]
        for label_row in labels
    ]
    pred_labels = [
        [label_list[pred] for pred, label in zip(pred_row, label_row) if label != -100]
        for pred_row, label_row in zip(predictions, labels)
    ]
    return {"classification_report": classification_report(true_labels, pred_labels)}

# Initialize trainer
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

# Train model
trainer.train()
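The compute_metrics function converts label ids back to their string tag names because seqeval scores entities from IOB-tagged strings, not integer ids. As a minimal standalone illustration of what seqeval expects (the tag sequences below are made up purely for demonstration):

from seqeval.metrics import classification_report

# seqeval expects lists of lists of IOB tag strings, one inner list per sentence
y_true = [["B-PER", "I-PER", "O", "B-LOC", "O"]]
y_pred = [["B-PER", "I-PER", "O", "O", "O"]]
print(classification_report(y_true, y_pred))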
Let's break down this code that handles the model fine-tuning process for Named Entity Recognition:
1. Initial Setup and Imports
- Imports necessary classes from transformers library for model training and evaluation
- Uses AutoModelForTokenClassification for the NER task
2. Model Loading
- Creates a model instance using a pre-trained model (defined earlier as "bert-base-uncased")
- Sets the number of output labels based on the dataset's NER tags (a sketch of optionally attaching the readable tag names to the model config appears after this list)
3. Training Configuration
Sets up TrainingArguments with specific parameters:
- Learning rate: 2e-5
- Batch size: 16 per device
- Number of epochs: 3
- Weight decay: 0.01 for regularization
- Evaluation performed after each epoch
4. Metrics Computation
- Implements a compute_metrics function that:
  - Takes the argmax over the model's logits to get predicted label ids
  - Filters out the ignored positions labeled -100 (padding and non-initial subword tokens)
  - Maps label ids back to their string tag names
  - Returns a classification report computed with seqeval
5. Trainer Setup and Execution
- Initializes a Trainer instance with:
  - The model
  - The training arguments
  - The training and validation datasets
  - The tokenizer
  - The compute_metrics function
- Starts the training process with trainer.train()
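As noted in the model-loading step, it can also help to attach the human-readable tag names to the model configuration so that later inference returns labels like "B-PER" instead of generic ids. A minimal sketch, assuming the label_list variable defined in the code above; id2label and label2id are standard from_pretrained options:

# Optional: store readable tag names in the model config (assumes `label_list` from above)
id2label = {i: name for i, name in enumerate(label_list)}
label2id = {name: i for i, name in enumerate(label_list)}

model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    num_labels=len(label_list),
    id2label=id2label,
    label2id=label2id,
)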
This fine-tuning process optimizes the pre-trained model specifically for the NER task while requiring far fewer computational resources than training from scratch.
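Once training finishes, a quick sanity check is to run an evaluation pass and try the model on a sample sentence. A minimal sketch, assuming the trainer and tokenizer objects from above; the "./ner-model" output path and the example sentence are placeholders:

from transformers import pipeline

# Run evaluation on the validation split (invokes compute_metrics defined above)
metrics = trainer.evaluate()
print(metrics)

# Save the fine-tuned model and tokenizer (output path is illustrative)
trainer.save_model("./ner-model")
tokenizer.save_pretrained("./ner-model")

# Reload as a token-classification pipeline; aggregation_strategy="simple"
# groups subword predictions into whole entity spans
ner = pipeline("token-classification", model="./ner-model", aggregation_strategy="simple")
print(ner("Barack Obama visited the headquarters of Microsoft in Seattle."))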