Chapter 9: Implementing Transformer Models with Popular Libraries
9.4 Named Entity Recognition with Hugging Face’s Transformers Library
Named entity recognition (NER) is a subtask of information extraction that seeks to locate named entities in text and classify them into predefined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, and percentages.
Hugging Face's Transformers library offers several pre-trained models for NER. These models have been trained on different NER tasks and datasets, and they can be fine-tuned on a specific NER task as needed. For this demonstration, we will use a BERT model that has already been fine-tuned for English NER:
from transformers import pipeline
# Initialize the named entity recognition pipeline
nlp = pipeline("ner", model="dbmdz/bert-large-cased-finetuned-conll03-english")
# The text to be analyzed
text = "Google was founded in September 1998 by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University in California."
# Perform named entity recognition
ner_results = nlp(text)
# Print the named entities and their labels
for entity in ner_results:
    print(f"{entity['entity']}: {entity['word']}")
In this example, we use the pipeline function from the Transformers library to set up a named entity recognition pipeline. This function abstracts away the underlying complexity and allows us to use the model for NER in just a few lines of code.
We pass the identifier of the pre-trained model ('dbmdz/bert-large-cased-finetuned-conll03-english'), a BERT model fine-tuned on the CoNLL-2003 dataset for English NER. We then run the pipeline on the sample text and print each recognized entity token together with its label.
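With its default settings, the NER pipeline reports one result per token, so a name that the tokenizer splits into word pieces (marked with a leading ##) shows up as several entries. The sketch below is one possible way to merge such entries into whole entities. The function name merge_wordpieces and the sample list are hypothetical: the sample is written by hand to mimic the shape of the pipeline's output (its entity, word, and index keys) and is not real model output. The pipeline also accepts an aggregation_strategy argument (for example "simple") that performs comparable grouping for you.

```python
def merge_wordpieces(token_predictions):
    """Merge token-level NER predictions into (label, text) entity pairs.

    Assumes each prediction is a dict with the keys "entity" (e.g. "I-PER"),
    "word" (the token text), and "index" (the token position in the input).
    """
    merged = []
    for pred in token_predictions:
        tag = pred["entity"]
        label = tag.split("-")[-1]  # drop the B-/I- prefix, keep PER, ORG, ...
        word = pred["word"]
        if merged and word.startswith("##"):
            # Word-piece continuation: glue it onto the previous token.
            merged[-1]["text"] += word[2:]
        elif (
            merged
            and not tag.startswith("B-")  # a B- tag always opens a new entity
            and merged[-1]["label"] == label
            and pred["index"] == merged[-1]["last_index"] + 1
        ):
            # Adjacent token with the same label: extend the current entity.
            merged[-1]["text"] += " " + word
        else:
            # Otherwise start a new entity.
            merged.append({"label": label, "text": word, "last_index": pred["index"]})
        merged[-1]["last_index"] = pred["index"]
    return [(m["label"], m["text"]) for m in merged]


# Hand-written, illustrative input (NOT real model output).
sample = [
    {"entity": "I-ORG", "word": "Google", "index": 1},
    {"entity": "I-PER", "word": "Larry", "index": 8},
    {"entity": "I-PER", "word": "Page", "index": 9},
    {"entity": "I-PER", "word": "Sergey", "index": 11},
    {"entity": "I-PER", "word": "B", "index": 12},
    {"entity": "I-PER", "word": "##rin", "index": 13},
    {"entity": "I-ORG", "word": "Stanford", "index": 22},
    {"entity": "I-ORG", "word": "University", "index": 23},
    {"entity": "I-LOC", "word": "California", "index": 25},
]

for label, text in merge_wordpieces(sample):
    print(f"{label}: {text}")
```

In practice, the list returned by the pipeline call (ner_results in the listing above) could be passed to the function in place of sample, since the function only reads the three keys named in its docstring.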
In the next section, we'll look at how to use Hugging Face’s Transformers library for question answering.