Chapter 1: Introduction to NLP
Practical Exercises
Exercise 1: Tokenization with NLTK
Task: Use the nltk library to tokenize the following text: "Natural Language Processing enables computers to understand human language."
Solution:
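import nltk
from nltk.tokenize import word_tokenize
# Download the Punkt tokenizer data (only needed once).
# Recent NLTK releases (3.9 and later) may require nltk.download('punkt_tab') instead.
nltk.download('punkt')
text = "Natural Language Processing enables computers to understand human language."
# Split the text into word and punctuation tokens
tokens = word_tokenize(text)
print(tokens)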
Output:
['Natural', 'Language', 'Processing', 'enables', 'computers', 'to', 'understand', 'human', 'language', '.']
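To see roughly what a word tokenizer does, the following is a minimal sketch that uses only Python's standard re module. It is not NLTK's algorithm: the function name simple_tokenize and the regular expression are assumptions made for illustration, and the pattern ignores cases such as contractions and abbreviations that word_tokenize handles.

```python
import re

def simple_tokenize(text):
    # A run of word characters is one token; any other non-space character
    # (such as punctuation) becomes a token of its own.
    return re.findall(r"\w+|[^\w\s]", text)

text = "Natural Language Processing enables computers to understand human language."
tokens = simple_tokenize(text)
print(tokens)
```

For this particular sentence the sketch produces the same ten tokens as the NLTK solution, including the final period as a separate token.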
Exercise 2: Named Entity Recognition with spaCy
Task: Use the spaCy library to extract named entities from the text: "Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University."
Solution:
import spacy
# Load the small English spaCy model
# (install it first with: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")
text = "Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University."
doc = nlp(text)
# Extract named entities and print each one with its label
for ent in doc.ents:
    print(ent.text, ent.label_)
Output (entity labels may vary slightly with the model version):
Google ORG
Larry Page PERSON
Sergey Brin PERSON
Stanford University ORG
Exercise 3: Sentiment Analysis with TextBlob
Task: Use the TextBlob library to analyze the sentiment of the following text: "I am extremely happy with the service provided."
Solution:
from textblob import TextBlob
text = "I am extremely happy with the service provided."
blob = TextBlob(text)
# sentiment is a named tuple: polarity lies in [-1.0, 1.0],
# subjectivity lies in [0.0, 1.0]
sentiment = blob.sentiment
print(sentiment)
Output (your exact scores may differ):
Sentiment(polarity=0.8, subjectivity=0.75)
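TextBlob reports polarity on a scale from -1.0 (negative) to 1.0 (positive), so a common next step is to turn that score into a label. The sketch below shows one possible way to do this. The function name polarity_to_label and the 0.1 threshold are hypothetical choices for illustration and are not part of TextBlob.

```python
def polarity_to_label(polarity, threshold=0.1):
    # Scores close to zero are treated as neutral. The threshold is an
    # illustrative assumption and should be tuned for real data.
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"

# 0.8 is the polarity value shown in the output above.
label = polarity_to_label(0.8)
print(label)
```

In practice the function would be called with blob.sentiment.polarity rather than a hard-coded number.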
Exercise 4: Text Summarization with sumy
Task: Use the sumy library to summarize the following text into two sentences:
"Natural Language Processing (NLP) is a fascinating field at the intersection of computer science, artificial intelligence, and linguistics. It enables machines to understand, interpret, and generate human language, opening up a world of possibilities for applications ranging from chatbots and translation services to sentiment analysis and beyond. The evolution of NLP has been driven by significant advances in machine learning and deep learning, which have enabled more sophisticated and accurate models for language understanding. This book aims to bring these cutting-edge techniques to you in an accessible and practical way, regardless of your current level of expertise."
Solution:
from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lsa import LsaSummarizer
text = """
Natural Language Processing (NLP) is a fascinating field at the intersection of computer science, artificial intelligence, and linguistics. It enables machines to understand, interpret, and generate human language, opening up a world of possibilities for applications ranging from chatbots and translation services to sentiment analysis and beyond. The evolution of NLP has been driven by significant advances in machine learning and deep learning, which have enabled more sophisticated and accurate models for language understanding. This book aims to bring these cutting-edge techniques to you in an accessible and practical way, regardless of your current level of expertise.
"""
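# Tokenizer("english") relies on NLTK's Punkt data (downloaded in Exercise 1)
parser = PlaintextParser.from_string(text, Tokenizer("english"))
summarizer = LsaSummarizer()
# Request a two-sentence summary
summary = summarizer(parser.document, 2)
for sentence in summary:
    print(sentence)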
Output:
It enables machines to understand, interpret, and generate human language, opening up a world of possibilities for applications ranging from chatbots and translation services to sentiment analysis and beyond.
The evolution of NLP has been driven by significant advances in machine learning and deep learning, which have enabled more sophisticated and accurate models for language understanding.
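Extractive summarizers such as the one above select existing sentences rather than writing new ones. The sketch below illustrates that idea with a much simpler technique than LSA: it scores each sentence by the average frequency of its content words and keeps the top two in their original order. The function name frequency_summary, the sentence-splitting regular expression, and the small stop-word list are all assumptions made for illustration, and the sentences it selects need not match the ones sumy returns.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"a", "an", "and", "at", "by", "for", "from", "in", "is", "it",
              "of", "the", "to", "up", "which", "you", "your", "this",
              "these", "have", "has", "been"}

def content_words(text):
    # Lowercase alphabetic words with the stop words removed.
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def frequency_summary(text, n_sentences=2):
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # Word frequencies over the whole text.
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentences by score, then restore document order for the chosen ones.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])
    return [sentences[i] for i in chosen]

text = (
    "Natural Language Processing (NLP) is a fascinating field at the intersection of "
    "computer science, artificial intelligence, and linguistics. "
    "It enables machines to understand, interpret, and generate human language, opening up "
    "a world of possibilities for applications ranging from chatbots and translation "
    "services to sentiment analysis and beyond. "
    "The evolution of NLP has been driven by significant advances in machine learning and "
    "deep learning, which have enabled more sophisticated and accurate models for language "
    "understanding. "
    "This book aims to bring these cutting-edge techniques to you in an accessible and "
    "practical way, regardless of your current level of expertise."
)
summary = frequency_summary(text)
for sentence in summary:
    print(sentence)
```

Averaging the word frequencies, rather than summing them, keeps long sentences from winning simply because they contain more words.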
Exercise 5: Text Classification with scikit-learn
Task: Use scikit-learn to train a Naive Bayes classifier on the following data and predict the sentiment of a new text: "This experience was fantastic."
Texts:
- "I love this product" (positive)
- "This is the worst experience" (negative)
- "Absolutely fantastic!" (positive)
- "Not good at all" (negative)
Solution:
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
texts = ["I love this product", "This is the worst experience", "Absolutely fantastic!", "Not good at all"]
# 1 = positive, 0 = negative
labels = [1, 0, 1, 0]
# Convert the training texts into a matrix of word counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)
# Train the Naive Bayes classifier on the word counts
classifier = MultinomialNB()
classifier.fit(X, labels)
# Vectorize the new text with the same vocabulary, then predict its label
new_text = ["This experience was fantastic"]
X_new = vectorizer.transform(new_text)
prediction = classifier.predict(X_new)
print(prediction)
Output:
[1]
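To see why the classifier predicts 1, the calculation can be reproduced by hand. The sketch below is a plain-Python re-implementation, not scikit-learn's code. It assumes CountVectorizer's default tokenization (lowercased tokens of two or more word characters) and MultinomialNB's default smoothing (alpha=1.0). The helper names tokenize and log_score are hypothetical.

```python
import math
import re
from collections import Counter

texts = ["I love this product", "This is the worst experience",
         "Absolutely fantastic!", "Not good at all"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

def tokenize(text):
    # Approximates CountVectorizer's defaults: lowercase, tokens of 2+ word characters.
    return re.findall(r"\b\w\w+\b", text.lower())

# Vocabulary and per-class word counts from the training texts.
vocab = {tok for t in texts for tok in tokenize(t)}
word_counts = {c: Counter() for c in set(labels)}
for t, c in zip(texts, labels):
    word_counts[c].update(tokenize(t))
doc_counts = Counter(labels)

def log_score(text, c, alpha=1.0):
    # log P(c) plus the sum of log P(word | c), with add-alpha smoothing.
    total = sum(word_counts[c].values())
    score = math.log(doc_counts[c] / len(labels))
    for tok in tokenize(text):
        if tok in vocab:  # words unseen in training are ignored
            score += math.log((word_counts[c][tok] + alpha) / (total + alpha * len(vocab)))
    return score

new_text = "This experience was fantastic"
scores = {c: log_score(new_text, c) for c in word_counts}
prediction = max(scores, key=scores.get)
print(prediction)
```

Under these assumptions the vocabulary has 13 words. "this" occurs in both classes, "experience" only in a negative text, and "fantastic" only in a positive one, so the smoothed counts multiply to the same numerator for both classes. The positive class wins because its training texts contain fewer words in total (5 versus 9), which makes each of its smoothed probabilities larger.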
These practical exercises provide hands-on experience with different aspects of NLP using Python. Each exercise is designed to reinforce the concepts discussed in the chapter and help you become proficient in implementing NLP techniques.