Natural Language Processing with Python Updated Edition

Chapter 1: Introduction to NLP

Practical Exercises

Exercise 1: Tokenization with NLTK

Task: Use the nltk library to tokenize the following text: "Natural Language Processing enables computers to understand human language."

Solution:

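import nltk
from nltk.tokenize import word_tokenize

# Download the tokenizer data (only needed once). Recent NLTK releases
# look for 'punkt_tab'; older releases use 'punkt'.
nltk.download('punkt')
nltk.download('punkt_tab')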

text = "Natural Language Processing enables computers to understand human language."
tokens = word_tokenize(text)
print(tokens)

Output:

['Natural', 'Language', 'Processing', 'enables', 'computers', 'to', 'understand', 'human', 'language', '.']
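To see what the tokenizer is doing at its simplest, here is a minimal sketch that uses only the standard library. The function name simple_word_tokenize is a hypothetical helper, not part of NLTK, and the regular expression is a rough approximation rather than NLTK's actual algorithm.

```python
import re

def simple_word_tokenize(text):
    """Split text into word tokens and single punctuation tokens."""
    # \w+ matches a run of letters, digits, or underscores;
    # [^\w\s] matches one character that is neither a word character nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)

text = "Natural Language Processing enables computers to understand human language."
tokens = simple_word_tokenize(text)
print(tokens)
```

On this sentence the sketch yields the same ten tokens as the solution above. It is much cruder in general, though: it splits "Ph.D." into 'Ph', '.', 'D', '.', which is one reason to prefer a library tokenizer for real text.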

Exercise 2: Named Entity Recognition with spaCy

Task: Use the spaCy library to extract named entities from the text: "Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University."

Solution:

import spacy

# Load the small English pipeline (install it once with:
#   python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")

text = "Google was founded by Larry Page and Sergey Brin while they were Ph.D. students at Stanford University."
doc = nlp(text)

# Extract named entities
for ent in doc.ents:
    print(ent.text, ent.label_)

Output:

Google ORG
Larry Page PERSON
Sergey Brin PERSON
Stanford University ORG

Exercise 3: Sentiment Analysis with TextBlob

Task: Use the TextBlob library to analyze the sentiment of the following text: "I am extremely happy with the service provided."

Solution:

from textblob import TextBlob

text = "I am extremely happy with the service provided."
blob = TextBlob(text)
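# polarity ranges from -1.0 (negative) to 1.0 (positive);
# subjectivity ranges from 0.0 (objective) to 1.0 (subjective)
sentiment = blob.sentiment
print(sentiment)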

Output:

Sentiment(polarity=0.8, subjectivity=0.75)

Exercise 4: Text Summarization with sumy

Task: Use the sumy library to summarize the following text into two sentences:

"Natural Language Processing (NLP) is a fascinating field at the intersection of computer science, artificial intelligence, and linguistics. It enables machines to understand, interpret, and generate human language, opening up a world of possibilities for applications ranging from chatbots and translation services to sentiment analysis and beyond. The evolution of NLP has been driven by significant advances in machine learning and deep learning, which have enabled more sophisticated and accurate models for language understanding. This book aims to bring these cutting-edge techniques to you in an accessible and practical way, regardless of your current level of expertise."

Solution:

from sumy.parsers.plaintext import PlaintextParser
from sumy.nlp.tokenizers import Tokenizer
from sumy.summarizers.lsa import LsaSummarizer

text = """
Natural Language Processing (NLP) is a fascinating field at the intersection of computer science, artificial intelligence, and linguistics. It enables machines to understand, interpret, and generate human language, opening up a world of possibilities for applications ranging from chatbots and translation services to sentiment analysis and beyond. The evolution of NLP has been driven by significant advances in machine learning and deep learning, which have enabled more sophisticated and accurate models for language understanding. This book aims to bring these cutting-edge techniques to you in an accessible and practical way, regardless of your current level of expertise.
"""

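# Parse the raw string; the sumy Tokenizer relies on NLTK's tokenizer data
parser = PlaintextParser.from_string(text, Tokenizer("english"))
summarizer = LsaSummarizer()
# Ask for the two highest-ranked sentences
summary = summarizer(parser.document, 2)
for sentence in summary:
    print(sentence)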

Output:

It enables machines to understand, interpret, and generate human language, opening up a world of possibilities for applications ranging from chatbots and translation services to sentiment analysis and beyond.
The evolution of NLP has been driven by significant advances in machine learning and deep learning, which have enabled more sophisticated and accurate models for language understanding.
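For intuition about extractive summarization in general, the sketch below ranks sentences by average word frequency and keeps the top ones in their original order. This is a deliberately simpler technique than the LSA method used in the solution, and it does no stop-word removal. The summarize function and the sample text are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter

def summarize(text, n_sentences=2):
    """Return the n highest-scoring sentences, kept in document order."""
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Word frequencies over the whole text.
    freq = Counter(re.findall(r"[a-z]+", text.lower()))

    def score(sentence):
        words = re.findall(r"[a-z]+", sentence.lower())
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in words) / max(len(words), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:n_sentences])]

sample = "Cats sleep. Cats eat fish. Dogs bark."
summary = summarize(sample, 2)
print(summary)
```

The paragraph from the task can be passed to summarize in the same way; the sentences it selects may differ from the ones LSA picks.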

Exercise 5: Text Classification with scikit-learn

Task: Use scikit-learn to train a Naive Bayes classifier on the following data and predict the sentiment of a new text: "This experience was fantastic."

Texts:

  • "I love this product" (positive)
  • "This is the worst experience" (negative)
  • "Absolutely fantastic!" (positive)
  • "Not good at all" (negative)

Solution:

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

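texts = ["I love this product", "This is the worst experience", "Absolutely fantastic!", "Not good at all"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

# Turn each text into a vector of word counts
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(texts)

# Train the Naive Bayes classifier on the count vectors
classifier = MultinomialNB()
classifier.fit(X, labels)

# Reuse the fitted vectorizer so the new text maps to the same vocabulary
new_text = ["This experience was fantastic"]
X_new = vectorizer.transform(new_text)
prediction = classifier.predict(X_new)
print(prediction)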

Output:

[1]
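To see why the prediction is 1, the following sketch redoes the calculation by hand with only the standard library. It assumes the scikit-learn defaults: CountVectorizer lowercases text and keeps tokens of two or more word characters, and MultinomialNB uses add-one smoothing (alpha=1.0) with class priors estimated from the training labels. The helper names tokenize and log_score are hypothetical.

```python
import math
import re
from collections import Counter

texts = ["I love this product", "This is the worst experience",
         "Absolutely fantastic!", "Not good at all"]
labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

def tokenize(text):
    # Lowercase and keep tokens of two or more word characters,
    # mirroring CountVectorizer's default token pattern.
    return re.findall(r"\b\w\w+\b", text.lower())

vocab = sorted({tok for t in texts for tok in tokenize(t)})

# Word counts per class
counts = {c: Counter() for c in set(labels)}
for t, y in zip(texts, labels):
    counts[y].update(tokenize(t))

def log_score(text, c, alpha=1.0):
    """Log of prior times smoothed word likelihoods for class c."""
    total = sum(counts[c].values())
    score = math.log(labels.count(c) / len(labels))
    for tok in tokenize(text):
        if tok in vocab:  # words never seen in training (e.g. "was") are ignored
            score += math.log((counts[c][tok] + alpha) / (total + alpha * len(vocab)))
    return score

new_text = "This experience was fantastic"
scores = {c: log_score(new_text, c) for c in counts}
prediction = max(scores, key=scores.get)
print(prediction)
```

With 13 vocabulary words, the positive class scores 0.5 × (2/18) × (1/18) × (2/18) and the negative class scores 0.5 × (2/22) × (2/22) × (1/22). The positive score is larger because "fantastic" appears in a positive example and the positive class has fewer total words, so the classifier outputs 1.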

These practical exercises provide hands-on experience with different aspects of NLP using Python. Each exercise is designed to reinforce the concepts discussed in the chapter and help you become proficient in implementing NLP techniques.
