Chapter 9: Deciphering Strings and Patterns
Chapter 9: Practical Exercises of Deciphering Strings and Patterns
The following exercises offer practical applications of the concepts discussed in Chapter 9. They provide hands-on experience with string algorithms and text analysis techniques, reinforcing understanding and demonstrating the utility of these methods in real-world scenarios.
Exercise 1: Implement the Boyer-Moore Algorithm for Pattern Searching
- Objective: Write a function to perform the Boyer-Moore string search algorithm, an efficient method for finding substrings in larger text strings.
- Note: The Boyer-Moore algorithm is complex, focusing on the "bad character" heuristic.
Solution:
def boyer_moore_search(text, pattern):
def bad_char_heuristic(pattern):
bad_char = [-1] * 256
for i in range(len(pattern)):
bad_char[ord(pattern[i])] = i
return bad_char
m = len(pattern)
n = len(text)
bad_char = bad_char_heuristic(pattern)
s = 0
while s <= n - m:
j = m - 1
while j >= 0 and pattern[j] == text[s + j]:
j -= 1
if j < 0:
return f"Pattern occurs at shift {s}"
s += (m - bad_char[ord(text[s + m])] if s + m < n else 1)
else:
s += max(1, j - bad_char[ord(text[s + j])])
return "Pattern not found"
# Example Usage
text = "ABAAABCD"
pattern = "ABC"
print(boyer_moore_search(text, pattern)) # Output: Pattern occurs at shift 4
Exercise 2: Create a Basic Regex for Email Extraction
- Objective: Write a regex pattern to extract email addresses from a given string.
- Note: The regex should match most common email formats.
Solution:
import re
def extract_emails(text):
pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'
return re.findall(pattern, text)
# Example Usage
text = "Please contact us at info@example.com or support@domain.org."
print(extract_emails(text)) # Output: ['info@example.com', 'support@domain.org']
Exercise 3: Implementing a Simple Suffix Array Construction
- Objective: Construct a suffix array for a given text string.
- Note: Suffix arrays are a space-efficient way to store and search for substrings.
Solution:
def build_suffix_array(s):
return sorted(range(len(s)), key=lambda k: s[k:])
# Example Usage
text = "banana"
print(build_suffix_array(text)) # Output: Suffix array indices for the string "banana"
Exercise 4: Sentiment Analysis Using Pre-trained Models
- Objective: Utilize a pre-trained NLP model to perform sentiment analysis on a given text.
- Note: This exercise assumes access to a pre-trained sentiment analysis model, such as those available in libraries like NLTK or TextBlob.
Solution:
from textblob import TextBlob
def analyze_sentiment(text):
analysis = TextBlob(text)
return analysis.sentiment
# Example Usage
text = "I love programming in Python!"
print(analyze_sentiment(text)) # Output: Sentiment(polarity=0.5, subjectivity=0.6)
Chapter 9: Practical Exercises of Deciphering Strings and Patterns
The following exercises offer practical applications of the concepts discussed in Chapter 9. They provide hands-on experience with string algorithms and text analysis techniques, reinforcing understanding and demonstrating the utility of these methods in real-world scenarios.
Exercise 1: Implement the Boyer-Moore Algorithm for Pattern Searching
- Objective: Write a function to perform the Boyer-Moore string search algorithm, an efficient method for finding substrings in larger text strings.
- Note: The Boyer-Moore algorithm is complex, focusing on the "bad character" heuristic.
Solution:
def boyer_moore_search(text, pattern):
def bad_char_heuristic(pattern):
bad_char = [-1] * 256
for i in range(len(pattern)):
bad_char[ord(pattern[i])] = i
return bad_char
m = len(pattern)
n = len(text)
bad_char = bad_char_heuristic(pattern)
s = 0
while s <= n - m:
j = m - 1
while j >= 0 and pattern[j] == text[s + j]:
j -= 1
if j < 0:
return f"Pattern occurs at shift {s}"
s += (m - bad_char[ord(text[s + m])] if s + m < n else 1)
else:
s += max(1, j - bad_char[ord(text[s + j])])
return "Pattern not found"
# Example Usage
text = "ABAAABCD"
pattern = "ABC"
print(boyer_moore_search(text, pattern)) # Output: Pattern occurs at shift 4
Exercise 2: Create a Basic Regex for Email Extraction
- Objective: Write a regex pattern to extract email addresses from a given string.
- Note: The regex should match most common email formats.
Solution:
import re
def extract_emails(text):
pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'
return re.findall(pattern, text)
# Example Usage
text = "Please contact us at info@example.com or support@domain.org."
print(extract_emails(text)) # Output: ['info@example.com', 'support@domain.org']
Exercise 3: Implementing a Simple Suffix Array Construction
- Objective: Construct a suffix array for a given text string.
- Note: Suffix arrays are a space-efficient way to store and search for substrings.
Solution:
def build_suffix_array(s):
return sorted(range(len(s)), key=lambda k: s[k:])
# Example Usage
text = "banana"
print(build_suffix_array(text)) # Output: Suffix array indices for the string "banana"
Exercise 4: Sentiment Analysis Using Pre-trained Models
- Objective: Utilize a pre-trained NLP model to perform sentiment analysis on a given text.
- Note: This exercise assumes access to a pre-trained sentiment analysis model, such as those available in libraries like NLTK or TextBlob.
Solution:
from textblob import TextBlob
def analyze_sentiment(text):
analysis = TextBlob(text)
return analysis.sentiment
# Example Usage
text = "I love programming in Python!"
print(analyze_sentiment(text)) # Output: Sentiment(polarity=0.5, subjectivity=0.6)
Chapter 9: Practical Exercises of Deciphering Strings and Patterns
The following exercises offer practical applications of the concepts discussed in Chapter 9. They provide hands-on experience with string algorithms and text analysis techniques, reinforcing understanding and demonstrating the utility of these methods in real-world scenarios.
Exercise 1: Implement the Boyer-Moore Algorithm for Pattern Searching
- Objective: Write a function to perform the Boyer-Moore string search algorithm, an efficient method for finding substrings in larger text strings.
- Note: The Boyer-Moore algorithm is complex, focusing on the "bad character" heuristic.
Solution:
def boyer_moore_search(text, pattern):
def bad_char_heuristic(pattern):
bad_char = [-1] * 256
for i in range(len(pattern)):
bad_char[ord(pattern[i])] = i
return bad_char
m = len(pattern)
n = len(text)
bad_char = bad_char_heuristic(pattern)
s = 0
while s <= n - m:
j = m - 1
while j >= 0 and pattern[j] == text[s + j]:
j -= 1
if j < 0:
return f"Pattern occurs at shift {s}"
s += (m - bad_char[ord(text[s + m])] if s + m < n else 1)
else:
s += max(1, j - bad_char[ord(text[s + j])])
return "Pattern not found"
# Example Usage
text = "ABAAABCD"
pattern = "ABC"
print(boyer_moore_search(text, pattern)) # Output: Pattern occurs at shift 4
Exercise 2: Create a Basic Regex for Email Extraction
- Objective: Write a regex pattern to extract email addresses from a given string.
- Note: The regex should match most common email formats.
Solution:
import re
def extract_emails(text):
pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'
return re.findall(pattern, text)
# Example Usage
text = "Please contact us at info@example.com or support@domain.org."
print(extract_emails(text)) # Output: ['info@example.com', 'support@domain.org']
Exercise 3: Implementing a Simple Suffix Array Construction
- Objective: Construct a suffix array for a given text string.
- Note: Suffix arrays are a space-efficient way to store and search for substrings.
Solution:
def build_suffix_array(s):
return sorted(range(len(s)), key=lambda k: s[k:])
# Example Usage
text = "banana"
print(build_suffix_array(text)) # Output: Suffix array indices for the string "banana"
Exercise 4: Sentiment Analysis Using Pre-trained Models
- Objective: Utilize a pre-trained NLP model to perform sentiment analysis on a given text.
- Note: This exercise assumes access to a pre-trained sentiment analysis model, such as those available in libraries like NLTK or TextBlob.
Solution:
from textblob import TextBlob
def analyze_sentiment(text):
analysis = TextBlob(text)
return analysis.sentiment
# Example Usage
text = "I love programming in Python!"
print(analyze_sentiment(text)) # Output: Sentiment(polarity=0.5, subjectivity=0.6)
Chapter 9: Practical Exercises of Deciphering Strings and Patterns
The following exercises offer practical applications of the concepts discussed in Chapter 9. They provide hands-on experience with string algorithms and text analysis techniques, reinforcing understanding and demonstrating the utility of these methods in real-world scenarios.
Exercise 1: Implement the Boyer-Moore Algorithm for Pattern Searching
- Objective: Write a function to perform the Boyer-Moore string search algorithm, an efficient method for finding substrings in larger text strings.
- Note: The Boyer-Moore algorithm is complex, focusing on the "bad character" heuristic.
Solution:
def boyer_moore_search(text, pattern):
def bad_char_heuristic(pattern):
bad_char = [-1] * 256
for i in range(len(pattern)):
bad_char[ord(pattern[i])] = i
return bad_char
m = len(pattern)
n = len(text)
bad_char = bad_char_heuristic(pattern)
s = 0
while s <= n - m:
j = m - 1
while j >= 0 and pattern[j] == text[s + j]:
j -= 1
if j < 0:
return f"Pattern occurs at shift {s}"
s += (m - bad_char[ord(text[s + m])] if s + m < n else 1)
else:
s += max(1, j - bad_char[ord(text[s + j])])
return "Pattern not found"
# Example Usage
text = "ABAAABCD"
pattern = "ABC"
print(boyer_moore_search(text, pattern)) # Output: Pattern occurs at shift 4
Exercise 2: Create a Basic Regex for Email Extraction
- Objective: Write a regex pattern to extract email addresses from a given string.
- Note: The regex should match most common email formats.
Solution:
import re
def extract_emails(text):
pattern = r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}'
return re.findall(pattern, text)
# Example Usage
text = "Please contact us at info@example.com or support@domain.org."
print(extract_emails(text)) # Output: ['info@example.com', 'support@domain.org']
Exercise 3: Implementing a Simple Suffix Array Construction
- Objective: Construct a suffix array for a given text string.
- Note: Suffix arrays are a space-efficient way to store and search for substrings.
Solution:
def build_suffix_array(s):
return sorted(range(len(s)), key=lambda k: s[k:])
# Example Usage
text = "banana"
print(build_suffix_array(text)) # Output: Suffix array indices for the string "banana"
Exercise 4: Sentiment Analysis Using Pre-trained Models
- Objective: Utilize a pre-trained NLP model to perform sentiment analysis on a given text.
- Note: This exercise assumes access to a pre-trained sentiment analysis model, such as those available in libraries like NLTK or TextBlob.
Solution:
from textblob import TextBlob
def analyze_sentiment(text):
analysis = TextBlob(text)
return analysis.sentiment
# Example Usage
text = "I love programming in Python!"
print(analyze_sentiment(text)) # Output: Sentiment(polarity=0.5, subjectivity=0.6)