Chapter 10: Visual Exploratory Data Analysis
10.4 Practical Exercises Chapter 10
Working through exercises is a good way to consolidate the concepts from this chapter. The three exercises below cover visual exploratory data analysis, and each is followed by a solution.
Exercise 1: Univariate Analysis with Histograms
Task: Given a dataset of exam scores for students, plot a histogram to understand the distribution of scores.
# Sample data: Exam scores of 50 students
exam_scores = [55, 80, 74, 61, 90, 85, 68, 95, 60, 66, 70, 99, 53, 79, 62, 89, 75, 69, 94, 71, 83, 88, 57, 45, 73, 91, 76, 84, 64, 58, 98, 63, 78, 92, 82, 77, 72, 65, 59, 86, 87, 67, 46, 93, 81, 97, 54, 50, 96, 100]
# Your code here
Solution
import matplotlib.pyplot as plt
plt.hist(exam_scores, bins=10, color='blue', edgecolor='black')
plt.xlabel('Exam Scores')
plt.ylabel('Frequency')
plt.title('Distribution of Exam Scores')
plt.show()
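To read the exact numbers behind the bars, the same binning can be reproduced without drawing anything. The following is a minimal sketch, assuming NumPy is installed. With bins=10 and no explicit range, np.histogram splits the span from the lowest to the highest score into ten equal-width bins, which is the binning plt.hist uses for the call above. The names counts and edges are illustrative and not part of the original solution.

```python
import numpy as np

exam_scores = [55, 80, 74, 61, 90, 85, 68, 95, 60, 66, 70, 99, 53, 79, 62, 89,
               75, 69, 94, 71, 83, 88, 57, 45, 73, 91, 76, 84, 64, 58, 98, 63,
               78, 92, 82, 77, 72, 65, 59, 86, 87, 67, 46, 93, 81, 97, 54, 50,
               96, 100]

# counts[i] is the number of scores in [edges[i], edges[i + 1]);
# the last bin also includes its right edge.
counts, edges = np.histogram(exam_scores, bins=10)

# Print one line per bin so the bar heights can be checked against the plot.
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {n} students")
```

Comparing this table with the plot is a quick check that the bin edges fall where you expect. Here they run from 45 to 100 in steps of 5.5.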
Exercise 2: Bivariate Analysis with Scatter Plot
Task: Create a scatter plot to visualize the relationship between the heights and weights of a group of individuals.
# Sample data: Heights and weights of 10 individuals
heights = [160, 165, 170, 175, 180, 185, 190, 195, 200, 205]
weights = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]
# Your code here
Solution
import matplotlib.pyplot as plt
# Heights go on the x-axis and weights on the y-axis
plt.scatter(heights, weights, c='red', marker='o')
plt.xlabel('Heights (cm)')
plt.ylabel('Weights (kg)')
plt.title('Relationship between Heights and Weights')
plt.show()
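A scatter plot shows the shape of a relationship. A correlation coefficient and a fitted line put numbers on it. The sketch below is one way to do that, assuming NumPy is installed; the variable names r, slope, and intercept are illustrative. In this sample data every 5 cm step in height adds exactly 5 kg, so the points lie on a straight line and the Pearson coefficient is 1 up to floating-point rounding. Real measurements would scatter around the line.

```python
import numpy as np

heights = [160, 165, 170, 175, 180, 185, 190, 195, 200, 205]
weights = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

# Pearson correlation: the off-diagonal entry of the 2x2 correlation matrix.
r = np.corrcoef(heights, weights)[0, 1]

# Least-squares straight line: weight = slope * height + intercept.
slope, intercept = np.polyfit(heights, weights, deg=1)

print(f"Pearson r = {r:.3f}")
print(f"weight = {slope:.2f} * height + {intercept:.2f}")
```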
Exercise 3: Multivariate Analysis using Heatmap
Task: Given a dataset with multiple features, create a heatmap to visualize the correlations between these features.
# Sample data: Randomly generated for 4 features
import numpy as np
import pandas as pd
import seaborn as sns
np.random.seed(42)
data = {'Feature1': np.random.randn(100),
        'Feature2': np.random.randn(100),
        'Feature3': np.random.randn(100),
        'Feature4': np.random.randn(100)}
df = pd.DataFrame(data)
# Your code here
Solution
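import matplotlib.pyplot as plt
# Pairwise Pearson correlations between the four features
corr = df.corr()
# annot=True prints each coefficient inside its cell
sns.heatmap(corr, annot=True, cmap='coolwarm')
plt.show()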
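The heatmap shows every coefficient twice, because the correlation matrix is symmetric, plus a diagonal of ones. The sketch below lists each feature pair once, ordered by the absolute size of its correlation. It assumes NumPy and pandas are installed, and the name pairs is illustrative. The four features are drawn independently, so the off-diagonal values should be small. Any non-zero value here is sampling noise, not a real relationship.

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Same seed and draw order as the exercise, so the values match its DataFrame.
np.random.seed(42)
df = pd.DataFrame({f'Feature{i}': np.random.randn(100) for i in range(1, 5)})

corr = df.corr()

# One entry per unordered pair of features (6 pairs for 4 features).
pairs = {(a, b): corr.loc[a, b] for a, b in combinations(corr.columns, 2)}

# Strongest relationships first, regardless of sign.
for (a, b), value in sorted(pairs.items(), key=lambda kv: abs(kv[1]), reverse=True):
    print(f"{a} vs {b}: {value:+.3f}")
```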