Chapter 2: Python and Essential Libraries for Data Science
Practical Exercises: Chapter 2
Exercise 1: Working with NumPy Arrays
Task: Create a NumPy array with the values [10, 20, 30, 40, 50, 60]. Reshape it into a 2x3 array and calculate the sum of all elements.
Solution:
import numpy as np
# Create a NumPy array
array = np.array([10, 20, 30, 40, 50, 60])
# Reshape the array into 2x3
reshaped_array = array.reshape(2, 3)
# Calculate the sum of all elements
total_sum = np.sum(reshaped_array)
print("Reshaped Array:\n", reshaped_array)
print("Total Sum:", total_sum)
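The reshape only works because the array holds exactly 2 * 3 = 6 values. As a minimal sketch of that rule (the names values, grid, inferred, and reshape_failed are illustrative and not part of the exercise), the following also shows per-row and per-column sums alongside the overall total:

```python
import numpy as np

values = np.array([10, 20, 30, 40, 50, 60])

# reshape needs the element count to match the new shape: 2 * 3 == 6
grid = values.reshape(2, 3)

# Passing -1 asks NumPy to infer that dimension from the array size
inferred = values.reshape(2, -1)

row_sums = grid.sum(axis=1)  # one sum per row
col_sums = grid.sum(axis=0)  # one sum per column
total = grid.sum()           # sum of all elements

# A five-element array cannot be reshaped into 2x3
try:
    np.array([10, 20, 30, 40, 50]).reshape(2, 3)
    reshape_failed = False
except ValueError:
    reshape_failed = True

print("Row sums:", row_sums)
print("Column sums:", col_sums)
print("Total:", total)
print("Five-element reshape failed:", reshape_failed)
```

Summing along axis=1 collapses the columns and leaves one value per row, while axis=0 does the opposite.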
Exercise 2: Basic Data Manipulation with Pandas
Task: Create a Pandas DataFrame with the following data:

Name     Age  Salary
Alice    25   50000
Bob      30   60000
Charlie  35   70000
David    40   80000

Then:
- Select the Name and Salary columns.
- Filter rows where the Age is greater than 30.
Solution:
import pandas as pd
# Create a DataFrame
data = {
'Name': ['Alice', 'Bob', 'Charlie', 'David'],
'Age': [25, 30, 35, 40],
'Salary': [50000, 60000, 70000, 80000]
}
df = pd.DataFrame(data)
# Select Name and Salary columns
selected_columns = df[['Name', 'Salary']]
print("Selected Columns:\n", selected_columns)
# Filter rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]
print("\nFiltered DataFrame (Age > 30):\n", filtered_df)
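The two steps in this exercise can also be combined. The sketch below is one possible variation rather than part of the original solution: it passes the row condition and the column list to .loc in a single call. The variable name older_names_salaries is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "David"],
    "Age": [25, 30, 35, 40],
    "Salary": [50000, 60000, 70000, 80000],
})

# First argument: boolean mask selecting rows; second: list of columns to keep
older_names_salaries = df.loc[df["Age"] > 30, ["Name", "Salary"]]
print(older_names_salaries)
```

Because the comparison is strict, Bob (Age 30) is excluded and only Charlie and David remain.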
Exercise 3: Data Visualization with Matplotlib
Task: Plot a line graph with the x-values [1, 2, 3, 4, 5] and y-values [10, 20, 25, 40, 50]. Label the x-axis and y-axis, and add a title to the graph.
Solution:
import matplotlib.pyplot as plt
# Data
x = [1, 2, 3, 4, 5]
y = [10, 20, 25, 40, 50]
# Create a line plot
plt.plot(x, y, marker='o', color='b')
# Add labels and title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Line Graph Example")
# Show the plot
plt.show()
Exercise 4: Visualizing Data with Seaborn
Task: Load the Iris dataset using Seaborn and create a pair plot that shows relationships between the features. Use the species column as the hue to differentiate between species.
Solution:
import seaborn as sns
import matplotlib.pyplot as plt
# Load the Iris dataset
iris = sns.load_dataset('iris')
# Create a pair plot
sns.pairplot(iris, hue='species')
# Show the plot
plt.show()
Exercise 5: Using Scikit-learn for Classification
Task: Use the Iris dataset from Scikit-learn and train a Logistic Regression model. Split the dataset into training and testing sets (80% train, 20% test), train the model, and then evaluate its accuracy on the test set.
Solution:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
# Load the Iris dataset
iris = load_iris()
X = iris.data
y = iris.target
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train the logistic regression model
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
# Make predictions on the test set
y_pred = model.predict(X_test)
# Calculate the accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.2f}")
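To confirm that the split matches the 80/20 requirement, it can help to inspect the shapes of the resulting arrays. The sketch below does that and, as an optional cross-check, compares accuracy_score with the estimator's own score method, which returns mean accuracy for classifiers. The names manual_accuracy and score_accuracy are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# The Iris dataset has 150 samples with 4 features each
X, y = load_iris(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# With test_size=0.2, 30 of the 150 samples go to the test set
print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)

model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Two equivalent ways to compute accuracy on the test set
manual_accuracy = accuracy_score(y_test, model.predict(X_test))
score_accuracy = model.score(X_test, y_test)
print(f"accuracy_score: {manual_accuracy:.2f}")
print(f"model.score:    {score_accuracy:.2f}")
```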
Exercise 6: Working with Google Colab
Task: In Google Colab, create a new notebook and write a simple TensorFlow program that checks whether a GPU is available, then performs a basic matrix multiplication using TensorFlow. TensorFlow places the operation on the GPU when one is found and falls back to the CPU otherwise.
Solution:
import tensorflow as tf
# Check if a GPU is available
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
# Create two matrices and perform matrix multiplication
a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[1.0, 0.0], [0.0, 1.0]])
# Multiply the matrices
result = tf.matmul(a, b)
# Print the result
print("Result of matrix multiplication:\n", result)
These practical exercises reinforce the key concepts discussed in Chapter 2. You have worked with NumPy arrays and Pandas DataFrames, visualized data using Matplotlib and Seaborn, and trained a machine learning model with Scikit-learn, gaining hands-on experience with essential data science tools. You have also seen how to use Google Colab to run machine learning experiments on cloud resources.