Data Analysis Foundations with Python

Chapter 8: Understanding EDA

8.4 Practical Exercises for Chapter 8: Understanding EDA

Exercise 1: Understanding the Importance of EDA

Load a dataset of your choice. Perform initial explorations with .head(), .info(), and .describe() to understand the data.

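import pandas as pd

# Example Solution:
df = pd.read_csv('your_dataset.csv')  # replace with the path to your own file
print(df.head())      # first five rows
df.info()             # prints dtypes and non-null counts itself; it returns None
print(df.describe())  # summary statistics for the numerical columns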
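
If you do not have a CSV file at hand, the same first-look steps can be tried on a small in-memory table. The sketch below is only an illustration: the CSV text, the `df_demo` name, and the values are made up, with column names borrowed from the later exercises. It also adds two checks that often accompany a first exploration, the table's shape and the number of missing values per column.

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for your_dataset.csv.
# The third row deliberately leaves Income empty.
csv_text = """Gender,Country,Age,Income
F,US,34,52000
M,DE,45,61000
F,US,29,
M,FR,52,75000
"""

# read_csv accepts any file-like object, so StringIO works in place of a path.
df_demo = pd.read_csv(io.StringIO(csv_text))

n_rows, n_cols = df_demo.shape              # (rows, columns)
missing_per_column = df_demo.isna().sum()   # number of missing values per column

print(f"Shape: {n_rows} rows x {n_cols} columns")
print(missing_per_column)
```

A non-zero count in `missing_per_column` is a cue to decide how missing values should be handled before computing statistics.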


Exercise 2: Identifying Types of Data

Identify at least two columns in your dataset that contain categorical data and two that contain numerical data.

# Example Solution:
# Categorical: 'Gender', 'Country'
# Numerical: 'Age', 'Income'
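
Rather than reading the column list by eye, one possible approach is to let pandas split the columns by dtype. The sketch below assumes a small made-up table (`df_types` and its values are hypothetical) and treats every non-numeric column as categorical, which is a simplification: it would also catch dates and booleans, and it would miss categories stored as numeric codes.

```python
import pandas as pd

# Hypothetical sample data; column names mirror the example solution above.
df_types = pd.DataFrame({
    "Gender": ["F", "M", "F", "M"],
    "Country": ["US", "DE", "US", "FR"],
    "Age": [34, 45, 29, 52],
    "Income": [52000.0, 61000.0, 48000.0, 75000.0],
})

# Columns with an integer or float dtype.
numerical_cols = df_types.select_dtypes(include="number").columns.tolist()

# Everything else is treated as categorical in this simplified sketch.
categorical_cols = df_types.select_dtypes(exclude="number").columns.tolist()

print("Numerical:", numerical_cols)
print("Categorical:", categorical_cols)
```

The dtype is only a first hint. A column such as a postal code may be stored as a number yet behave as a category, so it is worth checking the result against what each column means.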

Exercise 3: Calculating Descriptive Statistics

Calculate the mean, median, and standard deviation of a numerical column in your dataset.

# Example Solution:
mean_age = df['Age'].mean()
median_age = df['Age'].median()
std_age = df['Age'].std()

print(f"Mean Age: {mean_age}")
print(f"Median Age: {median_age}")
print(f"Standard Deviation of Age: {std_age}")
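
One detail worth knowing when comparing results across tools: pandas' .std() computes the sample standard deviation (it divides by n - 1, i.e. ddof=1) by default, whereas the population version divides by n. The sketch below uses a small made-up series (`ages` is hypothetical data) to show both.

```python
import pandas as pd

# Hypothetical ages, chosen so the arithmetic is easy to follow by hand.
ages = pd.Series([20, 30, 40, 50, 60], name="Age")

mean_value = ages.mean()
median_value = ages.median()

# Default: sample standard deviation (ddof=1, divides by n - 1).
sample_std = ages.std()

# Population standard deviation (ddof=0, divides by n).
population_std = ages.std(ddof=0)

print(f"Mean: {mean_value}, Median: {median_value}")
print(f"Sample std (ddof=1): {sample_std:.3f}")
print(f"Population std (ddof=0): {population_std:.3f}")
```

Here the squared deviations from the mean sum to 1000, so the sample variance is 1000 / 4 = 250 and the population variance is 1000 / 5 = 200. The gap between the two shrinks as the number of rows grows.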

Exercise 4: Understanding Skewness and Kurtosis

Compute the skewness and kurtosis for a numerical column in your dataset.

# Example Solution:
skewness = df['Income'].skew()
kurtosis = df['Income'].kurt()

print(f"Skewness of Income: {skewness}")
print(f"Kurtosis of Income: {kurtosis}")
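
To make the two numbers easier to read, the sketch below compares a symmetric series with a right-skewed one. Both series are made-up illustrations, not taken from any dataset. Note that pandas' .kurt() reports excess kurtosis (Fisher's definition), so a normal distribution scores about 0 rather than 3.

```python
import pandas as pd

# Hypothetical data: evenly spread values versus values with one large outlier.
symmetric = pd.Series([1, 2, 3, 4, 5])
right_skewed = pd.Series([1, 1, 2, 2, 3, 50])

# Skewness near 0 suggests symmetry; a positive value suggests a long right tail.
skew_symmetric = symmetric.skew()
skew_right = right_skewed.skew()

# Excess kurtosis: negative means flatter or lighter tails than a normal distribution.
kurt_symmetric = symmetric.kurt()
kurt_right = right_skewed.kurt()

print(f"Symmetric    -> skew: {skew_symmetric:.3f}, kurtosis: {kurt_symmetric:.3f}")
print(f"Right-skewed -> skew: {skew_right:.3f}, kurtosis: {kurt_right:.3f}")
```

A column such as income often resembles the second series: a few very large values pull the mean above the median and produce positive skewness.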
