# Chapter 8: Understanding EDA

## 8.4 Practical Exercises for Chapter 8: Understanding EDA

### Exercise 1: Understanding the Importance of EDA

Load a dataset of your choice. Perform initial explorations like `.head()`, `.info()`, and `.describe()` to understand the data.

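    # Example Solution:
    import pandas as pd

    # Load the dataset (replace the path with your own file)
    df = pd.read_csv('your_dataset.csv')

    print(df.head())      # first five rows
    df.info()             # prints dtypes and non-null counts itself; it returns None
    print(df.describe())  # summary statistics for the numerical columns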
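If you do not have a CSV file at hand, the same three calls can be tried on a small in-memory table. The sketch below is only an illustration: the rows are made up, and the column names simply mirror the ones used in the example solutions of this section.

```python
import pandas as pd

# Made-up rows for illustration only; not a real dataset.
df = pd.DataFrame(
    {
        "Gender": ["F", "M", "F", "M", "F"],
        "Country": ["US", "DE", "US", "IN", "DE"],
        "Age": [23, 35, 41, 29, 52],
        "Income": [32000.0, 48000.0, 61000.0, None, 150000.0],
    }
)

print(df.head(3))  # first three rows
df.info()          # dtypes and non-null counts (Income shows one missing value)

# describe() covers only the numerical columns by default.
summary = df.describe()
print(summary)

# include="all" adds the text columns (count, unique, top, freq).
print(df.describe(include="all"))
```

Note that the `count` row of `describe()` skips missing values, so it is a quick way to spot columns with gaps.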


### Exercise 2: Identifying Types of Data

Identify at least two columns in your dataset which contain categorical data and two which contain numerical data.

    # Example Solution:
    # Categorical: 'Gender', 'Country'
    # Numerical: 'Age', 'Income'
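One possible way to get a first guess at the column types is to let pandas split the columns by dtype with `select_dtypes`. This is a sketch under assumptions: the data is made up, and the dtype is only a hint, since categories stored as numeric codes (for example postal codes) would land in the numerical group.

```python
import pandas as pd

# Made-up rows for illustration only.
df = pd.DataFrame(
    {
        "Gender": ["F", "M", "F", "M", "F"],
        "Country": ["US", "DE", "US", "IN", "DE"],
        "Age": [23, 35, 41, 29, 52],
        "Income": [32000.0, 48000.0, 61000.0, 39000.0, 150000.0],
    }
)

# Split the columns by dtype: numeric versus everything else.
numerical_cols = df.select_dtypes(include="number").columns.tolist()
categorical_cols = df.select_dtypes(exclude="number").columns.tolist()

print("Numerical:", numerical_cols)
print("Categorical:", categorical_cols)

# Few distinct values relative to the row count is another hint
# that a column is categorical.
print(df.nunique())
```

Treat the result as a starting point and confirm each column against what it actually represents.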

### Exercise 3: Calculating Descriptive Statistics

Calculate the mean, median, and standard deviation of a numerical column in your dataset.

    # Example Solution:
    mean_age = df['Age'].mean()
    median_age = df['Age'].median()
    std_age = df['Age'].std()

    print(f"Mean Age: {mean_age}")
    print(f"Median Age: {median_age}")
    print(f"Standard Deviation of Age: {std_age}")
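To see how these three statistics behave, it can help to run them on a handful of values you can check by hand. The sketch below uses made-up ages. It also shows two details worth knowing: pandas' `.std()` returns the sample standard deviation by default (`ddof=1`), and a single extreme value moves the mean far more than the median.

```python
import pandas as pd

# Made-up ages for illustration only.
ages = pd.Series([23, 35, 41, 29, 52], name="Age")

mean_age = ages.mean()
median_age = ages.median()
std_sample = ages.std()            # sample standard deviation (ddof=1, the default)
std_population = ages.std(ddof=0)  # population standard deviation

print(f"Mean: {mean_age}, Median: {median_age}")
print(f"Sample std: {std_sample:.2f}, Population std: {std_population:.2f}")

# Add one extreme value and compare how far each statistic moves.
ages_with_outlier = pd.concat([ages, pd.Series([120])], ignore_index=True)
mean_outlier = ages_with_outlier.mean()
median_outlier = ages_with_outlier.median()

print(f"With outlier -> Mean: {mean_outlier}, Median: {median_outlier}")
```

In this toy data the mean rises from 36 to 50 once the outlier is added, while the median only moves from 35 to 38, which is why the median is often reported alongside the mean.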

### Exercise 4: Understanding Skewness and Kurtosis

Compute the skewness and kurtosis for a numerical column in your dataset.

    # Example Solution:
    skewness = df['Income'].skew()
    # .kurt() reports excess kurtosis (Fisher's definition), so a normal distribution scores 0
    kurtosis = df['Income'].kurt()

    print(f"Skewness of Income: {skewness}")
    print(f"Kurtosis of Income: {kurtosis}")
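As a rough guide to reading these numbers, the sketch below compares a symmetric series with one that has a long right tail. The values are made up. Keep in mind that `.skew()` and `.kurt()` are bias-corrected sample estimates, so results on very small samples like these are only illustrative.

```python
import pandas as pd

# Made-up values for illustration only.
symmetric = pd.Series([1, 2, 3, 4, 5])
right_tailed = pd.Series([1, 1, 2, 2, 3, 20])

skew_symmetric = symmetric.skew()
skew_right = right_tailed.skew()

# kurt() uses Fisher's definition: 0 for a normal distribution,
# negative for flatter shapes, positive for heavier tails.
kurt_symmetric = symmetric.kurt()
kurt_right = right_tailed.kurt()

print(f"Symmetric    -> skew: {skew_symmetric:.3f}, kurtosis: {kurt_symmetric:.3f}")
print(f"Right-tailed -> skew: {skew_right:.3f}, kurtosis: {kurt_right:.3f}")
```

A skewness near zero suggests a roughly symmetric column, a positive value points to a long right tail (common for income data), and a large positive kurtosis indicates that extreme values carry much of the spread.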
