Data Analysis Foundations with Python

Project 3: Capstone Project: Building a Recommender System

Model Building

Recommender systems can be built in several ways. Popular approaches include collaborative filtering, content-based filtering, and hybrid models that combine the two.

For this project, we will use collaborative filtering, implemented with Python's scikit-surprise library. Collaborative filtering infers a user's preferences from the ratings of other users with similar tastes, so it can suggest products the user has not rated yet but is likely to enjoy.
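To make the idea of "users with similar tastes" concrete, here is a minimal sketch in plain Python. It is a neighborhood-style illustration only, not the method scikit-surprise applies in the rest of this project, and the users, products, and ratings are made up for the example.

```python
from math import sqrt

# Made-up ratings for illustration: user -> {product: rating}
ratings = {
    "alice": {"p1": 5, "p2": 4, "p3": 1},
    "bob":   {"p1": 4, "p2": 5, "p3": 2, "p4": 5},
    "carol": {"p1": 1, "p2": 2, "p3": 5, "p4": 1},
}

def cosine_similarity(a, b):
    """Cosine similarity over the products both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[p] * b[p] for p in common)
    norm_a = sqrt(sum(a[p] ** 2 for p in common))
    norm_b = sqrt(sum(b[p] ** 2 for p in common))
    return dot / (norm_a * norm_b)

def most_similar_user(target):
    """Return the other user whose ratings are closest to the target's."""
    others = (u for u in ratings if u != target)
    return max(others, key=lambda u: cosine_similarity(ratings[target], ratings[u]))

neighbor = most_similar_user("alice")
# Products the neighbor has rated that alice has not seen yet
candidates = sorted(set(ratings[neighbor]) - set(ratings["alice"]))
print(neighbor, candidates)
```

In this toy data, alice and bob agree on the products they share, so products bob has rated and alice has not become candidate recommendations for alice.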

Installation and Importing Libraries

First, install the necessary packages:

pip install numpy pandas scikit-surprise

Now, let's import the libraries. Note that scikit-surprise is imported under the name surprise:

from surprise import Reader, Dataset, SVD
from surprise.model_selection import cross_validate
import pandas as pd

Preparing Data for the Model

The scikit-surprise library can load ratings directly from a Pandas DataFrame through Dataset.load_from_df. The DataFrame must contain three columns, in this order: user ID, item ID, and rating. A Reader object tells the library which rating scale to expect, here 1 to 5.

Because the data starts out as a DataFrame, you can clean, filter, and explore it with the usual Pandas tools before handing it to the library.

Let's load our data:

# Read the dataset
df = pd.read_csv('product_interactions.csv')

# Define the reader object and parse the dataframe
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(df[['user_id', 'product_id', 'rating']], reader)

Download the product_interactions.csv file to follow along.
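Before loading the file, it can help to confirm that it has the three expected columns and that every rating falls inside the scale passed to Reader. The following is a minimal sketch using only the standard library. The sample rows are made up, and the helper name find_out_of_scale_rows is hypothetical.

```python
import csv
import io

# Made-up sample in the layout the loading code expects
sample_csv = """user_id,product_id,rating
1,101,5
1,102,3
2,101,4
2,103,6
"""

REQUIRED_COLUMNS = ("user_id", "product_id", "rating")
RATING_MIN, RATING_MAX = 1, 5  # mirrors Reader(rating_scale=(1, 5))

def find_out_of_scale_rows(file_obj):
    """Return the rows whose rating falls outside the declared scale."""
    reader = csv.DictReader(file_obj)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [row for row in reader
            if not RATING_MIN <= float(row["rating"]) <= RATING_MAX]

bad_rows = find_out_of_scale_rows(io.StringIO(sample_csv))
print(bad_rows)
```

For the real dataset, you would pass an open file handle for product_interactions.csv instead of the in-memory sample.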

Building the SVD Model

Singular Value Decomposition (SVD) is one of the most widely used algorithms for collaborative filtering. In recommender systems, the name usually refers to the matrix factorization approach popularized by Simon Funk during the Netflix Prize, and this is what the SVD class in scikit-surprise implements.

A classical SVD splits a complete matrix into three matrices, but a user-item rating matrix is mostly empty, so that decomposition cannot be applied directly. Instead, the algorithm learns a short vector of latent factors for every user and every item, plus a bias term for each, by minimizing the prediction error on the known ratings with stochastic gradient descent. A predicted rating is the global mean rating, plus the user and item biases, plus the dot product of the user's and the item's factor vectors. The latent factors capture hidden patterns among users and items, which is what makes the predictions personalized.
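The latent-factor idea can be sketched in a few lines of plain Python. This is an illustrative toy, not the library's implementation. It predicts a rating as the global mean plus a user bias, an item bias, and the dot product of two small factor vectors, and fits those values with stochastic gradient descent. The ratings, the function name train_mf, and the hyperparameter values are all assumptions chosen for the example.

```python
import random

# Made-up (user, item, rating) triples for illustration
ratings = [
    ("u1", "p1", 5), ("u1", "p2", 4), ("u1", "p3", 1),
    ("u2", "p1", 4), ("u2", "p2", 5), ("u2", "p4", 2),
    ("u3", "p3", 5), ("u3", "p4", 4), ("u3", "p1", 1),
]

def train_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Learn biases and latent factors with stochastic gradient descent."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    b_u = {u: 0.0 for u in users}  # user biases
    b_i = {i: 0.0 for i in items}  # item biases
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}

    def predict(u, i):
        dot = sum(pf * qf for pf, qf in zip(p[u], q[i]))
        return mu + b_u[u] + b_i[i] + dot

    for _ in range(n_epochs):
        for u, i, r in ratings:
            err = r - predict(u, i)
            # Move each parameter against the gradient of the regularized squared error
            b_u[u] += lr * (err - reg * b_u[u])
            b_i[i] += lr * (err - reg * b_i[i])
            for f in range(n_factors):
                pu, qi = p[u][f], q[i][f]
                p[u][f] += lr * (err * qi - reg * pu)
                q[i][f] += lr * (err * pu - reg * qi)
    return predict

def rmse(predict, ratings):
    """Root mean square error of a predict function on known ratings."""
    return (sum((r - predict(u, i)) ** 2 for u, i, r in ratings) / len(ratings)) ** 0.5

baseline = train_mf(ratings, n_epochs=0)  # untrained: close to the global mean
trained = train_mf(ratings, n_epochs=200)
print(rmse(baseline, ratings), rmse(trained, ratings))
```

The error printed here is measured on the same ratings used for training, so it only shows that the fit improves. It says nothing about how well the model generalizes, which is what cross-validation is for.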

Let's build and evaluate our SVD model:

# Create an SVD model object
model = SVD()

# Cross-validate the model
cv_results = cross_validate(model, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

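With verbose=True, this prints the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) for each of the 5 folds, together with their mean and standard deviation and the fit and test times. The same values are returned in the cv_results dictionary. Lower values are better for both metrics.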
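To see what these numbers measure, here is a small sketch that computes RMSE and MAE by hand and splits sample indices into folds. The ratings are made up, and the helper k_fold_indices is a hypothetical, unshuffled simplification, since real cross-validation routines typically shuffle the data first.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error: penalizes large misses more heavily."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error: the average size of a miss."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k folds, unshuffled."""
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size

# Made-up true and predicted ratings for illustration
actual = [5, 3, 4, 1]
predicted = [4, 3, 2, 1]
print(rmse(actual, predicted), mae(actual, predicted))

folds = list(k_fold_indices(10, 5))
print(folds[0])
```

Each rating lands in the test set of exactly one fold, so every rating is predicted once by a model that never saw it during training.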

Making Predictions

Finally, let's make some recommendations:

# Fit the model to the dataset
trainset = data.build_full_trainset()
model.fit(trainset)

# Predict ratings for one user (here user_id=1; the value must match the IDs as stored in df)
user_id = 1
preds = []

# Score every product, including any the user has already rated
for product_id in df['product_id'].unique():
    pred_rating = model.predict(user_id, product_id).est
    preds.append((product_id, pred_rating))

# Sort by predicted rating, highest first, and keep the top 5
top_5_preds = sorted(preds, key=lambda x: x[1], reverse=True)[:5]

print(f"Top 5 product recommendations for user {user_id}:", [x[0] for x in top_5_preds])
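In practice, you will usually want to leave out products the user has already rated. The sketch below shows one way to do that with heapq.nlargest. The data is made up, and predict_rating is a hypothetical stand-in for the model.predict(user_id, product_id).est call, so the example runs without a trained model.

```python
import heapq

# Made-up data for illustration
already_rated = {1: {"p1", "p2"}}  # user -> products they have rated
all_products = ["p1", "p2", "p3", "p4", "p5"]
fake_scores = {"p1": 4.9, "p2": 4.1, "p3": 3.2, "p4": 4.6, "p5": 2.0}

def predict_rating(user_id, product_id):
    """Hypothetical stand-in for model.predict(user_id, product_id).est."""
    return fake_scores[product_id]

def recommend(user_id, n=2):
    """Return the n unrated products with the highest predicted rating."""
    seen = already_rated.get(user_id, set())
    candidates = (p for p in all_products if p not in seen)
    return heapq.nlargest(n, candidates, key=lambda p: predict_rating(user_id, p))

print(recommend(1))
```

With the real DataFrame, the set of rated products for a user could be built from the rows of df whose user_id matches.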

And there you have it! You've just built your own product recommender system. 

Remember, this model is relatively basic but serves as a robust starting point. As you learn more about your data and requirements, you can tune the model or even try other advanced algorithms.

Isn't it delightful to see how lines of code can turn into intelligent systems that can make our lives easier? As you continue your AI Engineering journey, this is just one of the many fascinating projects you'll encounter. If you enjoyed this, make sure to check out our other books and the comprehensive "AI Engineering Journey" that includes a variety of topics to further enhance your skills. 

Now, let's bring this project to its culmination with the Evaluation and Deployment phase. You have defined your problem, gathered and preprocessed data, and even built a working model. What comes next? Well, before your recommender system goes live, you need to rigorously evaluate its performance and prepare it for deployment. Let's dive into it.
