Starbucks Capstone Challenge Using FunkSVD!

Let’s understand and analyze how Starbucks customers respond to various offers, so that Starbucks can in turn serve its customers in the best way possible!

Feb 18, 2021

Introduction:

Starbucks is one of the largest and most famous coffee brands in the world, and in the US it can feel like there is a store on every street! In the US alone, Starbucks has attracted 18.9 million people, a number that is growing steadily, and worldwide it has 31,256 stores. So Starbucks has a great responsibility to provide value to its customers through various offers, so that existing customers keep coming back regularly and new potential customers arrive through referrals and other channels. As part of the Udacity Nanodegree program, let’s explore how members respond to various offers at Starbucks. I thank Udacity and Starbucks for providing the simulated data for this capstone challenge, and now it’s time to jump into the project :)

Our Strategy:

I recommend following the CRISP-DM process when working on data science projects, so that you can analyze the problem and answer your questions systematically.

CRISP-DM (Cross-Industry Standard Process for Data Mining) is a widely used process for structuring data science work.

The phases of the CRISP-DM process are:

Business Understanding
Data Understanding
Data Preparation
Data Modelling
Evaluation
Deployment

The data sets for this project are provided by Starbucks & Udacity in three files:

portfolio.json — containing offer ids and metadata about each offer (duration, type, etc.)
profile.json — demographic data for each customer.
transcript.json — records for transactions, offers received, offers viewed, and offers completed.

To gain insights from these data sets, we need to combine them and then apply data analysis and modeling techniques. The fields in each JSON file are described below:

transcript

event (str) — record description (i.e. transaction, offer received, offer viewed, etc.)
person (str) — customer id
time (int) — time in hours since the start of the test. The data begins at time t=0
value (dict of strings) — either an offer id or a transaction amount, depending on the record

portfolio

id (string) — offer id
offer_type (string) — the type of offer, i.e. BOGO, discount, informational
difficulty (int) — the minimum spend required to complete an offer
reward (int) — the reward given for completing an offer
duration (int) — time for the offer to be open, in days
channels (list of strings)

profile

age (int) — age of the customer
became_member_on (int) — the date when the customer created an app account
gender (str) — gender of the customer (note some entries contain ‘O’ for other rather than M or F)
id (str) — customer-id
income (float) — customer’s income
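
Because the value column of the transcript holds a dict, it usually has to be flattened before the three data sets can be combined. Here is a minimal sketch of that step on a few toy rows. The dict keys ("offer id", "offer_id", "amount") and the helper name expand_value are assumptions made for illustration; they are not taken from the project code.

```python
import pandas as pd

# Toy rows shaped like transcript.json. The dict keys used here
# ("offer id", "offer_id", "amount") are assumptions for illustration.
transcript = pd.DataFrame({
    "person": ["u1", "u1", "u2"],
    "event": ["offer received", "transaction", "offer completed"],
    "time": [0, 6, 12],
    "value": [{"offer id": "o1"}, {"amount": 4.5}, {"offer_id": "o1", "reward": 5}],
})


def expand_value(df: pd.DataFrame) -> pd.DataFrame:
    """Split the `value` dict into flat `offer_id` and `amount` columns."""
    out = df.drop(columns="value").copy()
    # The offer id may be stored under either spelling of the key.
    out["offer_id"] = df["value"].apply(lambda v: v.get("offer id", v.get("offer_id")))
    # Only transaction records carry an amount; other rows get a missing value.
    out["amount"] = df["value"].apply(lambda v: v.get("amount"))
    return out


flat = expand_value(transcript)
print(flat)
```

Once the column is flat, the transcript can be merged with the portfolio on the offer id and with the profile on the customer id.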

Data Preparation:

This is a crucial step, and it usually takes up most of the time before the modeling and deployment tasks. Data cleaning techniques such as removing rows and columns that are not required, one-hot encoding, and combining data frames are used to reshape the data so that it is ready for modeling!

At this stage, the cleaned and processed data looks like this:
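
As an illustration of the one-hot encoding step, here is a small sketch on a toy portfolio. The column names follow the portfolio description above, but the helper clean_portfolio and the exact set of output columns are assumptions, not the project's actual cleaning code.

```python
import pandas as pd

# Toy portfolio with the fields described earlier (values are made up).
portfolio = pd.DataFrame({
    "id": ["o1", "o2", "o3"],
    "offer_type": ["bogo", "discount", "informational"],
    "difficulty": [10, 7, 0],
    "reward": [10, 3, 0],
    "duration": [7, 10, 4],
    "channels": [["email", "mobile", "social"], ["web", "email"], ["email", "mobile"]],
})


def clean_portfolio(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning step: one-hot encode channels and offer type."""
    out = df.rename(columns={"id": "offer_id"}).copy()
    # One column per channel: 1 if the offer was sent through it, else 0.
    for channel in ["web", "email", "mobile", "social"]:
        out[channel] = out["channels"].apply(lambda chans, c=channel: int(c in chans))
    # One-hot encode the offer type (bogo / discount / informational).
    type_dummies = pd.get_dummies(out["offer_type"], dtype=int)
    # Drop the original list column; keep offer_type for later grouping.
    return pd.concat([out.drop(columns=["channels"]), type_dummies], axis=1)


cleaned_portfolio = clean_portfolio(portfolio)
print(cleaned_portfolio)
```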

We will use this **cleaned_portfolio** for modeling!

Modeling:

I decided to create a user_item matrix and apply the well-known FunkSVD algorithm to recommend new offers to existing customers, and separately to recommend the best offers to potential new customers. FunkSVD uses matrix factorization: it learns latent factors that capture similarities between customers and offers, so that customers can be recommended offers similar to the ones they already responded to. Netflix uses recommendation algorithms of this kind to suggest movies and shows based on your taste and interest. If you are a new user, Netflix either asks you to pick some genres you like or, if you skip that step, recommends the biggest hits and most popular titles! We will follow the same approach here.

Building the user_item matrix takes a long time because of the large number of records, and it has to be computed twice: once for the train user_item matrix and once for the test user_item matrix.
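
To make the idea of the user_item matrix concrete, here is a minimal sketch. The post does not spell out what each cell holds, so this assumes a cell is 1 if the customer completed the offer, 0 if the offer was sent but not completed, and missing if the offer was never sent. The function name build_user_item and the toy events are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy event log: one row per (person, offer, event).
events = pd.DataFrame({
    "person":   ["u1", "u1", "u1", "u2", "u2", "u3"],
    "offer_id": ["o1", "o1", "o2", "o1", "o2", "o2"],
    "event":    ["offer received", "offer completed", "offer received",
                 "offer received", "offer received", "offer received"],
})


def build_user_item(df: pd.DataFrame) -> pd.DataFrame:
    """Rows are users, columns are offers; NaN means the offer was never sent."""
    sent = df[df["event"] == "offer received"].groupby(["person", "offer_id"]).size()
    done = df[df["event"] == "offer completed"].groupby(["person", "offer_id"]).size()
    # For every pair that was sent, record 1.0 if completed at least once, else 0.0.
    responded = (done.reindex(sent.index, fill_value=0) > 0).astype(float)
    # Pairs that were never sent are absent, so unstack leaves them as NaN.
    return responded.unstack("offer_id")


user_item = build_user_item(events)
print(user_item)
```

Using grouped operations instead of looping over every user-offer pair is one way to keep the build time manageable on the full data.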

Algorithm:

We use FunkSVD to factorize the user_item matrix into a user matrix (users X latent features) and an offer matrix (latent features X offers). FunkSVD is used because the matrix contains missing values, and standard SVD cannot be computed on a matrix with missing entries. To test our predictions, we split the records into a training set (70%) and a testing set (30%). We use the earlier records as the training set to build the model, and then test the same model on the later records.
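
The time-based 70/30 split described above could look roughly like this. It is a sketch on toy data; the helper name split_by_time and the choice to cut by row count after sorting on the time column are assumptions.

```python
import pandas as pd


def split_by_time(records: pd.DataFrame, train_frac: float = 0.7):
    """Earlier records go to the training set, later records to the test set."""
    ordered = records.sort_values("time", kind="stable").reset_index(drop=True)
    cutoff = int(round(len(ordered) * train_frac))
    return ordered.iloc[:cutoff], ordered.iloc[cutoff:]


# Toy records with shuffled timestamps (hours since the start of the test).
records = pd.DataFrame({
    "person": [f"u{i}" for i in range(10)],
    "time": [30, 0, 12, 48, 6, 24, 18, 54, 36, 42],
})

train, test = split_by_time(records)
print(len(train), len(test))
```

Splitting on time rather than at random keeps the test set strictly in the future relative to the training set, which matches how the model would be used in practice.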

Metrics:

We evaluate the model using Mean Squared Error (MSE) and track it over the iterations of the FunkSVD algorithm. The error is computed as follows: for each user-offer pair where the offer was sent, the error is the actual value minus the dot product of the user's and the offer's latent features. We sum the squared errors over the matrix and divide by the number of such pairs to get the MSE.
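
The procedure above can be sketched as a plain gradient-descent FunkSVD in NumPy. This is an illustrative version, not the project's exact code: the learning rate, iteration count, random initialisation, and the function name funk_svd are all assumptions. Only the default of 10 latent features comes from the post.

```python
import numpy as np


def funk_svd(ratings, latent_features=10, learning_rate=0.005, iters=100, seed=0):
    """Factorize `ratings` (NaN = offer not sent) into user and offer matrices."""
    n_users, n_offers = ratings.shape
    rng = np.random.default_rng(seed)
    user_mat = rng.random((n_users, latent_features))
    offer_mat = rng.random((latent_features, n_offers))
    # Only user-offer pairs where the offer was actually sent are used.
    observed = np.argwhere(~np.isnan(ratings))
    mse_history = []
    for _ in range(iters):
        sse = 0.0
        for i, j in observed:
            # Error = actual value minus dot product of the latent features.
            error = ratings[i, j] - user_mat[i, :] @ offer_mat[:, j]
            sse += error ** 2
            # Gradient step on both factors, using the pre-update user row.
            user_row = user_mat[i, :].copy()
            user_mat[i, :] += learning_rate * 2 * error * offer_mat[:, j]
            offer_mat[:, j] += learning_rate * 2 * error * user_row
        # MSE accumulated during this sweep over the observed pairs.
        mse_history.append(sse / len(observed))
    return user_mat, offer_mat, mse_history


# Toy 4-user x 3-offer matrix with missing entries.
ratings = np.array([
    [1.0, 0.0, np.nan],
    [np.nan, 1.0, 0.0],
    [1.0, np.nan, 0.0],
    [0.0, 1.0, np.nan],
])
user_mat, offer_mat, history = funk_svd(ratings, latent_features=3, iters=200)
print(f"first MSE: {history[0]:.4f}, last MSE: {history[-1]:.4f}")
```

Tracking the MSE per iteration makes it easy to see whether training is still improving or has flattened out.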

Our user_item matrix looks like this:

(each row is a real user_id and each column is an offer_id):

**This is just the .head() of the user_item matrix, since the full matrix I got is pretty huge! And this is with 10 latent factors (the default I chose).**

Evaluation:

We ran FunkSVD on the train user_item matrix with 5, 10, and 15 latent features, and then computed the Mean Squared Error (MSE) on the test data. With 15 latent features the MSE is 0.003823; with 10 it is 0.006241; with 5 it is 0.022370. In our experiments the model with 15 latent features performed best, with the lowest test MSE. More latent features give the model more capacity to capture the relationship between users and offers, although adding too many can lead to overfitting.
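
Here is one way the test MSE could be computed from the factor matrices. It is a sketch under assumptions: the function name holdout_mse is hypothetical, and skipping test pairs whose user or offer never appeared in training is a choice made for this example, since the factorization has no latent features for them.

```python
import numpy as np
import pandas as pd


def holdout_mse(test_matrix, train_users, train_offers, user_mat, offer_mat):
    """MSE over test pairs whose user and offer were both seen in training."""
    user_pos = {u: i for i, u in enumerate(train_users)}
    offer_pos = {o: j for j, o in enumerate(train_offers)}
    sq_errors = []
    for user, row in test_matrix.iterrows():
        for offer, actual in row.items():
            # Skip unsent offers and users/offers without learned factors.
            if pd.isna(actual) or user not in user_pos or offer not in offer_pos:
                continue
            pred = user_mat[user_pos[user], :] @ offer_mat[:, offer_pos[offer]]
            sq_errors.append((actual - pred) ** 2)
    return float(np.mean(sq_errors)) if sq_errors else float("nan")


# Toy factors chosen so predictions are easy to read: u1 -> o1, u2 -> o2.
train_users, train_offers = ["u1", "u2"], ["o1", "o2"]
user_mat = np.array([[1.0, 0.0], [0.0, 1.0]])
offer_mat = np.array([[1.0, 0.0], [0.0, 1.0]])

# u3 is a new user in the test period and is skipped.
test_matrix = pd.DataFrame(
    [[1.0, 1.0], [0.0, np.nan]], index=["u1", "u3"], columns=["o1", "o2"]
)
mse = holdout_mse(test_matrix, train_users, train_offers, user_mat, offer_mat)
print(mse)
```

Running this once per candidate number of latent features gives a comparison like the one reported above.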

What will be the offer recommendations for an existing customer?

Here we get user_mat and offer_mat from the FunkSVD algorithm. user_mat is [user_id (rows) X latent_factors (cols)] and offer_mat is [latent_factors (rows) X offer_id (cols)]. In the representation below, for the user_id ‘0610b486422d4921ae7d2bf64640c50b’, we recommend more offers based on the offers this user has already received, since this is an existing customer!
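
Given those two matrices, a recommendation for an existing customer boils down to one matrix-vector product followed by a sort. The sketch below uses toy ids and hand-picked factor values; the function name recommend_offers and the top_n parameter are hypothetical.

```python
import numpy as np


def recommend_offers(user_id, user_ids, offer_ids, user_mat, offer_mat, top_n=3):
    """Rank offers for one existing user by predicted response."""
    row = user_ids.index(user_id)
    # Predicted score for every offer: user latent vector times offer matrix.
    scores = user_mat[row, :] @ offer_mat
    # Highest predicted score first.
    order = np.argsort(-scores, kind="stable")[:top_n]
    return [offer_ids[j] for j in order]


user_ids = ["u1", "u2"]
offer_ids = ["o1", "o2", "o3"]
user_mat = np.array([[1.0, 0.0], [0.0, 1.0]])             # users X latent factors
offer_mat = np.array([[0.2, 0.9, 0.5], [0.8, 0.1, 0.4]])  # latent factors X offers

print(recommend_offers("u1", user_ids, offer_ids, user_mat, offer_mat, top_n=2))
print(recommend_offers("u2", user_ids, offer_ids, user_mat, offer_mat, top_n=2))
```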

What will be the new recommendations for a potential new customer?

You can see that Offer 7 is the best offer, with the highest sales, which means that people liked Offer 7 the most. It is followed by the rest of the recommended offers!

What percentage of each gender group prefers the ‘BOGO’ (buy one, get one) offer versus the standard discount offer?

We can see that the male population prefers ‘discount’ over ‘bogo’, while the female population at Starbucks slightly prefers ‘bogo’ over ‘discount’. Customers of other genders, and those who did not want to share their gender, slightly prefer ‘discount’ over ‘bogo’.
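
A breakdown like this can be computed with a row-normalized cross-tabulation. The sketch below uses made-up rows, so the percentages it prints are illustrative only and are not the project's results. It assumes one row per completed offer with gender and offer_type columns.

```python
import pandas as pd

# Made-up completed offers: one row per completion (illustrative values only).
completed = pd.DataFrame({
    "gender": ["M", "M", "M", "F", "F", "F", "F", "O"],
    "offer_type": ["discount", "discount", "bogo", "bogo",
                   "bogo", "bogo", "discount", "discount"],
})

# normalize="index" turns counts into per-gender shares; times 100 for percent.
share = pd.crosstab(completed["gender"], completed["offer_type"], normalize="index") * 100
print(share.round(1))
```

Normalizing within each gender matters here, because the groups are of different sizes and raw counts would mostly reflect group size.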

What percentage of each gender group responds to offers through each channel: ‘web’ / ‘email’ / ‘mobile’ / ‘social’?

We can clearly see that, across all groups, most responses to offers came via ‘social’. The female population responded most via ‘social’ and next via ‘mobile’ SMS, while the male population responded most via ‘social’, then via ‘mobile’, and then via ‘web’!

Improvement:

Although we have built a recommendation engine, it is not helpful for potential new customers, because FunkSVD has no history to learn from for them. So, for new customers, I used a rank-based recommendation system to give out the best offers. This ‘new customer’ problem is known as the ‘Cold Start Problem’ in data science. With the rank-based system handling it, we can make recommendations for existing customers as well as potential new customers!
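
A rank-based recommender can be as simple as counting how often each offer succeeded and returning the top of the list. The sketch below ranks by number of completions, which is an assumption: the ranking could equally be based on sales amounts. The function name rank_offers and the toy offer ids are hypothetical.

```python
import pandas as pd


def rank_offers(completed: pd.DataFrame, top_n: int = 3) -> list:
    """Return the most frequently completed offers, best first."""
    # value_counts sorts offers by completion count in descending order.
    counts = completed["offer_id"].value_counts()
    return counts.head(top_n).index.tolist()


# Made-up completions: offer_a three times, offer_b twice, offer_c once.
completed = pd.DataFrame({
    "offer_id": ["offer_a", "offer_b", "offer_a", "offer_c", "offer_a", "offer_b"],
})

print(rank_offers(completed, top_n=2))
```

Since this list does not depend on who the customer is, every new customer gets the same recommendations until they have some history of their own.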

Conclusion:

We answered all of our questions: providing recommendations to new and existing users, finding the best-selling offer, seeing how each population responds to certain offers, and identifying the medium through which they respond. As is often said, there are many ways to approach a problem in data science, and there is no single right or wrong answer.

In this data, promotion through social media platforms got the best response!

This is my second data science blog. I hope you enjoyed it. Thank you for your attention; any advice on what could be improved in future blogs is welcome, as it will really help me improve and learn!

To view the code:

kaushik-42/Data_Science_Projects (github.com)

To connect with me via LinkedIn:

https://www.linkedin.com/in/kaushik-tummalapalli/

I would love to connect :) See you in the next blog. Until then, keep hustling and stay safe!

Next blog will be based on FunkSVD!