
Masters in Business Intelligence and Business Analytics

Market Basket Analysis: Online Radio

UNIVERSITY OF APPLIED SCIENCES


NEU-ULM

Delivered by

Nishant Chaturvedi
Abstract
In this report we perform market basket analysis using association rules and
the concept of lift. We run the analysis on three different grouping variables:
the user community (the original solution in the Ledolter book) and two
demographic variables, sex and country. The results obtained for the three
variables are then compared with each other. Before delving into the detailed
solution, we also review the concepts that are an integral part of market
basket analysis.

Contents
1 Online Radio Recommendation based on User

2 Online Radio Recommendation based on Country

3 Online Radio Recommendation based on User and Gender

1 Online Radio Recommendation
based on User
This is the original solution in the "Ledolter" book. Here, we build a music
recommendation system that recommends new music to users in the community.
The recommendation system uses a support threshold of 5 percent, which means
that only those artist combinations are captured that appear together for at
least 5 percent of the users. Support is calculated as follows:
\[ \mathrm{Support}(A, B) = \frac{n(A \cap B)}{N} \]
where n(A ∩ B) is the number of users who listen to both artist A and artist B,
and N is the total number of users.

A confidence threshold of 50 percent means that only those rules are kept in
which the second artist appears in at least 50 percent of the cases where the
first artist appears. Confidence is calculated as follows:
\[ \mathrm{Confidence}(A \Rightarrow B) = \frac{n(A \cap B)}{n(A)} \]
where n(A) is the number of users who listen to artist A.

Finally, the output is restricted to those artist pairs that have a lift of at
least 5. Lift is often interpreted as how much our confidence that a user
listens to artist B increases once we know that the user listens to artist A.
It is given by the following formula:
\[ \mathrm{Lift}(A \Rightarrow B) = \frac{n(A \cap B) / n(A)}{n(B) / N} \]
where n(B)/N is the fraction of all N users who listen to artist B.
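The three measures above can be checked by hand on a small example. The following is a minimal Python sketch, not the code used for this report; the playlists and the helper names (`count`, `support`, `confidence`, `lift`) are made up for illustration.

```python
# Toy playlists: one set of artists per user (made-up data for illustration).
playlists = [
    {"coldplay", "radiohead"},
    {"coldplay", "radiohead", "muse"},
    {"coldplay"},
    {"muse"},
    {"radiohead", "coldplay"},
]


def count(itemset, baskets):
    """Number of baskets that contain every artist in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets)


def support(itemset, baskets):
    """n(A and B) / N: share of users who listen to all artists in `itemset`."""
    return count(itemset, baskets) / len(baskets)


def confidence(a, b, baskets):
    """n(A and B) / n(A): share of A's listeners who also listen to B."""
    return count({a, b}, baskets) / count({a}, baskets)


def lift(a, b, baskets):
    """Confidence of A => B divided by the overall share of B's listeners."""
    return confidence(a, b, baskets) / support({b}, baskets)


print(f"support    = {support({'coldplay', 'radiohead'}, playlists):.2f}")   # 0.60
print(f"confidence = {confidence('coldplay', 'radiohead', playlists):.2f}")  # 0.75
print(f"lift       = {lift('coldplay', 'radiohead', playlists):.2f}")        # 1.25
```

Here 3 of the 5 users listen to both artists (support 0.60), 3 of the 4 "coldplay" listeners also listen to "radiohead" (confidence 0.75), and "radiohead" reaches 3 of 5 users overall, which gives a lift of 0.75 / 0.60 = 1.25.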

First, we plot the frequency of those artists that are listened to by at least
9 percent of the users. The following output is generated:


Figure 1.1
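The numbers behind such a frequency plot are the per-artist supports, filtered by a minimum value. A minimal Python sketch under stated assumptions: the function name `frequent_artists` and the toy playlists are hypothetical, and the toy threshold of 0.5 stands in for the 0.09 used on the real data.

```python
from collections import Counter

# Toy playlists: one set of artists per user (made-up data for illustration).
playlists = [
    {"artist_x", "artist_y"},
    {"artist_x"},
    {"artist_x", "artist_z"},
    {"artist_y"},
]


def frequent_artists(baskets, min_support):
    """Return {artist: support} for artists reaching `min_support`."""
    n = len(baskets)
    # Each basket is a set, so an artist is counted at most once per user.
    counts = Counter(artist for basket in baskets for artist in basket)
    return {a: c / n for a, c in counts.items() if c / n >= min_support}


# On the real data the threshold would be 0.09; the toy data needs a larger one.
freq = frequent_artists(playlists, min_support=0.5)
for artist, share in sorted(freq.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{artist}: {share:.2f}")
```

The resulting dictionary can be passed to any bar-chart routine to reproduce a plot of this kind.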

Now we build the association rules with support > 0.01 and confidence > 0.50.
Finally, we subset the rules, keeping only those with lift > 5, and obtain the
following associations:
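The filtering steps just described can be sketched in Python as follows. This is a brute-force stand-in restricted to rules with one artist on each side, not the Apriori implementation used to produce the figures; the function name `pair_rules`, the toy playlists, and the toy thresholds are assumptions made for illustration.

```python
from itertools import permutations

# Toy playlists: one set of artists per user (made-up data for illustration).
playlists = [
    {"artist_a", "artist_b"}, {"artist_a", "artist_b"},
    {"artist_c"}, {"artist_c"}, {"artist_c", "artist_d"},
    {"artist_d"}, {"artist_c"}, {"artist_d"},
    {"artist_c", "artist_d"}, {"artist_c"},
]


def pair_rules(baskets, min_support, min_confidence, min_lift):
    """Return rules (lhs, rhs, support, confidence, lift) above all thresholds."""
    n = len(baskets)
    artists = sorted(set().union(*baskets))
    n_single = {a: sum(a in b for b in baskets) for a in artists}
    rules = []
    for lhs, rhs in permutations(artists, 2):
        n_both = sum(lhs in b and rhs in b for b in baskets)
        if n_both == 0:
            continue
        supp = n_both / n
        conf = n_both / n_single[lhs]
        lft = conf / (n_single[rhs] / n)
        # Step 1: support and confidence thresholds; step 2: subset on lift.
        if supp > min_support and conf > min_confidence and lft > min_lift:
            rules.append((lhs, rhs, supp, conf, lft))
    # Strongest associations first.
    return sorted(rules, key=lambda r: r[4], reverse=True)


# The report uses support > 0.01, confidence > 0.50 and lift > 5 on the real
# data; the toy data is too small for those values, so looser ones are used.
rules = pair_rules(playlists, min_support=0.1, min_confidence=0.5, min_lift=2)
for lhs, rhs, supp, conf, lft in rules:
    print(f"{lhs} => {rhs}: support={supp:.2f} confidence={conf:.2f} lift={lft:.2f}")
```

In the toy data only the pair "artist_a" and "artist_b" survives: the two always occur together but are rare overall, which is exactly the pattern a high lift rewards. The pair "artist_c" and "artist_d" is dropped because neither direction exceeds the confidence threshold.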

Figure 1.2

From the above output it can be concluded that there are more than 6 pairs of
artists in which one artist can be recommended to a user who shows interest in
the other.

2 Online Radio Recommendation
based on Country
In this section we create associations between artists based on user and
country. The dataset comprises 15000 users distributed among 159 different
countries. First, we create a frequency plot with support > 0.09 and
confidence > 0.05.

Figure 2.1

Figure 2.1 makes clear that the frequency plot is difficult to interpret with
support > 0.09 and confidence > 0.05.


To make it more interpretable, we raise the support threshold from 0.09 to 0.5,
which results in the figure below.

Figure 2.2

Figure 2.2 shows that, at this higher support threshold based on country, only
6 artists have support > 0.5.
Next we try to create an association map based on country. However, because of
the limited processing capability of my system, I was unable to generate an
association matrix for support > 0.01, so I had to raise the lower support
limit to 0.5. With confidence > 0.5 and lift > 5 we got no output: none of the
artists with support > 0.5 and confidence > 0.5 has an associated artist,
because their lift value is 1, as depicted in figure 2.3.


Figure 2.3
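A lift of exactly 1 has a simple reading. As a sketch, assume that artist B appears in every basket (an assumption made here for illustration; the analysis above reports only the lift value). Write n(X) for the number of baskets containing X and N for the total number of baskets. Then n(B) = N, and every basket that contains A also contains B, so n(A ∩ B) = n(A):

\[
\mathrm{Lift}(A \Rightarrow B) = \frac{n(A \cap B) / n(A)}{n(B) / N} = \frac{n(A) / n(A)}{N / N} = 1
\]

In that case knowing that a user listens to A says nothing new about B, so no rule can pass a lift > 5 filter.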

Hence, it can be concluded that it is difficult to stratify the association
mapping by country, because market basket analysis then creates millions of
networks, which requires a system with higher processing capability.

3 Online Radio Recommendation
based on User and Gender
As with the stratification of the recommendation model by country, I ran into
the same processing issue while creating the association map based on gender.
Since there are only two genders in our dataset, we get millions of connected
networks, which are difficult to process and require more processing power
than a local machine provides.
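One possible explanation for this blow-up can be sketched in Python. The sketch assumes that the baskets are formed by pooling all artists listened to by the users of one gender, which is an assumption and not something stated above; the function name `count_frequent_itemsets` and the toy data are hypothetical.

```python
from itertools import combinations

# Assumed setup: one basket per gender, each holding every artist that any
# user of that gender listens to. With broad catalogues the two baskets
# overlap almost completely; here they are identical (made-up data).
artists = [f"artist_{i}" for i in range(10)]
baskets = [set(artists), set(artists)]


def count_frequent_itemsets(baskets, min_support):
    """Brute-force count of itemsets whose support reaches `min_support`."""
    n = len(baskets)
    items = sorted(set().union(*baskets))
    total = 0
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            wanted = set(itemset)
            if sum(wanted <= basket for basket in baskets) / n >= min_support:
                total += 1
    return total


n_frequent = count_frequent_itemsets(baskets, min_support=0.5)
print(n_frequent)  # 1023, i.e. 2**10 - 1
```

With only two large baskets, every combination of shared artists is frequent, so the number of itemsets doubles with each additional artist. Ten shared artists already give 2^10 - 1 = 1023 itemsets, and a realistic catalogue pushes this far beyond what a local machine can enumerate.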
