Assignment 3 - ADM 3308


ADM3308-Fall 2023: Business Data Mining

Assignment #3 (Group Work)


_____________________________________________________________________

Submission Instructions:
• Submit the assignment to Brightspace® no later than midnight on Dec. 04,
2023.
• Only one submission per group. Please choose a group representative to submit
one copy of the report to Brightspace.

Weight: 10% of the final course mark.


______________________________________________________________________

Statements on Group Contribution and Academic Integrity

When submitting your group work, the submission must include the following two
statements. If these two statements are not included in your submission, your
assignment will not be marked.

(a) Statement of Contributions


In a brief statement, explain the contribution of each group member to the
assignment. Mention the name of the group member and the specific tasks (or
items) accomplished by that group member as their contribution to the assignment.

Reana completed Q1, Q3, Q4, Q7, Q8, and Q10.
Ayoub completed Q2 and Q6.
Gabriel completed Q5 and Q9.

(b) Academic Integrity Statement


Each individual member of the group must read the Academic Integrity Statement,
and type in their full name and student ID. The Academic Integrity Statement must
be e-signed by ALL members of the group UNLESS a group member has not
contributed to the assignment.

NOTE: If the above two statements are not included in your original submission, the
assignment will not be marked and the following deductions will be applied:

-20% if the statements were not submitted with the original submission but were
submitted within 24 hours after a reminder from the TA.

-100% if the statements were not submitted within 24 hours after a reminder from
the TA.


Important Note: Each member of the group must read the following academic
integrity statement, and type in their full name and student ID. The Academic
Integrity Statement must be e-signed by ALL members of the group UNLESS a group
member has not contributed to the assignment. Your assignment will not be marked
if the following academic integrity statement is not submitted.

Statement of Academic Integrity


Group Assignment Checklist & Disclosure

Please read the disclosure below following the completion of your group assignment.
Once all team members have verified these points, hand in this signed disclosure with
your group assignment.
1. All team members acknowledge to have read and understood their responsibilities for
maintaining academic integrity, as defined by the University of Ottawa’s policies and
regulations. Furthermore, all members understand that any violation of academic
integrity may result in strict disciplinary action as outlined in the regulations.
2. If applicable, all team members have referenced and/or footnoted all ideas, words, or
other intellectual property from other sources used in completing this assignment.
3. A proper bibliography is included, which includes acknowledgement of all sources used
to complete this assignment.
4. This is the first time that any member of the group has submitted this assignment or essay
(either partially or entirely) for academic evaluation.
5. No member of the team has utilized unauthorized assistance or aids including but not
limited to outsourcing assignment solutions, and unethical use of online services such as
artificial intelligence tools and course-sharing websites.
6. Each member of the group has read the full content of the submission and is assured that
the content is free of violations of academic integrity. Group discussions regarding the
importance of academic integrity have taken place.
7. All team members have identified their individual contributions to the work submitted
such that if violations of academic integrity are suspected, then the student(s) primarily
responsible for the violations may be identified. Note that the remainder of the team will
also be subject to disciplinary action.
Course Code: ADM3308, Fall 2023
Group Number: Group 8
Assignment # or Title: Assignment #3
Date of Submission: December 4, 2023

Student Full Name Signature

Reana Agil R.A


Ayoub Essadouky A.E

Gabriel Torres Stelluto G.T.S

Assignment #3 (Group Work)


_____________________________________________________________________

Part I- Review Questions

Q1) [8 marks] We learned, in general, about some of the deep learning techniques such as
CNN, RNN, and LSTM. Explain (in one or two paragraphs) your general understanding of
convolutional neural networks (CNN) and mention two practical applications where CNN
is employed.

Convolutional Neural Networks (CNNs) are a class of deep neural networks most
commonly applied to analyzing visual imagery. CNN applications include image and
video understanding, speech recognition, and natural language processing. Compared
with RNNs, which are better suited to processing sequential data and making
predictions over time, CNNs excel at visual analysis. CNNs do have limitations,
however: a basic CNN classifier can tell what type of object appears in an image but
not where it is located, and it processes one image at a time. For example, when
analyzing a photo of birds, a plain CNN classifier can detect that a bird is present, but
if two birds of different species appear in the same visual field it cannot, on its own,
distinguish and localize both of them.

A CNN is essentially built from aggregate predictors over pixels. Rather than learning
a separate weight for each pixel, it groups neighbouring pixels together and applies the
same convolution operation across the whole image. A filter matrix slides across the
image, storing the results and yielding a smaller matrix whose values indicate the
presence or absence of, for example, a vertical line. Similar filters can detect horizontal
lines, curves, and borders, and further convolutions can be applied to these local
features, producing a multi-dimensional matrix of higher-level features. A common
filter size is a 3x3 pixel area, which might respond to a small region of a face such as
the chin. Two practical applications of CNNs are automatic photo organization on
social media platforms and object detection, where a CNN identifies and locates
multiple objects within an image; the latter capability is crucial in scenarios such as
shelf scanning in retail to identify out-of-stock items. (n.d, 2023)
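To make the filter idea concrete, the following minimal NumPy sketch (illustrative only; the image values and filter are made up and are not from any specific CNN library) slides a 3x3 vertical-edge filter across a small grayscale image and produces a smaller feature map whose nonzero entries mark the edges of the vertical line.

import numpy as np

# Tiny 6x6 grayscale "image" with a bright vertical line in column 3 (illustrative data).
image = np.zeros((6, 6))
image[:, 3] = 1.0

# A 3x3 vertical-edge filter: responds where the brightness changes from left to right.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]], dtype=float)

# "Valid" convolution (no padding): apply the same filter to every 3x3 patch.
h = image.shape[0] - kernel.shape[0] + 1
w = image.shape[1] - kernel.shape[1] + 1
feature_map = np.zeros((h, w))
for i in range(h):
    for j in range(w):
        patch = image[i:i + 3, j:j + 3]
        feature_map[i, j] = np.sum(patch * kernel)

print(feature_map)  # positive values on one side of the line, negative on the other

The resulting 4x4 feature map is smaller than the input, and the same nine filter weights are reused at every position, which is exactly the weight-sharing idea described above.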

Q2) [20 marks] Search for an AI tool that you can experiment with, except ChatGPT,
iPhone Siri, Microsoft Cortana, and Google Assistant. You may choose a web-based tool,
or an AI tool that can be installed on your smartphone or your computer. Do some
experiments with the tool and report them. Your explanation of this activity should include
the functionalities of the AI tool (cite the references), and an example experiment you
performed using the tool.

AI Tool: FastPhotoStyle

NVIDIA’s FastPhotoStyle differentiates itself from other AI tools in that it takes a
content image and a style (reference) image and produces a stylized, photorealistic
output image. The model focuses on stylizing and then smoothing the input image so
that it takes on the reference image’s style, with results that can convince the human
eye of their authenticity.

FastPhotoStyle uses deep neural networks for style transfer. During training, the model
learns to separate the content information and the style information in the input images,
and it is trained to minimize the difference between the stylized output and the ground-
truth stylized images in the training set.

The experiment I did:

• Created an output folder and made sure nothing was inside it
• Went to the image folder
• Downloaded content image 1
• Downloaded style image 1
• Resized both images, since the originals are very large (see the sketch below)
• Went back to the root folder
• Ran the photorealistic image stylization code
• Noted that the computational bottleneck of FastPhotoStyle is the propagation step
(the photorealistic smoothing step); the authors report that this step can be made
much faster by using the guided image filtering algorithm as an approximation
• Obtained the stylized, photorealistic output image
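As a minimal sketch of the resizing step only (the file names and target size below are assumptions, not taken from the FastPhotoStyle documentation), the content and style images could be shrunk with Pillow before running the stylization script:

from PIL import Image

# Hypothetical file names; substitute the actual content/style images used in the experiment.
for name in ["content1.png", "style1.png"]:
    img = Image.open(name)
    # Shrink in place so the longest side is at most 768 pixels (an arbitrary size chosen to speed things up).
    img.thumbnail((768, 768))
    img.save(name)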

Q3) [6 marks] (a) Explain three combination functions that can be used in the K-Nearest
Neighbor algorithm to make the final decision. Mention an example where you would
recommend using each of the combination functions? (b) Explain “collaborative
filtering”? Give two examples.

To begin, the KNN algorithm uses a combination function to aggregate the predictions
made by the individual neighbours into a final decision. Here are three common
combination functions and scenarios where each might be recommended.

A) The three main functions are majority voting, weighted voting, and distance-
weighted voting. In majority voting, each neighbour gets an equal vote, and the
final decision is the most frequent class among the k nearest neighbours. Majority
voting is suitable for classification problems where the goal is simply to assign an
instance to a class; a real-life example is a spam email classification task. In
weighted voting, different weights are assigned to the votes of the neighbours based
on their similarity or relevance to the query instance, so that more relevant
neighbours have a greater influence on the final decision. This is beneficial when
the importance of neighbours varies; in a medical diagnosis scenario, where the
attributes of more similar patients are more relevant, weighted voting can improve
the accuracy of predictions by considering the significance of each neighbour's
input. Finally, distance-weighted voting is a specific form of weighted voting in
which each neighbour's weight is an explicit function of its distance from the query
instance (typically the inverse of the distance), so closer neighbours receive higher
weights and the influence of nearby instances is emphasized. It is useful when we
assume that closer neighbours are more likely to provide accurate predictions; in a
movie recommendation system, for example, giving more weight to the ratings of
users who are geographically or behaviourally closer to the target user results in
more relevant suggestions.
B) Collaborative filtering is a technique used in recommendation systems to predict
a user's preferences based on the preferences of other users. It relies on the idea
that users who agreed in the past tend to agree again in the future. Two main
examples are user-based collaborative filtering and item-based collaborative
filtering. In user-based collaborative filtering, recommendations are made by
identifying users with similar preferences: if user A and user B liked similar movies
in the past and user A now likes a new movie, that movie can be recommended to
user B. In item-based collaborative filtering, items are recommended based on their
similarity to items the user has previously liked: if a user has positively rated a set
of books, the system can recommend other books that are similar in content to the
ones the user has read and enjoyed. A small sketch of user-based collaborative
filtering follows below.
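As a small illustration of user-based collaborative filtering (the ratings and user names below are made up for the example), cosine similarity between users' rating vectors can be used to find the most similar user and recommend items that user rated highly:

import numpy as np

# Hypothetical 0-5 ratings for four movies; 0 means "not rated" (illustrative data only).
ratings = {
    "UserA": np.array([5, 4, 4, 1]),
    "UserB": np.array([4, 5, 0, 2]),
    "UserC": np.array([1, 0, 5, 4]),
}

def cosine(u, v):
    # Cosine similarity between two rating vectors (simplified: unrated items treated as 0).
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

target = "UserB"
similarity = {name: cosine(ratings[target], r)
              for name, r in ratings.items() if name != target}
most_similar = max(similarity, key=similarity.get)

# Recommend items the most similar user rated highly but the target has not rated yet.
recs = [i for i, (mine, theirs) in enumerate(zip(ratings[target], ratings[most_similar]))
        if mine == 0 and theirs >= 4]
print(most_similar, recs)  # UserA is most similar; item 2 is recommended to UserB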

Q4) [6 marks] Describe “Single Linkage”, “Complete Linkage”, and “Centroid Distance”?
Then, explain why we may consider each of these distance measures?

Beginning with "Single Linkage": this is an agglomerative hierarchical clustering
approach in which the distance between two clusters is the distance between their
closest pair of data points (one from each cluster). At each step it joins the two clusters
with the shortest single-linkage distance. Because only the closest points matter, it can
produce chained clusters in which some observations are closer to members of a
different cluster than to other members of their own cluster; we may consider it when
we expect elongated or irregularly shaped clusters.

Moving on to "Complete Linkage": this is also an agglomerative hierarchical clustering
approach, but the distance between two clusters is the distance between their farthest
pair of data points (one from each cluster). At each step it joins the two clusters whose
farthest members are closest together, which tends to produce compact clusters of
similar diameter; we may consider it when we want tight, well-separated clusters.

Finally, "Centroid Distance" is a middle ground between single and complete linkage.
The distance between two clusters is the distance between their centroids, where a
cluster's centroid is the mean of its members' attribute values, and the two clusters with
the shortest centroid distance are merged. Because it compares cluster averages rather
than individual extreme points, it is less sensitive to outliers than single or complete
linkage, which is a reason to consider it when the data contain noisy observations.
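A small sketch (with two illustrative clusters, here the same points later used in Q8) that computes the three cluster-distance measures with Manhattan distance, matching the definitions above:

import numpy as np

def manhattan(p, q):
    return float(np.sum(np.abs(np.asarray(p) - np.asarray(q))))

cluster_a = [(1, 1), (2, 2)]   # illustrative points
cluster_b = [(5, 2), (6, 1)]

pairwise = [manhattan(p, q) for p in cluster_a for q in cluster_b]

single_link   = min(pairwise)   # closest pair across the two clusters
complete_link = max(pairwise)   # farthest pair across the two clusters
centroid_dist = manhattan(np.mean(cluster_a, axis=0), np.mean(cluster_b, axis=0))

print(single_link, complete_link, centroid_dist)  # 3.0, 5.0, 4.0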


Q5) [6 marks] (a) Explain the 3 different types of rules that can be generated by association
rule analysis (actionable, trivial, and inexplicable rules), and provide an example of each?
(b) Explain 3 examples where Association Rules can be used in business applications.
− Actionable Rules: Rules that provide high-quality information that is useful for the
business, provided some action is taken. These rules can give the business
important insights.
o Example: A website can change its site design to suggest product X to
people who have added product Y to their cart, because the rule would
be: "If a client buys product Y, they are 50% likely to also buy product X."
− Trivial Rules: Rules that anyone familiar with the business would already know
and that do not offer much new information. They are very straightforward and
simply reflect high-frequency combinations.
o Example: "People who buy high-value graphics cards (GPUs) are likely to
own a high-value computer." This is obvious and does not give any very
useful information.
− Inexplicable Rules: Rules that are hard to interpret because they do not show a
clear association.
o Example: "People who spend more than 2 minutes on the FAQ page are 30%
less likely to leave the website without a purchase." This has no clear
explanation, but it can be investigated further to better understand customers
and improve the business.

Part II- Problems


[10 marks for each problem]

Q6) We would like to predict David’s response to a marketing campaign using the
following labeled data. Predict David’s response using K-Nearest Neighbor algorithm,
with K=3, and Manhattan distance, also, considering the voting weight of each customer
as indicated in the dataset below.
Remember to first normalize each feature between [0, 1].
Show the steps of your calculations.


Customer   Age   Income (K)   No. of credit cards   Response to promotion, voting weight
John        35       35                3             No,  weight = 2
Maryam      22       50                2             Yes, weight = 2
Rachel      63      200                1             No,  weight = 1
Ahmad       59      170                1             No,  weight = 1
Hannah      25       40                4             Yes, weight = 1
David       37       50                2             ?

Normalizing each feature to [0, 1] with min-max scaling (Age: min 22, max 63;
Income: min 35, max 200; Credit cards: min 1, max 4) gives:

Customer   Age     Income   Cards
John       0.317   0.000    0.667
Maryam     0.000   0.091    0.333
Rachel     1.000   1.000    0.000
Ahmad      0.902   0.818    0.000
Hannah     0.073   0.030    1.000
David      0.366   0.091    0.333

Manhattan distances from David: Maryam = 0.37, John = 0.47, Hannah = 1.02,
Ahmad = 1.60, Rachel = 1.88.

The K = 3 nearest neighbours are therefore Maryam (Yes, voting weight 2), John (No,
voting weight 2), and Hannah (Yes, voting weight 1). Weighted vote: YES = 2 + 1 = 3,
NO = 2. Since the total weight of the YES responses is greater than the total weight of
the NO responses, David's predicted response is YES.
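For verification, a short sketch in plain Python that reproduces the steps above: min-max normalization, Manhattan distances from David, and a weighted vote over the K = 3 nearest neighbours.

# Labeled customers: (age, income in $K, number of credit cards, response, voting weight)
data = {
    "John":   (35, 35, 3, "No", 2),
    "Maryam": (22, 50, 2, "Yes", 2),
    "Rachel": (63, 200, 1, "No", 1),
    "Ahmad":  (59, 170, 1, "No", 1),
    "Hannah": (25, 40, 4, "Yes", 1),
}
david = (37, 50, 2)

# Min-max bounds per feature, taken over all customers including David.
features = [v[:3] for v in data.values()] + [david]
lo = [min(col) for col in zip(*features)]
hi = [max(col) for col in zip(*features)]

def normalize(row):
    return [(x - l) / (h - l) for x, l, h in zip(row, lo, hi)]

d_norm = normalize(david)
distances = {name: sum(abs(a - b) for a, b in zip(normalize(v[:3]), d_norm))
             for name, v in data.items()}

# Weighted vote over the K = 3 nearest neighbours.
votes = {}
for name in sorted(distances, key=distances.get)[:3]:
    response, weight = data[name][3], data[name][4]
    votes[response] = votes.get(response, 0) + weight

print(distances)
print(votes, "->", max(votes, key=votes.get))  # {'Yes': 3, 'No': 2} -> Yes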

Q7) Consider the following data set

D={ (2,0), (2,1), (1,2), (3,2), (2,3), (3,3), (2,4), (3,4), (4,4), (3,5) }


Apply K-means clustering with k=2, using Manhattan distance. Start with (2,0) and
(3,5) as the initial centroids (seeds). Show all intermediate clusters, as well as the
final clusters.

Given data points: {A1(2,0), A2(2,1), B1(1,2), B2(3,2), C1(2,3), C2(3,3), D1(2,4),
D2(3,4), E1(4,4), E2(3,5)}

Given centroids (seeds): Cen 1 = (2,0) and Cen 2 = (3,5)

Step 1: Compute each point's Manhattan distance to the two centroids and assign it to
the nearer one.

Table of intermediate clusters:

Point      Distance to Cen 1 (2,0)   Distance to Cen 2 (3,5)   Assigned to
A1 (2,0)   |2-2| + |0-0| = 0         |2-3| + |0-5| = 6         Cen 1
A2 (2,1)   1                         5                         Cen 1
B1 (1,2)   3                         5                         Cen 1
B2 (3,2)   3                         3                         Cen 1 (tie)
C1 (2,3)   3                         3                         Cen 1 (tie)
C2 (3,3)   4                         2                         Cen 2
D1 (2,4)   4                         2                         Cen 2
D2 (3,4)   5                         1                         Cen 2
E1 (4,4)   6                         2                         Cen 2
E2 (3,5)   6                         0                         Cen 2

Step 2: Re-compute each centroid as the mean of the points assigned to it.

Cen 1 = mean of {A1, A2, B1, B2, C1}
      = ((2+2+1+3+2)/5, (0+1+2+2+3)/5)
      = (10/5, 8/5)
      = (2, 1.6)

Cen 2 = mean of {C2, D1, D2, E1, E2}
      = ((3+2+3+4+3)/5, (3+4+4+4+5)/5)
      = (15/5, 20/5)
      = (3, 4)

Therefore, the two new centroids are Cen 1 = (2, 1.6) and Cen 2 = (3, 4).

Step 3: Re-assign each point using the new centroids. Every point stays in its current
cluster (for example, C1 (2,3) is at distance |2-2| + |3-1.6| = 1.4 from Cen 1 and
|2-3| + |3-4| = 2 from Cen 2, so it remains with Cen 1). Since no assignment changes,
the centroids no longer move and the algorithm has converged.

Final clusters:
Cluster 1 = {(2,0), (2,1), (1,2), (3,2), (2,3)} with centroid (2, 1.6)
Cluster 2 = {(3,3), (2,4), (3,4), (4,4), (3,5)} with centroid (3, 4)
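For verification, a compact sketch in plain Python that runs k-means with Manhattan distance on this dataset, starting from the same seeds, until the assignments stop changing:

points = [(2,0), (2,1), (1,2), (3,2), (2,3), (3,3), (2,4), (3,4), (4,4), (3,5)]
centroids = [(2, 0), (3, 5)]  # initial seeds, as specified in the question

def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

assignments = None
while True:
    # Assign each point to the nearest centroid (ties go to the first centroid).
    new_assignments = [min(range(2), key=lambda k: manhattan(p, centroids[k])) for p in points]
    if new_assignments == assignments:
        break  # no point changed cluster: converged
    assignments = new_assignments
    # Recompute each centroid as the mean of its assigned points.
    for k in range(2):
        members = [p for p, a in zip(points, assignments) if a == k]
        centroids[k] = (sum(x for x, _ in members) / len(members),
                        sum(y for _, y in members) / len(members))

print(centroids)    # [(2.0, 1.6), (3.0, 4.0)]
print(assignments)  # [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]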

Q8) Consider the following data set

Data = { <1,1> , <2,2>, <5, 2>, <6,1> }

(a) Cluster the data using agglomerative clustering technique with single linkage.
Show the similarity (distance) matrix at each step. Use Manhattan distance
function.

          A (1,1)   B (2,2)   C (5,2)   D (6,1)
A (1,1)      0         2         5         5
B (2,2)      2         0         3         5
C (5,2)      5         3         0         2
D (6,1)      5         5         2         0

The smallest distance is 2, so merge clusters A and B into M = {A, B}. (C and D are
also at distance 2; they are merged in the next step.)

          M    C    D
M         0    3    5
C         3    0    2
D         5    2    0

The smallest distance is again 2, so merge clusters C and D into N = {C, D}.

          M    N
M         0    3
N         3    0

The final merge joins M and N at single-linkage distance 3, giving one cluster that
contains all four points.

(b) Cluster the data using agglomerative clustering technique with complete linkage.
Show the similarity (distance) matrix at each step. Use Manhattan distance
function.

The pairwise Manhattan distances between the individual points are the same as in
part (a):

          A (1,1)   B (2,2)   C (5,2)   D (6,1)
A (1,1)      0         2         5         5
B (2,2)      2         0         3         5
C (5,2)      5         3         0         2
D (6,1)      5         5         2         0

The smallest distance is 2, so merge A and B into AB. Under complete linkage, the
distance from AB to another cluster is the distance between their farthest members:
d(AB, C) = max(5, 3) = 5 and d(AB, D) = max(5, 5) = 5.

          AB    C    D
AB         0    5    5
C          5    0    2
D          5    2    0

The smallest distance is 2, so merge C and D into CD, with
d(AB, CD) = max(d(A,C), d(A,D), d(B,C), d(B,D)) = max(5, 5, 3, 5) = 5.

          AB   CD
AB         0    5
CD         5    0

The final merge joins AB and CD at complete-linkage distance 5, producing a single
cluster {(1,1), (2,2), (5,2), (6,1)}. The same pairs merge as with single linkage, but the
final merge occurs at distance 5 instead of 3.

(c) Draw dendrogram using single linkage. Use Manhattan distance function.

Distance
   3      +-----------------------+
          |                       |
   2   +--+--+                 +--+--+
       |     |                 |     |
     (1,1) (2,2)             (5,2) (6,1)

Reading the dendrogram from the bottom up: (1,1) and (2,2) merge at distance 2,
(5,2) and (6,1) also merge at distance 2, and the two resulting clusters merge at
single-linkage distance 3, at which point all points belong to a single cluster. The
numbers on the left give the Manhattan (single-linkage) distance at which each merge
occurs.
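If one wanted to check these results programmatically, SciPy's hierarchical clustering utilities accept Manhattan ("cityblock") distances directly; a brief sketch (not part of the original hand calculation):

import numpy as np
from scipy.spatial.distance import pdist
from scipy.cluster.hierarchy import linkage

X = np.array([[1, 1], [2, 2], [5, 2], [6, 1]])
d = pdist(X, metric="cityblock")      # condensed pairwise Manhattan distances

# Each row of the linkage matrix is: cluster i, cluster j, merge distance, new cluster size.
print(linkage(d, method="single"))    # last merge at distance 3
print(linkage(d, method="complete"))  # last merge at distance 5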

Q9) Consider the following data set


Data = {1, 2, 4, 5, 7, 8, 9}


Assume that we applied two different approaches to cluster this dataset. In the first
approach, we clustered the dataset into 3 clusters as follows:

C1={1, 2}, C2={4, 5}, C3={7, 8, 9}

In the second approach, we clustered the same dataset into 2 clusters as follows:

C1={1, 2, 4, 5}, C2={7, 8, 9}

Calculate the Dunn index for each approach.


Based on your calculated Dunn index, which approach generated better clusters?

Using the Dunn index defined as the minimum distance between points in different
clusters divided by the maximum within-cluster diameter:

First approach: the smallest inter-cluster distance is 2 (e.g., between 2 in C1 and 4 in
C2), and the largest cluster diameter is 2 (C3 = {7, 8, 9}).
Dunn index = 2 / 2 = 1

Second approach: the smallest inter-cluster distance is 2 (between 5 in C1 and 7 in C2),
and the largest cluster diameter is 4 (C1 = {1, 2, 4, 5}).
Dunn index = 2 / 4 = 0.5

The first approach generated better clusters because it has the higher Dunn index.
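A small sketch that computes the Dunn index under the definition used above (minimum single-linkage separation divided by maximum cluster diameter), so the two clusterings can be compared directly:

from itertools import combinations

def dunn(clusters):
    # Separation: smallest distance between points belonging to different clusters.
    separation = min(abs(a - b)
                     for c1, c2 in combinations(clusters, 2)
                     for a in c1 for b in c2)
    # Diameter: largest distance between two points inside the same cluster.
    diameter = max(abs(a - b) for c in clusters for a in c for b in c)
    return separation / diameter

print(dunn([[1, 2], [4, 5], [7, 8, 9]]))   # 1.0  (first approach)
print(dunn([[1, 2, 4, 5], [7, 8, 9]]))     # 0.5  (second approach)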

Q10) A coffee shop’s transaction records of 200 individual sales shows that 50 customers
ordered only coffee, 50 customers ordered coffee and donut, 30 customers ordered coffee
and chicken wrap, 20 customers ordered only donut, 20 customers ordered only chicken
wrap, and 30 customers ordered donut and chicken wrap (but not coffee).

a) Derive the co-occurrence matrix based on the transactions described above.


First, the item totals: Coffee appears in 50 + 50 + 30 = 130 transactions, Donut in
50 + 20 + 30 = 100, and Chicken Wrap in 30 + 20 + 30 = 80. The joint counts are
Coffee & Donut = 50, Coffee & Chicken Wrap = 30, and Donut & Chicken Wrap = 30.
The co-occurrence matrix (diagonal = number of transactions containing the item,
off-diagonal = number of transactions containing both items) is:

                 Coffee   Donut   Chicken Wrap
Coffee             130      50         30
Donut               50     100         30
Chicken Wrap        30      30         80

b) Calculate support, confidence, and lift values for the following two rules:

From the co-occurrence matrix, we can now calculate the support, confidence, and lift
values for the two rules. The calculations are:

(1) Customers who order coffee most likely order a donut as well (Coffee → Donut):

Support = P(Coffee and Donut) = 50/200 = 0.25
Confidence = 50/130 ≈ 0.385
Lift = Confidence / P(Donut) = (50/130) / (100/200) ≈ 0.77

(2) Customers who order a chicken wrap most likely order coffee as well
(Chicken Wrap → Coffee):

Support = P(Chicken Wrap and Coffee) = 30/200 = 0.15
Confidence = 30/80 = 0.375
Lift = Confidence / P(Coffee) = (30/80) / (130/200) ≈ 0.58
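For verification, a short sketch that recomputes the three measures for both rules directly from the transaction counts given in the question (item names abbreviated):

# Transaction counts from the question (total = 200).
counts = {
    frozenset({"coffee"}): 50,
    frozenset({"coffee", "donut"}): 50,
    frozenset({"coffee", "wrap"}): 30,
    frozenset({"donut"}): 20,
    frozenset({"wrap"}): 20,
    frozenset({"donut", "wrap"}): 30,
}
total = sum(counts.values())

def freq(items):
    # Number of transactions containing all the given items.
    return sum(n for basket, n in counts.items() if items <= basket)

def rule_metrics(antecedent, consequent):
    both = freq(antecedent | consequent)
    support = both / total
    confidence = both / freq(antecedent)
    lift = confidence / (freq(consequent) / total)
    return support, confidence, lift

print(rule_metrics({"coffee"}, {"donut"}))  # (0.25, 0.385..., 0.769...)
print(rule_metrics({"wrap"}, {"coffee"}))   # (0.15, 0.375, 0.576...)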


c) Which one of the two rules is a strong rule? Why?

From the calculations above, we can identify Rule #1 as the stronger rule: its support
(0.25 vs. 0.15), confidence (0.385 vs. 0.375), and lift (0.77 vs. 0.58) are all higher than
the corresponding values for Rule #2. (Both lifts are below 1, so neither rule indicates
an association stronger than chance, but Rule #1 is the stronger of the two.)
