
JINNAH UNIVERSITY FOR WOMEN

Department of Computer Science & Software Engineering


Data Mining
Assignment # 2

Submission Date: 20/5/2019

Q 1 . The algorithm that we used to do association rule mining is the Apriori algorithm. This
algorithm is efficient because it relies on and exploits the Apriori property. What is the
Apriori property?

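Answer: The Apriori property states that every non-empty subset of a frequent itemset must itself be
frequent. Equivalently, the support of an itemset is never greater than the support of any of its subsets,
so once an itemset is found to be infrequent, all of its supersets can be pruned without being counted.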
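
The property can be checked on a small example. The sketch below is only an illustration: the transactions are invented for this purpose, and `support` and `subsets_at_least_as_frequent` are hypothetical helper names.

```python
from itertools import combinations

# Hypothetical transactions, invented purely for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def subsets_at_least_as_frequent(itemset, transactions):
    """True if every non-empty proper subset has support >= the itemset's support."""
    s = support(itemset, transactions)
    return all(
        support(sub, transactions) >= s
        for k in range(1, len(itemset))
        for sub in combinations(itemset, k)
    )

full = ("bread", "butter", "milk")
print(support(full, transactions))
print(subsets_at_least_as_frequent(full, transactions))
```

With a minimum support of 40%, the three-item set here is just frequent, and so every one of its subsets is frequent as well.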

Q 2 . Consider the following transaction database. Apply the Apriori algorithm to discover strong
association rules. Assume min support = 40%.

Answer:

Q 3 . Discuss the basic difference between the agglomerative and divisive hierarchical clustering
algorithms and mention which type of hierarchical clustering algorithm is more commonly
used.
Answer:
AGGLOMERATIVE                                   | DIVISIVE
Starts with single-instance clusters.           | Starts with one universal cluster.
At each step, joins the two closest clusters.   | At each step, splits a cluster into two.
Design decision: distance between clusters.     | Proceeds recursively on each subset.
Stops when all clusters are combined into a     | Can be very fast.
single cluster.                                 |
Uses a dendrogram.                              | Also uses a dendrogram.
Agglomerative clustering is the more commonly used of the two.
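
As a rough illustration of the agglomerative (bottom-up) procedure, the following Python sketch merges one-dimensional points using single-linkage distance. The data points, the choice of single linkage, and the function names are assumptions made for the example, not part of the assignment.

```python
def single_linkage(c1, c2):
    """Distance between two clusters = distance between their closest members."""
    return min(abs(a - b) for a in c1 for b in c2)

def agglomerative(points):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [[p] for p in points]  # start with single-instance clusters
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: single_linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        merges.append((clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges

# Hypothetical 1-D data points.
merges = agglomerative([1.0, 2.0, 6.0, 7.0, 15.0])
for left, right in merges:
    print(left, "+", right)
```

The merge history, read from first to last, is the information a dendrogram displays from the leaves up to the root.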
Q 4 . Consider the unit given below. What will be the output value y of the unit for the inputs and
weights given below?

Answer:
Net input = ∑(x·w)
= 1(+1) + 2(+1) + 5(−1) + 8(+2)
= 1 + 2 − 5 + 16
= 14
Applying the unit's threshold function to the net input of 14 gives y = 1.
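
The arithmetic above can be checked with a short Python sketch. The inputs and weights are read off the working shown; the step threshold of 0 is an assumption, since the diagram of the unit is not reproduced here.

```python
# Inputs and weights taken from the working: 1(+1) + 2(+1) + 5(-1) + 8(+2)
inputs = [1, 2, 5, 8]
weights = [1, 1, -1, 2]

def weighted_sum(xs, ws):
    """Net input of the unit: sum of input * weight."""
    return sum(x * w for x, w in zip(xs, ws))

def step(net, threshold=0):
    """Threshold activation; a threshold of 0 is assumed for illustration."""
    return 1 if net >= threshold else 0

net = weighted_sum(inputs, weights)
y = step(net)
print(net, y)
```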
Q 5 . Consider the following fuzzy expert system for weather forecast:

The following two plots represent the membership functions of two fuzzy
variables describing the position of the arrow of barometer (left) and the direction of its
movement (right):

The air pressure is measured in millibars, and the speed of its change in millibars per hour.
Answer the following questions:

a) How much is the arrow Down, Up or in the Middle if it indicates that the pressure is
1020 millibars? Use membership functions on the graphs.
Answer:

b) How much is the arrow moving Down or Up if the pressure changes -2 millibars every
hour?
Answer:

Q 6 . What are the operators of genetic algorithm?


Answer:
• Reproduction (Selection)
• Encoding
• Crossover (Recombination)
• Mutation
• Accepting
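
A minimal sketch of two of these operators, crossover and mutation, on string-encoded chromosomes. The parent strings, the crossover point, and the function names are hypothetical; a real genetic algorithm would normally choose the crossover point and mutation position at random.

```python
def one_point_crossover(p1, p2, point):
    """Swap the tails of two parents after the given cut point."""
    return p1[:point] + p2[point:], p2[:point] + p1[point:]

def mutate(chromosome, position, new_gene):
    """Replace the gene at one position with a new value."""
    genes = list(chromosome)
    genes[position] = new_gene
    return "".join(genes)

# Hypothetical binary-encoded parents.
parent_a, parent_b = "11110000", "00001111"
child_a, child_b = one_point_crossover(parent_a, parent_b, 3)
mutant = mutate(child_a, 0, "0")
print(child_a, child_b, mutant)
```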
Q 7 . Suppose a genetic algorithm uses chromosomes of the form x = abcdefgh with a fixed
length of eight genes. Each gene can be any digit between 0 and 9. Let the fitness of
individual x be calculated as:

f(x) = (a + b) − (c + d) + (e + f) − (g + h)
and let the initial population consist of four individuals with the following
chromosomes:
x1 = 6 5 4 1 3 5 3 2
x2 = 8 7 1 2 6 6 0 1
x3 = 2 3 9 2 1 2 8 5
x4 = 4 1 8 5 2 0 9 4
Evaluate the fitness of each individual, showing all your working and arrange them in order with
the fittest first and the least fit last.
Answer:
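S.No | Initial Population | Fitness f(x) = (a + b) − (c + d) + (e + f) − (g + h)
x1   | 65413532           | (6+5) − (4+1) + (3+5) − (3+2) = 9
x2   | 87126601           | (8+7) − (1+2) + (6+6) − (0+1) = 23
x3   | 23921285           | (2+3) − (9+2) + (1+2) − (8+5) = −16
x4   | 41852094           | (4+1) − (8+5) + (2+0) − (9+4) = −19
Arrangement according to fitness (fittest first): x2, x1, x3, x4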
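
The fitness values and the ranking can be reproduced with a short Python sketch. It reads the fitness function as sums of gene pairs, (a + b) − (c + d) + (e + f) − (g + h), which is the reading that matches the tabulated values; the variable names are illustrative.

```python
# Initial population from the question, one digit per gene.
population = {
    "x1": "65413532",
    "x2": "87126601",
    "x3": "23921285",
    "x4": "41852094",
}

def fitness(chromosome):
    """f(x) = (a + b) - (c + d) + (e + f) - (g + h) over the eight digit genes."""
    a, b, c, d, e, f, g, h = (int(ch) for ch in chromosome)
    return (a + b) - (c + d) + (e + f) - (g + h)

scores = {name: fitness(ch) for name, ch in population.items()}
ranking = sorted(scores, key=scores.get, reverse=True)  # fittest first
print(scores)
print(ranking)
```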

Q8. Every data structure in a data warehouse contains a time element. Explain why.
Answer:
Every data structure in a data warehouse contains a time element because a warehouse exists to support
analysis over time. It has to hold historical data, not just current values, so each record must be tied
to the point in time it describes.

Q9. Suppose that a data warehouse consists of the four dimensions, date, spectator, location,
and game, and the two measures, count and charge, where charge is the fare that a spectator
pays when watching a game on a given date. Spectators may be students, adults, or seniors,
with each category having its own charge rate.
a) Draw a star schema diagram for the data warehouse.
Answer:
b) Starting with the base cuboid [date; spectator; location; game], what specific OLAP
operations should one perform in order to list the total charge paid by student spectators at
GM Place in 2004?
Answer:

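Roll up on date to the year level.

Roll up on game to all, since the game dimension is not part of the query.

Roll up on location to the location-name level and on spectator to the category (status) level.

Then dice on spectator category = 'student', location = 'GM Place', and year = 2004, and read off the total charge.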
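
The query asks for the total charge restricted to student spectators, GM Place, and the year 2004. A rough Python sketch of that selection and aggregation over rolled-up fact rows follows; the rows, their charge values, and the `total_charge` helper are invented for illustration and are not data from the assignment.

```python
# Hypothetical fact rows, already rolled up to year / category / location name.
facts = [
    {"year": 2004, "status": "student", "location": "GM Place", "charge": 20.0},
    {"year": 2004, "status": "student", "location": "GM Place", "charge": 15.0},
    {"year": 2004, "status": "adult",   "location": "GM Place", "charge": 40.0},
    {"year": 2004, "status": "student", "location": "Other",    "charge": 20.0},
    {"year": 2003, "status": "student", "location": "GM Place", "charge": 18.0},
]

def total_charge(rows, **conditions):
    """Sum the charge measure over rows matching every dimension condition."""
    return sum(
        r["charge"]
        for r in rows
        if all(r[dim] == value for dim, value in conditions.items())
    )

total = total_charge(facts, year=2004, status="student", location="GM Place")
print(total)
```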
