
Hassan Ahmad

mcs2100440

MSC IT 4th

Assignment # 1
Subject
Data Mining
Submitted
To
Sir, Ghulam Jillani
Q.No:1
How can we mine multilevel and multidimensional associations? Explain both with suitable examples.

Multilevel Association Rules:

Association rules generated by mining data at multiple levels of abstraction are called multiple-level or multilevel association rules. Multilevel association rules can be mined efficiently using concept hierarchies under a support-confidence framework. Rules at a high concept level tend to capture common-sense knowledge, while rules at a low concept level may not always be useful.

Using uniform minimum support for all levels:

• When a uniform minimum support threshold is used, the search procedure is simplified.
• The method is also simple, in that users are required to specify only a single minimum support threshold.
• The same minimum support threshold is used when mining at each level of abstraction (for example, when mining from "computer" down to "laptop computer"). Both "computer" and "laptop computer" may be found frequent, while "desktop computer" is not.

Need for Multilevel Rules:

• Sometimes, at a low level of abstraction, the data does not show any significant pattern, yet useful information may be hiding behind it.
• The aim is to find the hidden information within and between levels of abstraction.

1. Uniform Support (using a uniform minimum support for all levels)
2. Reduced Support (using a reduced minimum support at lower levels)
3. Group-based Support (using item- or group-based support)
1. Uniform Support –
When a uniform minimum support threshold is used, the search procedure is simplified. The method is also simple in that users are required to specify only a single minimum support threshold. An optimization technique can be adopted, based on the knowledge that an ancestor is a superset of its descendants: the search avoids examining itemsets containing any item whose ancestors do not have minimum support. The uniform support approach, however, has some difficulties. It is unlikely that items at lower levels of abstraction will occur as frequently as those at higher levels of abstraction. If the minimum support threshold is set too high, the search could miss several meaningful associations occurring at low abstraction levels. This provides the motivation for the following approach.
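As a minimal sketch of the uniform-support idea (the concept hierarchy, transactions, and min_sup value below are illustrative, not taken from the text), each item in a transaction can be generalized to all of its ancestors, after which a single threshold is applied at every level:

```python
from collections import Counter

# Toy concept hierarchy: item -> parent concept (illustrative names).
hierarchy = {
    "laptop computer": "computer",
    "desktop computer": "computer",
}

transactions = [
    {"laptop computer"}, {"laptop computer"}, {"desktop computer"},
    {"laptop computer"}, {"desktop computer"}, {"laptop computer"},
]

def ancestors(item):
    """Yield the item itself and every ancestor in the hierarchy."""
    while item is not None:
        yield item
        item = hierarchy.get(item)

# Count the support of every concept, at every level of abstraction.
counts = Counter()
for t in transactions:
    counts.update({c for item in t for c in ancestors(item)})

min_sup = 0.5  # one uniform threshold for all levels
n = len(transactions)
frequent = {c for c, k in counts.items() if k / n >= min_sup}
print(frequent)  # "computer" and "laptop computer" pass; "desktop computer" does not
```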
2. Reduced Support –
For mining multilevel associations with reduced support, there are several alternative search strategies, as follows (a sketch of the second strategy appears after this list).
• Level-by-level independent –
This is a full-breadth search, where no background knowledge of frequent itemsets is used for pruning. Each node is examined, regardless of whether or not its parent node is found to be frequent.
• Level-cross filtering by single item –
An item at level i is examined if and only if its parent node at level (i−1) is frequent. In other words, we investigate a more specific association starting from a more general one. If a node is frequent, its children will be examined; otherwise, its descendants are pruned from the search.
• Level-cross filtering by k-itemset –
A k-itemset at level i is examined if and only if its corresponding parent k-itemset at level (i−1) is frequent.
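A minimal sketch of level-cross filtering by single item, combined with reduced (per-level) support thresholds. The hierarchy, support values, and thresholds are all made up for illustration; a real miner would compute support by scanning the transaction table:

```python
# children maps each concept to its more specific concepts one level down.
children = {
    "computer": ["laptop computer", "desktop computer"],
    "laptop computer": ["gaming laptop", "ultrabook"],
    "desktop computer": ["tower pc", "all-in-one"],
}

def support(concept):
    # Placeholder supports; a real implementation would scan the database.
    toy = {"computer": 0.60, "laptop computer": 0.40, "desktop computer": 0.10,
           "gaming laptop": 0.25, "ultrabook": 0.15}
    return toy.get(concept, 0.0)

def mine(root, min_sup_per_level):
    """Examine a node only if its parent was frequent; prune otherwise."""
    frequent, level = [], [root]
    for min_sup in min_sup_per_level:  # reduced support: one threshold per level
        next_level = []
        for node in level:
            if support(node) >= min_sup:              # node is frequent
                frequent.append(node)
                next_level += children.get(node, [])  # so examine its children
            # else: the whole subtree below `node` is pruned from the search
        level = next_level
    return frequent

print(mine("computer", [0.5, 0.3, 0.2]))
# ['computer', 'laptop computer', 'gaming laptop'];
# everything below "desktop computer" was never examined.
```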
3. Group-based Support –
The group-wise threshold values for support and confidence are input by the user or an expert. Groups are selected based on product price or item category, because the expert often has insight into which groups are more important than others.
Example –
Experts may be interested in the purchase patterns of laptops or clothes in the electronics and non-electronics categories. Therefore, a low support threshold is set for these groups to give attention to these items' purchase patterns.
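A minimal sketch of group-based support (the group assignments and thresholds below are illustrative): each item inherits the threshold of the group it belongs to, so an expert can make important groups easier to report.

```python
# Expert-supplied groups and per-group minimum supports (illustrative values).
group_of = {"laptop": "electronics", "camera": "electronics", "milk": "grocery"}
min_sup_for = {"electronics": 0.01, "grocery": 0.05}

def is_frequent(item, sup):
    """Apply the minimum support of the group the item belongs to."""
    return sup >= min_sup_for[group_of[item]]

print(is_frequent("laptop", 0.02))  # True: electronics has a low 1% threshold
print(is_frequent("milk", 0.02))    # False: grocery requires 5% support
```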

Multidimensional Association Rules:
In multidimensional association rules, attributes can be categorical or quantitative.
• Quantitative attributes are numeric and incorporate an implicit ordering.
• Numeric attributes must be discretized.
• A multidimensional association rule consists of more than one dimension.
• Example – buys(X, "IBM Laptop Computer") → buys(X, "HP Inkjet Printer")

Approaches to mining multidimensional association rules:

The three approaches to mining multidimensional association rules are as follows.

1. Using static discretization of quantitative attributes:

• Discretization is static and occurs prior to mining.
• Discretized attributes are treated as categorical.
• Use the Apriori algorithm to find all k-frequent predicate sets (this requires k or k+1 table scans). Every subset of a frequent predicate set must also be frequent.

Example –
If, in a data cube, the 3-D cuboid (age, income, buys) is frequent, this implies that (age, income), (age, buys), and (income, buys) are also frequent.
Note –
Data cubes are well suited to mining, since they make mining faster. The cells of an n-dimensional data cuboid correspond to the predicate sets.
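A minimal sketch of static discretization (the bin boundaries and the record below are illustrative): numeric attributes are mapped to fixed intervals before mining, so each record becomes a set of (dimension, value) predicates that an Apriori-style miner can treat as categorical.

```python
# Interval boundaries are chosen up front, before mining (illustrative bins).
age_bins = [(18, 25), (26, 35), (36, 50)]
income_bins = [(0, 30_000), (30_001, 60_000)]

def discretize(value, bins):
    """Map a numeric value to the label of the interval containing it."""
    for lo, hi in bins:
        if lo <= value <= hi:
            return f"{lo}..{hi}"
    return "other"

record = {"age": 23, "income": 41_000, "buys": "laptop"}
predicates = {
    "age": discretize(record["age"], age_bins),
    "income": discretize(record["income"], income_bins),
    "buys": record["buys"],  # already categorical
}
print(predicates)  # {'age': '18..25', 'income': '30001..60000', 'buys': 'laptop'}
```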

2. Using dynamic discretization of quantitative attributes:

• Known as mining quantitative association rules.
• Numeric attributes are dynamically discretized.

Example –
age(X, "20..25") ∧ income(X, "30K..41K") → buys(X, "Laptop Computer")
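As a minimal sketch, the support and confidence of this example rule can be measured directly against the raw numeric records (the records below are made-up illustration data):

```python
# Measuring age(X, "20..25") ∧ income(X, "30K..41K") → buys(X, "laptop computer").
records = [
    {"age": 22, "income": 35_000, "buys": "laptop computer"},
    {"age": 24, "income": 40_000, "buys": "laptop computer"},
    {"age": 23, "income": 32_000, "buys": "printer"},
    {"age": 40, "income": 90_000, "buys": "laptop computer"},
]

def antecedent(r):
    return 20 <= r["age"] <= 25 and 30_000 <= r["income"] <= 41_000

def consequent(r):
    return r["buys"] == "laptop computer"

both = sum(1 for r in records if antecedent(r) and consequent(r))
ante = sum(1 for r in records if antecedent(r))
print("support    =", both / len(records))  # 2/4 = 0.5
print("confidence =", both / ante)          # 2/3 ≈ 0.67
```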
3. Using distance-based discretization with clustering:

This is a dynamic discretization process that considers the distance between data points. It involves a two-step mining process, as follows (a sketch of the interval-finding step appears after this list).
• Perform clustering to find the intervals involved.
• Obtain association rules by searching for groups of clusters that occur together.
The resulting rules may satisfy the following:
• Clusters in the rule antecedent are strongly associated with clusters in the consequent.
• Clusters in the antecedent occur together.
• Clusters in the consequent occur together.
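A minimal sketch of the interval-finding step (the gap threshold and data are illustrative, and real systems use proper clustering algorithms rather than this simple gap rule): neighboring values are grouped into one cluster unless the gap between them exceeds a threshold, and each cluster becomes an interval.

```python
def cluster_intervals(values, max_gap):
    """Group sorted 1-D values into clusters split at large gaps;
    return each cluster as a (min, max) interval."""
    values = sorted(values)
    clusters, current = [], [values[0]]
    for v in values[1:]:
        if v - current[-1] <= max_gap:
            current.append(v)          # close enough: same cluster
        else:
            clusters.append(current)   # gap too large: start a new cluster
            current = [v]
    clusters.append(current)
    return [(c[0], c[-1]) for c in clusters]

ages = [21, 22, 23, 35, 36, 37, 60, 61]
print(cluster_intervals(ages, max_gap=5))  # [(21, 23), (35, 37), (60, 61)]
```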

Quantitative Associations:

Quantitative association rules are a special type of association rule of the form X → Y, where X and Y consist of sets of numerical and/or categorical attributes. Unlike general association rules, where both the left-hand and right-hand sides of the rule must be categorical (nominal or discrete) attributes, at least one attribute of a quantitative association rule (left or right) must involve a numerical attribute. Such rules fall into two classes, depending on whether they are measured by the frequency of the supporting data records or by distributional features of some numerical attributes.

Negative Correlations:

Two variables with a negative correlation have an inverse relationship, which means one increases as the other decreases. Think of school absences, for example: the higher the number of absences, the lower a student's grades will be. Although negative correlation is a common part of psychological and statistical analysis, you can also find examples of negative correlation all around you every day.

Common Examples of Negative Correlation


Not every change gives a positive result. These different examples of negative correlation show
how many things in the real world react inversely.
• A student who has many absences has a decrease in grades.
• The more one works, the less free time one has.
• As one increases in age, often one's agility decreases.
• If a car decreases speed, travel time to a destination increases.
• The more time you study or prepare for a test, the fewer mistakes you'll make.
• When you spend more time brushing your teeth, you'll have fewer cavities.
• If a car tire has more air, the car may use less gas per mile.
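To make the absences-and-grades example concrete, here is a minimal sketch computing the Pearson correlation coefficient (the numbers are made up for illustration; anything near −1 indicates a strong negative correlation):

```python
import statistics

absences = [0, 2, 4, 6, 8, 10]
grades   = [95, 90, 82, 75, 66, 60]

# Pearson correlation: covariance divided by the product of standard deviations.
mx, my = statistics.mean(absences), statistics.mean(grades)
cov = sum((x - mx) * (y - my) for x, y in zip(absences, grades))
corr = cov / (
    sum((x - mx) ** 2 for x in absences) ** 0.5
    * sum((y - my) ** 2 for y in grades) ** 0.5
)
print(round(corr, 3))  # about -0.998: as absences rise, grades fall
```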
Compressed Patterns:
A major challenge in frequent-pattern mining is the sheer size of its mining results. In many cases, a high min_sup threshold may discover only commonsense patterns, but a low one may generate an explosive number of output patterns, which severely restricts the usefulness of the results. One solution is to compress the frequent-pattern set: frequent patterns can be clustered with a tightness measure δ (called a δ-cluster), and a representative pattern can be selected for each cluster. Unfortunately, finding a minimum set of representative patterns is NP-hard. Two greedy methods have been developed for this problem, RPglobal and RPlocal. The former has a guaranteed compression bound but higher computational complexity; the latter sacrifices the theoretical bounds but is far more efficient. Performance studies show that the compression quality of RPlocal is very close to that of RPglobal, and both can reduce the number of closed frequent patterns by almost two orders of magnitude. Furthermore, RPlocal mines even faster than FPClose [11], a very fast closed frequent-pattern mining method. RPglobal and RPlocal can also be combined to balance quality and efficiency.
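The following is only a toy sketch of the clustering idea behind compression, not the RPglobal/RPlocal algorithms themselves (their selection criteria are more involved): repeatedly pick a pattern as a representative and absorb every pattern within distance δ of it, using the distance measure defined in the next section.

```python
def distance(t1, t2):
    """D(P1, P2) = 1 - |T(P1) ∩ T(P2)| / |T(P1) ∪ T(P2)| on transaction-id sets."""
    return 1 - len(t1 & t2) / len(t1 | t2)

def compress(patterns, delta):
    """patterns: dict mapping a pattern name to its transaction-id set."""
    remaining = dict(patterns)
    representatives = []
    while remaining:
        name, tids = next(iter(remaining.items()))  # pick an arbitrary seed
        representatives.append(name)
        # Drop every pattern within delta of the chosen representative.
        remaining = {n: t for n, t in remaining.items()
                     if distance(tids, t) > delta}
    return representatives

patterns = {"P1": {1, 2, 3, 4, 5}, "P2": {1, 2, 3, 4, 6}, "P3": {7, 8, 9}}
print(compress(patterns, delta=0.4))  # ['P1', 'P3']: P2 is absorbed by P1
```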

Problem Statement

In this section, we first introduce a new distance measure on closed frequent patterns, and
then discuss the clustering criterion.

Distance Measure:

Let P1 and P2 be two closed patterns, and let T(P) denote the set of transactions containing pattern P. The distance between P1 and P2 is defined as:

D(P1, P2) = 1 − |T(P1) ∩ T(P2)| / |T(P1) ∪ T(P2)|

Example:

Let P1 and P2 be two patterns with T(P1) = {t1, t2, t3, t4, t5} and T(P2) = {t1, t2, t3, t4, t6}, where ti is a transaction in the database. The distance between P1 and P2 is D(P1, P2) = 1 − 4/6 = 1/3.

Theorem 1. The distance measure D is a valid distance metric, such that:
1. D(P1, P2) > 0, ∀ P1 ≠ P2
2. D(P1, P2) = 0, ∀ P1 = P2
3. D(P1, P2) = D(P2, P1)
4. D(P1, P2) + D(P2, P3) ≥ D(P1, P3), ∀ P1, P2, P3
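A minimal sketch reproducing the worked example (the transaction identifiers match those above):

```python
def D(T1, T2):
    """Distance between two closed patterns, given their transaction sets."""
    return 1 - len(T1 & T2) / len(T1 | T2)

T_P1 = {"t1", "t2", "t3", "t4", "t5"}
T_P2 = {"t1", "t2", "t3", "t4", "t6"}

print(D(T_P1, T_P2))                   # 1 - 4/6 = 1/3 ≈ 0.333
print(D(T_P1, T_P2) == D(T_P2, T_P1))  # property 3 (symmetry): True
```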
