
Trends in Quantitative Association Rule Mining Techniques

Conference Paper · July 2015


DOI: 10.1109/ReTIS.2015.7232865


2 authors:

Dhrubajit Adhikary Swarup Roy


North Eastern Hill University Sikkim University



2015 IEEE 2nd International Conference on Recent Trends in Information Systems (ReTIS)

Trends in Quantitative Association Rule Mining Techniques
Dhrubajit Adhikary1 , Swarup Roy2∗
Department of Information Technology
North Eastern Hill University, Shillong - 793 022
Email: 1 dhrubajit.adhikary@gmail.com, 2 swarup@nehu.ac.in

Abstract—Association rule mining (ARM) techniques are effective in extracting frequent patterns and hidden associations among data items in various databases. These techniques are widely used for learning behavior, predicting events and making decisions at various levels. Conventional ARM techniques are, however, limited to databases comprising categorical data only, whereas real-world databases, mostly in business and scientific domains, have attributes containing quantitative data. Therefore, an improved methodology called Quantitative Association Rule Mining (QARM) is used, which helps in discovering hidden associations from real-world quantitative databases. In this paper, we present an exhaustive discussion on the trends in QARM research and further make a systematic classification of the available techniques into different categories based on the type of computational methods they adopt. We perform a critical analysis of various methods proposed so far and present a theoretical comparative study among them. We also enumerate some of the issues that need to be addressed in future research.

Index Terms—Association Rules; Quantitative Association Rules; Clustering; Fuzzy; Evolutionary approach; Information theory.

I. INTRODUCTION

With the passage of time and advances in technology, the nature and volume of data have changed remarkably. Owing to numerous information sensing and information gathering devices and techniques, modern databases are no longer like the market-basket databases to which classical association rule mining techniques can be applied. Most modern databases (in domains like business, health-care, stock-market, bioinformatics etc.) are larger in size and dimensions and contain attributes that are quantitative in nature. Therefore, to handle quantitative data and mine meaningful association rules in multi-dimensional quantitative databases, better ARM techniques became a necessity. Hence, the concept of Quantitative Association Rule Mining was coined. Since then, a good number of QARM techniques have been proposed in recent decades. These techniques follow different trends w.r.t. time and need, and have their own advantages and limitations when applied to different quantitative databases.

In this paper, we try to articulate various trends followed in QARM research since the inception of the concept in the pioneering work by R. Srikant et al. [1]. We present a comprehensive study of various QARM approaches and analyze a few techniques from each one of them. To begin with, we discuss the background of ARM and QARM techniques in the following section.

II. BACKGROUND

An explicit definition of the ARM problem was given by Agrawal et al. [2]. From the perspective of analyzing market-basket data, for a set of items I = {I1, I2, I3, ..., Im} and a database of transactions D = {t1, t2, t3, ..., tn}, where each transaction ti ⊆ I, the ARM problem is to identify all interesting Association Rules (AR) of the form X ⇒ Y, where X ⊂ I and Y ⊂ I are non-empty frequent itemsets and X ∩ Y = ∅. All such ARs satisfying either or both of the two thresholds of rule interestingness, viz. minimum support and minimum confidence, are considered relevant and informative [3]. The support of an association rule is calculated as the percentage of records containing X ∪ Y relative to the total number of records in the database. Confidence is the strength of the implication in a rule; it is calculated as the percentage of records that contain Y among those that contain X.

QARM is an improved form of association rule mining applied to databases containing both quantitative and categorical attributes. Like an AR, a Quantitative Association Rule (QAR) is formally an implication of the form X ⇒ Y. The left side of the implication is called the antecedent and the right side is called the succedent (also, consequent). Both these cedents (also called attribute sets) may comprise a single attribute (trivial cedent) or multiple attributes (non-trivial cedent). Unlike in a conventional AR, in a QAR both quantitative and categorical attributes may participate in any of the cedents, and the common measures of rule interestingness, i.e. support and confidence, can be applied to it. Other measures available for evaluating the quality of a QAR are Lift, Leverage, Conviction, Gain, Certainty-Factor etc., which are widely discussed in [4] and [5]. If both the left and right sides of a QAR have a trivial cedent, it is called a single-dimensional QAR; otherwise it is multi-dimensional. An example of a multi-dimensional QAR is given below:

    Salary : [40k ... 50k] ∧ Married : [Yes] ⇒ NumLoans : [2]    (1)

In this QAR, Salary and NumLoans are quantitative attributes, whereas Married is a nominal categorical attribute (with categories Yes and No). It is a positive QAR, as the consequent is positively correlated with the antecedent.
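As a rough illustration of the support and confidence measures defined in this section, the following minimal Python sketch evaluates rule (1) on a tiny table. The records, the helper names `antecedent`, `consequent`, `support` and `confidence`, and the closed-interval reading of [40k ... 50k] are assumptions made for this example only; they are not taken from any of the surveyed techniques.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Predicate = Callable[[Record], bool]

# Toy records; every value here is invented purely for illustration.
RECORDS: List[Record] = [
    {"Salary": 45_000, "Married": "Yes", "NumLoans": 2},
    {"Salary": 42_000, "Married": "Yes", "NumLoans": 2},
    {"Salary": 48_000, "Married": "Yes", "NumLoans": 1},
    {"Salary": 30_000, "Married": "No", "NumLoans": 0},
    {"Salary": 47_000, "Married": "No", "NumLoans": 2},
]


def antecedent(record: Record) -> bool:
    """Left side of rule (1): Salary in [40k, 50k] and Married = Yes."""
    return 40_000 <= record["Salary"] <= 50_000 and record["Married"] == "Yes"


def consequent(record: Record) -> bool:
    """Right side of rule (1): NumLoans = 2."""
    return record["NumLoans"] == 2


def support(records: List[Record], x: Predicate, y: Predicate) -> float:
    """Fraction of all records that satisfy both X and Y."""
    both = sum(1 for r in records if x(r) and y(r))
    return both / len(records)


def confidence(records: List[Record], x: Predicate, y: Predicate) -> float:
    """Fraction of the records satisfying X that also satisfy Y."""
    covered = [r for r in records if x(r)]
    if not covered:
        return 0.0  # the rule never fires, so report zero confidence
    return sum(1 for r in covered if y(r)) / len(covered)


if __name__ == "__main__":
    print(f"support    = {support(RECORDS, antecedent, consequent):.2f}")
    print(f"confidence = {confidence(RECORDS, antecedent, consequent):.2f}")
```

On this toy table the rule holds for 2 of the 5 records (support 0.4) and for 2 of the 3 records that match the antecedent (confidence of about 0.67).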

978-1-4799-8349-0/15/$31.00 ©2015 IEEE


Negative rules, represented as X ⇒ ¬Y, may also be discovered; they associate the presence of X with the absence of Y in the rule. The antecedent X and consequent Y of a negative QAR may consist of one or more attributes and accordingly the rule may be single or multi-dimensional. Importantly, for a QAR to be negative, either the antecedent part or the consequent part or both of them have to be negated. Following are some examples of negative QARs, with negation

1) within the antecedent:
   X2 ∈ ¬[q1, q2] ⇒ Y1 ∈ [r1, r2] ∧ Y2 ∈ [s1, s2].
2) within the consequent:
   X1 ∈ [p1, p2] ∧ Y1 ∈ [r1, r2] ⇒ X2 ∈ ¬[q1, q2] ∧ Y2 ∈ [s1, s2].
3) within both antecedent & consequent:
   X1 ∈ ¬[p1, p2] ∧ X2 ∈ [q1, q2] ⇒ Y2 ∈ ¬[s1, s2].

Here, X1, X2, Y1 and Y2 are quantitative attributes and ¬[Lb, Ub] means those intervals other than [Lb, Ub], where Lb and Ub denote the lower and upper bounds of an interval [6].

With these concepts in place, we move on to present the research trends in QARM in the next section.

III. QARM: TRENDS & TECHNIQUES

Since the inception of the problem, numerous contributions have been made using different approaches to mine QARs. Each approach follows a trend and has several techniques within it. Each such technique has its own pros and cons when applied in different scenarios. To give a quick glimpse of the trends in QARM, we classify these techniques into certain categories based on the computational techniques (or approaches) they adopt (shown in Fig. 1). For a better comprehension of the same, we review the prominent techniques under each category and discuss their merits and limitations.

[Fig. 1. Classification of QARM techniques. Categories shown: Partitioning; Clustering (Centroid Based, Density Based); Statistical (Distribution Analysis: Mean, Median, Variance; Standard Deviation); Fuzzy; Evolutionary; Info-Theoretic.]

A. Partitioning Approach

Srikant and Agrawal, in their pioneering work [1], introduce the partitioning approach to convert quantitative attributes into boolean attributes by partitioning the large range of quantitative data into disjoint intervals and then mapping each (attribute, interval) pair to a boolean attribute, interpreting it as a market-basket item. They also identify the min-support and min-confidence problems lying within this approach. The min-support problem says that if the number of intervals for a quantitative attribute is high, the support for individual intervals may be low, and so some QARs comprising this attribute may not be detected for lacking minimum support. If, to overcome this bottleneck, the interval size is increased to reduce the number of intervals, the approach still suffers from information loss in terms of confidence. This information loss grows with the interval size, because some QARs comprising this attribute may not be detected for lacking minimum confidence. Moreover, the number of rules generated becomes very large and the rule-mining process becomes time consuming as well.

In [7], Chan and Au point out another weakness of the partitioning approach, which lies in the task of deciding appropriate thresholds for rule evaluation measures. If the thresholds are very large, a user may not discover some useful rules, and if the thresholds are very small, the user may get confused by many irrelevant rules. They suggest an objective interestingness measure, called adjusted difference, that requires no user-supplied thresholds. Fukuda et al. [8] use computational geometry to enhance the partitioning approach. They present optimized interval generation in linear time on sorted data. However, when dealing with large databases, sorting each numeric attribute is a costly affair and occupies a large amount of memory. Therefore, they propose the randomized bucketing idea to discover optimized ranges for partitioning quantitative attributes in huge databases. Commonly used partitioning methods follow either equal-width or equal-depth (also equal-frequency) intervals. Li et al. [9] clarify that the equal-width and equal-depth partitioning methods hardly consider both value similarities and densities simultaneously. They therefore present an adaptive partitioning approach that repeatedly merges smaller intervals to form larger intervals. Their unsupervised partitioning approach initially places each quantitative attribute value in a different interval and then merges adjacent similar intervals using a criterion that considers both the value similarities and the densities of quantitative attributes.

Recently, Dancheng et al. [10] introduced a concept that performs discretization using a self-adaptive approach. This approach generates better and more reasonable partitions that give high-confidence ARs while guaranteeing relatively high support. To find rules, this technique pre-computes the conditional probability density curves of association between the quantitative attributes. Next, it includes as partitions some wide ranges that have values adjacent to the peak of the curve. Such values help in finding QARs having high confidence, and the wider range helps in finding QARs having higher support. Finally, these partitions are used for Apriori association rule mining. Though the self-adaptive technique helps in discovering stronger QARs, thereby enhancing rule support and confidence, it is limited to revealing single numeral-to-nominal implications only.
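The equal-width and equal-depth partitioning schemes mentioned in this subsection, and the mapping of (attribute, interval) pairs to boolean items, can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the salary values are invented, the function names are hypothetical, intervals are treated as closed, and the code does not reproduce the algorithm of [1] or of any other cited work.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]

# Invented salary values (in thousands), used only for illustration.
SALARIES: List[float] = [20, 22, 25, 30, 41, 45, 48, 90]


def equal_width(values: Sequence[float], k: int) -> List[Interval]:
    """Split [min, max] into k intervals of identical width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    # Pin the last bound to hi so rounding cannot exclude the maximum.
    bounds = [lo + i * width for i in range(k)] + [hi]
    return [(bounds[i], bounds[i + 1]) for i in range(k)]


def equal_depth(values: Sequence[float], k: int) -> List[Interval]:
    """Split the sorted values into k intervals holding about n/k values each.

    Assumes k <= len(values).
    """
    ordered = sorted(values)
    n = len(ordered)
    return [(ordered[i * n // k], ordered[(i + 1) * n // k - 1]) for i in range(k)]


def interval_index(value: float, intervals: Sequence[Interval]) -> int:
    """Index of the first (closed) interval that contains the value."""
    for i, (lo, hi) in enumerate(intervals):
        if lo <= value <= hi:
            return i
    raise ValueError(f"{value} is not covered by any interval")


def to_item(attribute: str, value: float, intervals: Sequence[Interval]) -> str:
    """Map an (attribute, interval) pair to a boolean market-basket item."""
    lo, hi = intervals[interval_index(value, intervals)]
    return f"{attribute}:[{lo:g}..{hi:g}]"


def interval_counts(values: Sequence[float], intervals: Sequence[Interval]) -> List[int]:
    """Number of values falling in each interval (its raw support count)."""
    counts = [0] * len(intervals)
    for v in values:
        counts[interval_index(v, intervals)] += 1
    return counts


if __name__ == "__main__":
    print(interval_counts(SALARIES, equal_width(SALARIES, 3)))
    print(interval_counts(SALARIES, equal_depth(SALARIES, 4)))
    print(to_item("Salary", 45, equal_depth(SALARIES, 4)))
```

With these toy values, equal-width partitioning into three intervals yields counts of 5, 2 and 1, so the sparse intervals may fall below a minimum support threshold, whereas equal-depth partitioning into four intervals places two values in each interval regardless of how far apart the values are.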

1) Discussion: Techniques using the partitioning approach are prone to the support-confidence conflict and the many-rules problem, because detecting reasonable ranges or intervals is a challenge in partitioning. The majority of research works are inclined towards solving this issue. In addition, the use of user-given thresholds as rule interestingness measures is common throughout most of these techniques, and these thresholds ultimately drive the overall quantity and quality of mined rules. QAR mining using the partitioning approach also takes higher execution time and mostly yields single-dimensional positive QARs.

B. Clustering Approach

Clustering is an effective alternative for finding meaningful quantitative regions for the discovery of association rules in quantitative databases. An efficient hierarchical clustering algorithm is proposed by Chien et al. in [11], relying upon variation of density to generate intervals for mining QARs. DRMiner [12] is a QARM technique that uses the notion of density to handle quantitative attributes. It finds positive multi-dimensional association rules, and because of the density measure it can avoid trivial and redundant rules. DBSMiner [13] aims to scale up well for high-dimensional quantitative association rule mining using the notion of density-connectedness. The N-dimensional quantitative search space is divided into multiple grids of rectangular units using equal-frequency partitioning of every attribute, such that no two units overlap. In the grid, one interval from each attribute is intersected to form cells containing records. A cell is considered dense if the number of records in it exceeds a user-defined threshold. Finally, connected high-density cells are united to form clusters. Based on dense grids, Junrui et al. [14] propose another method, MQAR (Mining Quantitative Association Rule), that can resolve the support-confidence conflict and also helps to get rid of noise and redundant rules. MQAR uses an FP-tree-like [15] structure, the DGFP-tree, that helps to mine association rules in high-dimensional databases. The DGFP-tree compresses the database effectively, and hence there is no need to scan the database many times. MQAR uses a novel grid and density based clustering to cluster the database using dense subspaces in the tree. MQAR is scalable in dealing with high-dimensional databases. Miller and Yang [16] identify the flaws of the equal-depth partitioning method and present a technique that discovers clusters in the quantitative attributes and uses the resultant clusters as itemsets for finding boolean association rules. A further extension of this idea can be found in the work of Tong et al. [17], who improve the partitioning method by taking into account the relations among all attributes together. Initially, they form clusters using the quantitative attributes. Next, they project each cluster onto the domain of each quantitative attribute. These projections return overlapped intervals, which they consider reasonable and assert to be a good resolution to the min-support and min-confidence issues in the partitioning approach.

1) Discussion: Techniques using the clustering approach mostly concentrate on reasonable interval generation using the concept of variation of density or dense regions. They also try to reduce the support-confidence conflict and eliminate useless, redundant rules. The simpler clustering techniques are straightforward, while the complex ones, viz. [12], [13] and [14], use concepts like hyperplanes, dense grids etc. Not all clustering techniques are fairly scalable for high-dimensional cases, but most of them yield single as well as multi-dimensional QARs. The use of min-sup and min-conf thresholds is common, but some techniques require the users to specify other thresholds. The rules generated by the techniques from this approach are, however, only positive.

C. Statistical Approach

Various statistical measures (such as mean, median, variance etc.) are also applied in QARM. Aumann et al. [18] contribute towards QARM research by providing a new statistical theory for defining quantitative association rules. They consider the distribution of the continuous data via standard statistical measures. This technique yields positive multi-dimensional quantitative association rules at the expense of a larger number of database scans, and has the feature of generating meaningful sub-rules. Kang et al. [19] propose another way of mining QARs using statistical measures, but with a slight influence of the partitioning approach. They adopt a two-step process that first pre-processes all quantitative attributes to convert them into binary attributes and then converts the resulting binary association rules back into QARs. The first step focuses on domain partitioning of the quantitative attributes by selecting the partitions whose collective standard deviation is smaller. Once the partitions are ready, binary association rule (BAR) mining is used. Finally, the post-processing step integrates the BARs to reconvert them into the corresponding QARs.

1) Discussion: The statistical approach uses various statistical measures to analyze distributions or to determine reasonable intervals for frequent itemset generation. The bipartition techniques usually suffer from uneven distributions of values into intervals, resulting in uninteresting QARs. Hence, measures like standard deviation came into use. Rules generated by statistical techniques are positive and multi-dimensional, but rule generation is costly due to the greater number of database scans.

D. Fuzzy Approach

Sometimes, the detected intervals of attributes participating in a QAR may not be precise or meaningful enough for analysts to extract non-trivial knowledge with ease. If, instead of intervals, linguistic terms could be used to represent certain associations, frequent patterns etc., then knowledge discovery could be easier. Some applications may consider an AR interesting only if it discovers associations among some useful concepts, such as regular customer, low salary and new house [20]. Such concepts may be modelled as fuzzy concepts using fuzzy set theory. The rules having these terms are called fuzzy association rules and the approach is considered the Fuzzy QARM approach. Zhang [20] formulates a method to mine rules containing 1) crisp values, 2) intervals and 3) fuzzy terms. It employs equal-frequency partitioning with fuzzy concepts to handle quantitative and categorical attributes. For discovering frequent itemsets it uses an extended Apriori algorithm.
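To make the idea of modelling a linguistic term such as low salary with fuzzy set theory more concrete, the following minimal Python sketch defines three membership functions over a salary attribute and averages membership degrees into a fuzzy support value. The breakpoints (20k, 40k, 60k), the function names and the averaging definition of fuzzy support are assumptions chosen for illustration; they are not claimed to match the definitions used in [20] or in the other fuzzy techniques surveyed here.

```python
from typing import Callable, Sequence

Membership = Callable[[float], float]


def low_salary(x: float) -> float:
    """Left shoulder: fully 'low' up to 20k, fading to 0 at 40k."""
    if x <= 20:
        return 1.0
    if x >= 40:
        return 0.0
    return (40 - x) / 20


def medium_salary(x: float) -> float:
    """Triangle peaking at 40k and reaching 0 at 20k and 60k."""
    if x <= 20 or x >= 60:
        return 0.0
    if x <= 40:
        return (x - 20) / 20
    return (60 - x) / 20


def high_salary(x: float) -> float:
    """Right shoulder: 0 up to 40k, fully 'high' from 60k onwards."""
    if x <= 40:
        return 0.0
    if x >= 60:
        return 1.0
    return (x - 40) / 20


def fuzzy_support(values: Sequence[float], term: Membership) -> float:
    """Mean membership degree of the values in a fuzzy term.

    Averaging is one common way to define fuzzy support; it is an
    assumption of this sketch, not a definition taken from the survey.
    """
    return sum(term(v) for v in values) / len(values)


if __name__ == "__main__":
    salaries = [10, 30, 45, 70]  # invented values, in thousands
    for name, term in [("low", low_salary), ("medium", medium_salary), ("high", high_salary)]:
        print(name, fuzzy_support(salaries, term))
```

Unlike a sharp interval, a salary of 30k here belongs partly to low (degree 0.5) and partly to medium (degree 0.5), so records near a boundary still contribute to the support of both terms.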

Gyenesei [21] assigns each quantitative attribute several fuzzy sets (instead of sharp intervals) that characterize the attribute. Fuzzy support, fuzzy confidence and fuzzy correlation are used as interestingness measures. Importantly, this method may suffer from anomalies if the fuzzy sets are not well chosen. In another attempt [22], numerical attributes are converted to fuzzy binary attributes and an efficient thread-based mechanism is employed for mining the quantitative rules quickly. Recently, Zheng et al. [23] proposed a generic Optimized Fuzzy Association Rule Mining (OFARM) method which is easy to extend to continuous data. It optimizes the partition points for fuzzy sets using a multiple-objective function. A two-level iterative method is used to generate association rules. In addition to MinSup and MinConf, this method uses one of the newer effective measures for the evaluation of association rules, called the Certainty Factor.

1) Discussion: In classical fuzzy techniques, the idea was to use linguistic terms for defining partitions, whereas the modern ones convert the quantitative data into fuzzy sets and use partition points to divide neighbouring fuzzy sets. One problem with classical fuzzy QARM techniques is that they do not optimize the selection of partition points and use Extended Apriori, whose execution time grows as the number of dimensions increases. The modern fuzzy techniques take care of this. The fuzzy QARs produced by the newer techniques are strong and positive, and comprise at most two dimensions within the antecedent and one dimension in the consequent. However, these techniques may still suffer from the support-confidence conflict if the thresholds are not wisely chosen.

E. Evolutionary Approach

The QAR mining problem has also been treated with the evolutionary algorithmic (EA) approach by different researchers. The most popular type of EA is the genetic algorithm (GA), which has found utilization in QAR mining. Recent research on QAR mining witnesses the use of multi-objective evolutionary algorithms too. Alataş et al. [6] propose a genetic algorithmic strategy for both positive and negative QAR mining. With the help of 1) an adaptive mutation probability, 2) a uniform operator and 3) an efficient fitness function, their method is capable of mining QARs without taking any thresholds and without data preparation. QuantMiner [24] is a GA-based approach that follows a set of rule templates while mining rules. A template is a preset format of a QAR. It may be selected by the user or computed by the system. QuantMiner finds reasonable intervals in ARs dynamically, optimizing the support count and rule confidence.

QARM research also encompasses multi-objective evolutionary algorithms. The ARM process can be considered a multi-objective problem instead of a single-objective one, in which the rule evaluation measures may have different objectives [4]. Kaya et al. [25] propose two novel methods to optimize QARs. They use three important criteria, namely support, confidence and amplitude, as thresholds. Recently, Martin et al. [26] proposed a new multi-objective evolutionary approach for mining a smaller set of positive and negative QARs with better comprehensibility and lower cost of computation.

1) Discussion: Evolutionary algorithms often perform well in finding approximate solutions to many types of problems using mechanisms inspired by biological evolution. The GA approach has the specialty of mining both positive and negative QARs without discretizing attributes beforehand. Hence, it takes fewer database scans and also yields rules with optimized support and confidence. Multi-objective evolutionary techniques are the result of recent research that tries to reduce the cost of mining and optimize the number of positive and negative QARs generated without compromising rule interestingness.

F. Other Approaches

There are several approaches (other than those classified above) that try to solve quantitative association rule mining in their own ways. Ke et al. [27] highlight that another combinatorial explosion problem may be triggered by the task of combining the adjacent intervals of a quantitative attribute to increase support and detect meaningful intervals. To address this combinatorial explosion problem using an information-theoretic approach, they introduce the MIC (Mutual Information and Clique) framework, which inspects the mutual information (MI) between every pair of attributes and establishes an MI Graph representing attributes that have a strong informative relationship w.r.t. a pre-defined threshold. Each frequent itemset is then represented by a clique present in the MI Graph. In this method, the attribute sets and their corresponding intervals to be combined (or joined) can be reduced effectively. Li et al. [28] extend a grid-based QAR mining method using meta-rules, storing data tuples in linked lists on which QAR mining is executed. In meta-rule guided association rule mining, some syntactic or semantic constraints are specified in the form of rules.

Nemmiche and Guillaume [29] introduce a different method for mining optimal positive and negative QARs. They state that, irrespective of whether an AR is positive or negative, the use of support and confidence as rule interestingness measures is not sufficient for optimal QAR generation. Therefore, in their method they use tables to summarize the trend of variable interactions, thereby highlighting the zones that are interesting. Moreover, the method also introduces a new rule semantic of the form Influential(s) → Influenceable(s) and an impact measure, Influence, that analyses variable behavior and guarantees that rules of higher potential will be discovered by discarding the irrelevant ones.

1) Discussion: The above techniques are applied to QAR mining without much influence from the general approaches. They have their own unique ways of dealing with the QAR mining problem. Most of these techniques strive to minimize the many-rules and irrelevant-rules generation problems. The metarule-guided technique tries to find multi-dimensional rules and is linearly scalable w.r.t. database size and dimensions. The technique based on the impact measure is capable of mining both positive and negative QARs that are potentially pertinent.
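The information-theoretic idea behind the MIC framework [27] described in this subsection, namely scoring attribute pairs by mutual information and keeping only the strongly related pairs as edges of an MI Graph, can be sketched roughly as follows. The sketch uses plain (unnormalized) mutual information in bits over already discretized columns; the toy table, the threshold value and the function names are assumptions for illustration, and the clique enumeration and interval handling of the actual framework are not reproduced.

```python
import math
from collections import Counter
from itertools import combinations
from typing import Dict, Hashable, List, Sequence, Tuple


def mutual_information(xs: Sequence[Hashable], ys: Sequence[Hashable]) -> float:
    """Mutual information (in bits) between two equally long columns."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2((c * n) / (count_x[x] * count_y[y]))
    return mi


def mi_graph_edges(
    table: Dict[str, Sequence[Hashable]], threshold: float
) -> List[Tuple[str, str]]:
    """Attribute pairs whose mutual information reaches the threshold."""
    edges = []
    for a, b in combinations(sorted(table), 2):
        if mutual_information(table[a], table[b]) >= threshold:
            edges.append((a, b))
    return edges


# Invented, already discretized toy table (one list per attribute).
TABLE: Dict[str, List[str]] = {
    "age": ["young", "young", "old", "old"],
    "salary": ["low", "low", "high", "high"],
    "married": ["yes", "no", "yes", "no"],
}

if __name__ == "__main__":
    print(mi_graph_edges(TABLE, threshold=0.5))
```

In this toy table age and salary determine each other (1 bit of mutual information), while married is independent of both, so only the pair (age, salary) survives the threshold and would be considered for joint interval combination.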

TABLE I. SUMMARY OF DIFFERENT QAR MINING APPROACHES

| Approach | Advantages | Disadvantages | Rule Dimension | Discretization | Use Sup. & Conf. | Other Thresholds | No. of Scans | (-)ve Rules |
|---|---|---|---|---|---|---|---|---|
| Partitioning [1], [7]–[10] | Easier to understand and implement | Information loss, many rules, redundant rules, Sup-Conf conflict | Single | Yes | Yes | Yes | Multiple | No |
| Clustering [12]–[14] | Reasonable interval generation (strive to get best clusters as intervals) | Often lacks high-dimensional scalability, require many thresholds | Multiple | Yes | Yes | Yes | Multiple | No |
| Statistical [18], [19] | No misleading rules, generates sub-rules | Larger database scans, uneven distributions appear in partitions | Multiple | Yes | No | Yes | Multiple | No |
| Fuzzy [20], [21], [23] | Interpret intervals as fuzzy terms, discover rules having crisp values also | Require fuzzy thresholds, execution time increases in case of higher dimensions | Multiple | Yes | Yes | Yes | Multiple | No |
| Evolutionary [6], [25], [26] | Mine +ve & -ve rules without discretization, optimize support and confidence, maximise rule comprehensibility | Comparatively difficult to implement, higher computational cost | Multiple | No | Yes | No | Multiple | Yes |
| Information-Theoretic [27] | Reduces combinatorial explosion of intervals that appear in most partitioning techniques | Not much effective in smaller databases | Multiple | Yes | No | Yes | Multiple | No |

However, most of these techniques need min-support and min-confidence thresholds to find relevant rules and hence are prone to the support-confidence conflict.

IV. SUMMARY

The different QAR mining approaches discussed above are summarized in Table I, and the proportion of contributions w.r.t. the approaches is shown in Fig. 2. We can observe from Table I that every QAR mining technique has its pros and cons. Despite several differences in how they deal with quantitative data, the use of support and confidence as rule-interestingness measures is common throughout most of the approaches. The use of newer rule evaluation measures (as stated in [4] and [5]) is also found in several modern QARM techniques.

Table I also highlights that only a few approaches deal with negative association rules and that no technique is capable of mining QARs in a single database scan. However, techniques from all approaches other than partitioning are capable of mining rules of more than one dimension. In addition, discretization or data preparation is another common step for all techniques except the few using EA. Information loss due to discretization is prominent in partitioning.

[Fig. 2. Proportion of contributions in various QARM approaches: Partitioning 26%, Clustering 21%, Others 18%, Evolutionary 15%, Fuzzy 11%, Statistical 9%.]

V. CONCLUSION

QARM techniques are essential for discovering knowledge from real-world databases that contain high volumes of quantitative and categorical data.

In recent decades, different QARM techniques have found applications in various domains [30]. Therefore, a large number of research works have contributed towards efficiently solving the problem of QAR mining. In this work, we highlight various trends in QARM research and present a comprehensive study of various approaches with their relative merits and limitations. To conclude, it can be stated that, from the broad variety of techniques in existence, no particular technique seems to be suitable for application in all domains, because the appearance, size and nature of data belonging to different domains vary. Moreover, current solutions may be considered inadequate because there are several issues pertaining to the available QARM techniques, which we identify and list below:

1) Inability to mine both positive and negative QARs.
2) Failure to generate efficient multi-dimensional QARs.
3) Dependence on proper selection of thresholds for generating rules.
4) Generation of redundant, uninteresting and misleading rules.
5) Poor scalability of the mining technique w.r.t. database dimensions and volume.
6) Large number of database scans to generate frequent itemsets and mine rules.
7) High computational cost or execution time.

Further work is required to address these issues, which opens up novel scope for future research in QARM.

REFERENCES

[1] R. Srikant and R. Agrawal, “Mining quantitative association rules in large relational tables,” in ACM SIGMOD Record, vol. 25, no. 2. ACM, 1996, pp. 1–12.
[2] R. Agrawal, T. Imieliński, and A. Swami, “Mining association rules between sets of items in large databases,” in ACM SIGMOD Record, vol. 22, no. 2. ACM, 1993, pp. 207–216.
[3] J. Han, M. Kamber, and J. Pei, Data mining, southeast asia edition: Concepts and techniques. Morgan kaufmann, 2006.
[4] M. Martínez-Ballesteros and J. Riquelme, “Analysis of measures of quantitative association rules,” in Hybrid Artificial Intelligent Systems. Springer, 2011, pp. 319–326.
[5] M. Martínez-Ballesteros, F. Martínez-Álvarez, A. Troncoso, and J. C. Riquelme, “Selecting the best measures to discover quantitative association rules,” Neurocomputing, vol. 126, pp. 3–14, 2014.
[6] B. Alataş and E. Akin, “An efficient genetic algorithm for automated mining of both positive and negative quantitative association rules,” Soft Computing, vol. 10, no. 3, pp. 230–237, 2006.
[7] K. C. Chan and W.-H. Au, “An effective algorithm for mining interesting quantitative association rules,” in Proceedings of the 1997 ACM symposium on Applied computing. ACM, 1997, pp. 88–90.
[8] T. Fukuda, Y. Morimoto, S. Morishita, and T. Tokuyama, “Mining optimized association rules for numeric attributes,” in Proceedings of the fifteenth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems. ACM, 1996, pp. 182–191.
[9] J. Li, H. Shen, and R. Topor, “An adaptive method of numerical attribute merging for quantitative association rule mining,” in Internet Applications. Springer, 1999, pp. 41–50.
[10] L. Dancheng, Z. Ming, Z. Shuangshuang, and Z. Chen, “A new approach of self-adaptive discretization to enhance the apriori quantitative association rule mining,” in Intelligent System Design and Engineering Application (ISDEA), 2012 Second International Conference on. IEEE, 2012, pp. 44–47.
[11] B.-C. Chien, Z.-L. Lin, and T.-P. Hong, “An efficient clustering algorithm for mining fuzzy quantitative association rules,” in IFSA World Congress and 20th NAFIPS International Conference, 2001. Joint 9th, vol. 3. IEEE, 2001, pp. 1306–1311.
[12] W. Lian, D. W. Cheung, and S. Yiu, “An efficient algorithm for finding dense regions for mining quantitative association rules,” Computers & Mathematics with Applications, vol. 50, no. 3, pp. 471–490, 2005.
[13] Y. Guo, J. Yang, and Y. Huang, “An effective algorithm for mining quantitative association rules based on high dimension cluster,” in Wireless Communications, Networking and Mobile Computing, 2008. WiCOM’08. 4th International Conference on. IEEE, 2008, pp. 1–4.
[14] Y. Junrui and Z. Feng, “An effective algorithm for mining quantitative associations based on subspace clustering,” in Networking and Digital Society (ICNDS), 2010 2nd International Conference on, vol. 1. IEEE, 2010, pp. 175–178.
[15] J. Han, J. Pei, and Y. Yin, “Mining frequent patterns without candidate generation,” in ACM SIGMOD Record, vol. 29, no. 2. ACM, 2000, pp. 1–12.
[16] R. J. Miller and Y. Yang, “Association rules over interval data,” in ACM SIGMOD Record, vol. 26, no. 2. ACM, 1997, pp. 452–461.
[17] Q. Tong, B. Yan, and Y. Zhou, “Mining quantitative association rules on overlapped intervals,” in Advanced Data Mining and Applications. Springer, 2005, pp. 43–50.
[18] Y. Aumann and Y. Lindell, “A statistical theory for quantitative association rules,” Journal of Intelligent Information Systems, vol. 20, no. 3, pp. 255–283, 2003.
[19] G.-M. Kang, Y.-S. Moon, H.-Y. Choi, and J. Kim, “Bipartition techniques for quantitative attributes in association rule mining,” in TENCON 2009-2009 IEEE Region 10 Conference. IEEE, 2009, pp. 1–6.
[20] W. Zhang, “Mining fuzzy quantitative association rules,” in 2012 IEEE 24th International Conference on Tools with Artificial Intelligence. IEEE Computer Society, 1999, pp. 99–99.
[21] A. Gyenesei, “A fuzzy approach for mining quantitative association rules,” Acta Cybern., vol. 15, no. 2, pp. 305–320, 2001.
[22] S. Prakash and R. Parvathi, “Qualitative approach for quantitative association rule mining using fuzzy rule set,” Journal of Computational Information Systems, vol. 7, no. 6, pp. 1879–1885, 2011.
[23] H. Zheng, J. He, G. Huang, and Y. Zhang, “Optimized fuzzy association rule mining for quantitative data,” in Fuzzy Systems (FUZZ-IEEE), 2014 IEEE International Conference on. IEEE, 2014, pp. 396–403.
[24] A. Salleb-Aouissi, C. Vrain, and C. Nortet, “Quantminer: A genetic algorithm for mining quantitative association rules,” in IJCAI, vol. 7, 2007.
[25] M. Kaya and R. Alhajj, “Novel approach to optimize quantitative association rules by employing multi-objective genetic algorithm,” in Innovations in Applied Artificial Intelligence. Springer, 2005, pp. 560–562.
[26] D. Martin, A. Rosete, J. Alcalá-Fdez, and F. Herrera, “A new multiobjective evolutionary algorithm for mining a reduced set of interesting positive and negative quantitative association rules,” Evolutionary Computation, IEEE Transactions on, vol. 18, no. 1, pp. 54–69, 2014.
[27] Y. Ke, J. Cheng, and W. Ng, “Mic framework: an information-theoretic approach to quantitative association rule mining,” in Data Engineering, 2006. ICDE’06. Proceedings of the 22nd International Conference on. IEEE, 2006, pp. 112–112.
[28] J. Li and X. Ye, “Study on linked list-based algorithm for metarule-guided mining of multidimensional quantitative association rules,” in Natural Computation, 2007. ICNC 2007. Third International Conference on, vol. 1. IEEE, 2007, pp. 300–304.
[29] L. N. Alachaher and S. Guillaume, “Variables interaction for mining negative and positive quantitative association rules,” in ICTAI, 2006, pp. 82–85.
[30] D. Adhikary and S. Roy, “Mining quantitative association rules in real-world databases: A review,” in Computing and Communication Systems (I3CS), 2015 1st International Conference on, vol. 1. IGI Global, 2015, pp. 87–92.
