
Pattern Recognition Letters 109 (2018) 120–128

Contents lists available at ScienceDirect

Pattern Recognition Letters


journal homepage: www.elsevier.com/locate/patrec

Review on mining data from multiple data sources


Ruili Wang a, Wanting Ji a,∗, Mingzhe Liu b, Xun Wang c, Jian Weng d, Song Deng e,
Suying Gao f, Chang-an Yuan g

a Institute of Natural and Mathematical Sciences, Massey University, Room 2.14, Mathematical Sciences Building, Auckland 0632, New Zealand
b School of Network Security, Chengdu University of Technology, Chengdu 610059, China
c School of Computer and Information Engineering, Zhejiang Gongshang University, Hangzhou 310018, China
d College of Cyber Security, Jinan University, Guangzhou 519632, China
e School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210043, China
f School of Economics and Management, Hebei University of Technology, Tianjin 300401, China
g School of Computer and Information Engineering, Guangxi Teachers Education University, Nanning 530023, China

A R T I C L E   I N F O

Article history:
Available online 31 January 2018

Keywords:
Multiple data source mining
Pattern analysis
Data classification
Data clustering
Data fusion

A B S T R A C T

In this paper, we review recent progress in the area of mining data from multiple data sources. The advancement of information and communication technology has generated a large amount of data from different sources, which may be stored in different geographical locations. Mining data from multiple data sources to extract useful information is considered a very challenging task in the field of data mining, especially in the current big data era. The methods for mining multiple data sources can be divided into four main groups: (i) pattern analysis, (ii) multiple data source classification, (iii) multiple data source clustering, and (iv) multiple data source fusion. The main purpose of this review is to systematically explore the ideas behind current multiple data source mining methods and to consolidate recent research results in this field.

© 2018 Elsevier B.V. All rights reserved.

∗ Corresponding author.
E-mail address: jwt@escience.cn (W. Ji).

https://doi.org/10.1016/j.patrec.2018.01.013
0167-8655/© 2018 Elsevier B.V. All rights reserved.

1. Introduction

The advancement of information and communication technology has generated a large amount of data from different sources, which may be stored in different geographical locations. Each database may have its own structure for storing data. Mining multiple data sources [1–3] distributed across different geographical locations to discover useful patterns is critically important for decision making. In particular, the Internet can be seen as a large, distributed data repository consisting of a variety of data sources and formats, which can provide abundant information and knowledge.

Data from different sources may seem irrelevant to each other. Yet once the information generated from different sources is integrated, new and useful knowledge may emerge. Here is an excellent example of how an organization can mine data from different data sources to obtain insights that cannot be obtained from any individual source.

The Australian Taxation Office (ATO) mines data from different data sources, such as social media posts, private school records and immigration data, to detect tax cheats. Mining data from different data sources has become a sophisticated tool for cracking down on tax cheats, who evaded nearly $10 billion in 2016 [4]. For example, in a normal Australian family, the husband ran a business and reported $80,000 of taxable income per year, putting him just inside the second-lowest tax bracket, while his wife reported earning $60,000 per year. But the data collected from different data sources revealed that the family had three children at private schools at an estimated cost of $75,000 per year, while immigration records and social media posts showed that the family had recently taken five business-class flights and a holiday at a Canadian ski resort, Whistler. Their declared incomes did not match their lifestyle, which prompted the ATO to contact them to confirm whether they had unpaid taxes. From this example, we can see that developing effective data mining techniques for discovering useful information from multiple data sources is crucially important for decision making.

However, efficiently mining quality information from multiple data sources is a challenging task for current research [5–9], especially in the current big data era, because in real-world applications data stored in multiple places often conflict [10]. The conflicts include: (i) data name conflicts: (a) the same object has different names in different data sources, or (b) two different objects from different data sources may have the same name; (ii) data format conflicts: the same object in different data sources has different data types, domains, scales, and precisions; (iii) data value conflicts: the same object in different data sources records different values; (iv) data source conflicts: different data sources have different database structures.

In order to overcome these conflicts, four effective approaches have been adopted to mine useful information and discover new knowledge from multiple data sources: (i) pattern analysis [11–14], which mines useful patterns and information from one or several data sources according to changing conditions, constraints, or relationships; (ii) multiple data source classification, which labels data sources according to a certain standard and then classifies them by their labels; (iii) multiple data source clustering, which clusters data sources according to their similarities/distances; and (iv) multiple data source fusion, which combines data from multiple data sources to achieve higher accuracy and more specific inferences. Based on these four approaches, we can mine useful information from multiple data sources to discover new knowledge according to individual needs.

The main purpose of this review is to systematically overview current multiple data source mining methods and to consolidate recent research results in this field. In this paper, the pattern analysis approaches for multiple data sources are reviewed first, the approaches of multiple data source classification and clustering are reviewed after that, and then mining multiple data sources using data fusion is reviewed. The rest of this paper is organized as follows. Section 2 presents typical methods of multiple data source mining using pattern analysis, while Section 3 is about multiple data source classification and clustering. Section 4 describes methods of multiple data source fusion. Finally, Section 5 provides conclusions and discussion.

2. Key methods for pattern analysis

In this section, we describe different methods of pattern analysis on multiple data sources. Pattern analysis on multiple data sources is a process of mining valuable information or extracting hidden predictive information from different databases. Patterns existing in multiple data sources can be categorized as local patterns and global patterns. A pattern identified from a mono-database can be seen as a local pattern, while a global pattern has to be determined based on all the local patterns obtained from multiple data sources [15–17]. Correspondingly, two mainstream methodologies of pattern analysis have been developed, local pattern analysis and global pattern analysis, which are discussed below.

2.1. Local pattern analysis

Local pattern analysis is a process of identifying patterns from a mono-database, which can efficiently extract knowledge from that database [18]. Based on previous studies [19–25], local pattern analysis methods can be summarized into three categories: association rule mining, sequential pattern mining, and others.

2.1.1. Association rule mining

Association rule mining is a process of discovering the probability of the co-occurrence of items in a database. Association rules express the relationships between co-occurring items and are often used to analyze sales transactions; association rule mining is also known as market basket analysis. Four parameters are used to describe an association rule: support, confidence, expected confidence, and lift.

For example, suppose the transaction database of a supermarket records 1000 customers on a given day. Among them, 100 customers bought both bread and milk, so the support of bread → milk (bread and milk are two itemsets, and bread → milk represents an association rule between bread and milk) is 10%. If 40% of the customers who bought bread also bought milk, then the confidence of bread → milk is 40%. If 200 customers in total bought milk on that day, then the expected confidence of bread → milk is 20%. Thus, the lift of bread → milk is 40%/20% = 2.

While the support measures the importance of an association rule, the confidence measures its accuracy. The support can reflect the representativeness of an association rule: although some association rules have high confidence, their supports are very low, which means these rules have little chance of being practical and are therefore unimportant. The expected confidence of an association rule can be seen as the support of an itemset without the influence of other itemsets, while the lift of an association rule is the ratio of confidence to expected confidence, which describes how one itemset affects another. In general, the lift of a useful association rule should be greater than 1. Only when the confidence of an association rule is greater than its expected confidence does the rule show a correlation between the two itemsets (i.e., one itemset has a promoting effect on the other); otherwise, the association rule is meaningless.

According to the definitions above, any two itemsets in a transaction can have an association rule between them. In practice, users are only interested in association rules that satisfy a certain support and confidence. Therefore, a support-confidence framework is used to mine the correlations between itemsets in the local databases. Two thresholds, the minimum support and the minimum confidence, specify the minimum requirements of a meaningful association rule.

Association rule mining consists of two phases: (i) in the first phase, all the frequent itemsets are identified; (ii) in the second phase, association rules are generated from the frequent itemsets. Identifying the relations between itemsets existing in databases has been extensively studied [26].

A classical association rule mining algorithm, the Apriori algorithm, is presented by Agrawal and Srikant [27] for mining frequent itemsets and relevant association rules. A key concept in the Apriori algorithm is the anti-monotonicity of the support measure, which implies that: (i) all subsets of a frequent itemset must be frequent; and (ii) for any infrequent itemset, all its supersets must be infrequent. The algorithm can be divided into two phases. In the first phase, it applies the minimum support to find all the frequent sets with k items in a database. In the second phase, it uses the self-join rule to find the frequent sets of k + 1 items with the help of the frequent k-itemsets. This process is repeated from k = 1 until the self-join rule can no longer be applied. The Apriori algorithm can find the relationships between items in a database, and it was the first efficient and practical algorithm for association rule mining. However, it needs to generate a large number of candidate itemsets and repeatedly scan the entire database.

A frequent-pattern tree structure called the FP-tree is presented by Han et al. in [28]. It is constructed by: (i) scanning the database to obtain all the frequent items and their respective supports, and sorting the frequent items within itemsets in descending order of their support values; (ii) creating the root node and scanning the database again, selecting the frequent items and inserting them in that sorted order. The FP-tree collects compacted and important information about frequent patterns. Based on the FP-tree, a corresponding method called the frequent-pattern growth algorithm is proposed for mining the complete set of frequent patterns [28]. The method efficiently avoids the candidate generation-and-test process and cuts down the time cost of frequent pattern mining. However, the algorithm may struggle when encountering graph-based patterns or noisy samples.
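The four rule measures from the supermarket example above can be reproduced directly from their definitions. The counts below are the ones given in the text, except the number of bread buyers (250), which is not stated explicitly and is inferred here from the 40% confidence:

```python
def rule_measures(n_customers, n_x, n_y, n_xy):
    """Four measures for the rule X -> Y over one transaction database."""
    support = n_xy / n_customers              # share of customers buying X and Y
    confidence = n_xy / n_x                   # share of X-buyers who also buy Y
    expected_confidence = n_y / n_customers   # share of all customers buying Y
    lift = confidence / expected_confidence   # > 1 means X promotes Y
    return support, confidence, expected_confidence, lift

# 1000 customers; 100 bought both bread and milk; 200 bought milk in total;
# 250 bread buyers (inferred from the 40% confidence, not stated in the text).
s, c, ec, lift = rule_measures(1000, 250, 200, 100)
# -> (0.1, 0.4, 0.2, 2.0): the 10% support, 40% confidence,
#    20% expected confidence and lift of 2 from the worked example.
```

Note that lift only compares confidence against the base rate of the consequent, so it is 1 whenever bread and milk sell independently.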

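The two Apriori phases summarized above (count frequent k-itemsets, self-join them into (k + 1)-candidates, prune by anti-monotonicity, and rescan) can be sketched as follows. The toy transaction list and the 50% threshold are illustrative assumptions, not data from the paper:

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Frequent itemset mining via Apriori: find frequent k-itemsets,
    self-join them into (k+1)-candidates, prune candidates that have an
    infrequent k-subset, and repeat until no candidates survive."""
    n = len(transactions)
    tx = [set(t) for t in transactions]
    items = sorted({i for t in tx for i in t})
    current = {frozenset([i]): sum(1 for t in tx if i in t) for i in items}
    current = {s: c for s, c in current.items() if c / n >= min_support}
    frequent = dict(current)
    k = 1
    while current:
        # Self-join: merge pairs of frequent k-itemsets into (k+1)-candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k + 1}
        # Anti-monotonicity: every k-subset of a candidate must be frequent.
        candidates = [c for c in candidates
                      if all(frozenset(sub) in current for sub in combinations(c, k))]
        counts = {c: sum(1 for t in tx if c <= t) for c in candidates}
        current = {c: cnt for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(current)
        k += 1
    return frequent

# Toy transaction database (not from the paper).
transactions = [["bread", "milk"], ["bread", "butter"],
                ["bread", "milk", "butter"], ["milk"]]
freq = apriori(transactions, min_support=0.5)
# Frequent at 50%: {bread}, {milk}, {butter}, {bread, milk}, {bread, butter}
```

The repeated full scans in `counts` are exactly the cost that FP-growth avoids by compressing the database into an FP-tree.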
Different from traditional association rule mining, and inspired by the paradigm named Ratio Rule (RR), which is used for acquiring quantitative association knowledge, Yan et al. [29] proposed a method that uses a robust and adaptive one-pass procedure to mine ratio rules from each database independently; all the obtained ratio rules are then integrated into a probabilistic model. The method is called Robust and Adaptive Ratio Rule (RARR). The main idea of RARR is to convert the ratio rule problem into the minimization of a non-negative energy function. The greatest advantage of RARR is its robustness in dealing with outliers, which traditional multiple-source ratio rule mining algorithms cannot handle. Besides, it can adaptively mine rules from a dynamic data source.

A least association rule is an association rule involving the least-frequent items, which is very important and critical for finding infrequent or abnormal itemsets. Abdullah et al. [30] proposed a method called Critical Relative Support (CRS) to mine the significant least association rules, i.e., those satisfying a certain predefined minimum critical threshold value. The experimental dataset is drawn from the examination results of computer science students at University Malaysia Terengganu. CRS has been verified to discover the significant least association rules efficiently and can substantially decrease the number of unwanted rules. However, the performance of CRS has only been tested on one educational dataset and should be evaluated on a variety of different real datasets.

In general, association rule mining uses the support-confidence framework to find the rules of users' interest. However, the support-confidence framework cannot reflect the semantic measure among the items. The semantic measure is characterized by utility values, which are typically associated with transaction items. Quantities that cannot be measured directly by a decision maker, such as stock, cost or profit, are called utility. By contrast, the confidence of an association rule only represents how frequently the rule occurs.

High utility itemset mining can be seen as an extension of frequent itemset mining [31], with three requirements: (i) an item may appear more than once; (ii) each item has its own profit; and (iii) the mining goal is to find the items with high profits. High utility itemset mining is more valuable than a single support-confidence framework and permits more accurate financial analysis and decisions.

A new association rule mining method based on a utility-confidence framework is proposed by Sahoo et al. in [31], which is also the first research on mining non-redundant association rules extracted from high utility itemsets. It is a two-step process: (i) find all high utility itemsets, and (ii) extract all valid rules. A compressed representation of the association rules is generated with the help of high utility closed itemsets (HUCI) and their generators. The compressed representation is named the high utility generic basis (HGB); it contains rules with minimal antecedents and maximal consequents. To extract the non-redundant rule set, HUCIs and their generators are mined by the proposed HUCI-Miner algorithm, which identifies HUCIs and associates the high utility generators with the corresponding closed itemsets. Moreover, an inference algorithm is proposed to derive all valid association rules under this framework. Experimental results show that the proposed rule mining algorithms achieve significantly better compactness.

2.1.2. Sequential pattern mining

As indicated in [32], a sequence is an ordered list of itemsets. For example, in customer transactions, an itemset corresponds to a transaction, and a sequence corresponds to a list of transactions sorted by transaction time. The goal of sequential pattern mining is to identify the maximal sequences that satisfy a pre-specified minimum support. Each such maximal sequence can be considered a sequential pattern, a very important pattern for identifying the sequential relationships between different itemsets. There is no doubt that sequential pattern mining remains an active area of data mining [32].

Agrawal and Srikant [32] first introduced the concept of sequential pattern mining and proposed two similar algorithms based on the Apriori algorithm for mining sequential patterns, called AprioriAll and AprioriSome. The only difference between them is that AprioriAll counts all the large sequences, including non-maximal ones, while AprioriSome only counts the maximal sequences. Experimental results show that both AprioriAll and AprioriSome have excellent scale-up properties because they scale linearly with the number of itemsets or transactions. However, during candidate sequence generation, when the number of candidate sequences is too large or a candidate sequence is too long, the memory required by both algorithms becomes very large.

A novel sequential pattern mining method called PrefixSpan (i.e., Prefix-projected Sequential pattern mining) is proposed by Han et al. [33], which aims to reduce the cost of candidate subsequence generation and the size of the projected databases. The method only examines the prefix subsequences rather than the whole sequences; only the postfix subsequences corresponding to these prefixes are projected into the projected databases. In each projected database, the method mines only local frequent patterns to obtain sequential patterns. Moreover, in order to improve mining efficiency, two kinds of database projections, level-by-level projection and bi-level projection, are explored in [33]. Besides, a main-memory-based pseudo-projection technique is developed to save projection cost and speed up computation.

By identifying potential trends in data, approximate sequential patterns can summarize and represent a local database effectively. An alternative local pattern analysis method for sequential pattern mining over the local databases of a multi-database is proposed by Kum et al. [34]. In order to mine approximate sequential patterns, called consensus patterns, a novel algorithm called ApproxMAP (for Approximate Multiple Alignment Pattern mining) is presented. ApproxMAP proceeds in two phases. In the first phase, all sequences are clustered according to their similarity/distance. In the second phase, consensus patterns are mined directly from each cluster. The experimental results show that ApproxMAP can efficiently summarize a local database and reduce the cost of the subsequent global mining.

An efficient sequential pattern mining method based on item co-occurrences is proposed by Fournier-Viger et al. [35] for pruning candidate sequences. It introduces a data structure called the Co-occurrence MAP (CMAP) for storing co-occurrence information. The CMAP is then used to prune candidates in three vertical algorithms: SPADE (Sequential PAttern Discovery using Equivalence classes) [36], SPAM (Sequential PAttern Mining) [37] and ClaSP (Closed Sequential Patterns algorithm) [38]. Extensive experimental results on real-world datasets show that the co-occurrence-based pruning proposed in [35] is effective and that the CMAP can substantially reduce the time spent selecting candidate sequences.

Recently, in order to deal with complex real-world data mining problems, mining frequent patterns with periodic wildcard gaps has gained much research interest. A frequent pattern with periodic wildcard gaps adds gap constraints to normal frequent patterns: the user-specified gap constraints must be satisfied between any two elements of a frequent pattern, while the contents of the gaps can take any value. Usually the gap is a predetermined symbol, referred to collectively as a wildcard. Frequent patterns with periodic wildcard gaps can reflect the common characteristics of items within a certain gap. User-specified patterns are mined directly, without generating a large number of meaningless patterns at the same time.

Previous methods usually use matrices or other linear data structures for mining frequent patterns with periodic wildcard gaps, which consume a large amount of memory and running time. An Incomplete Nettree structure is proposed by Wu et al. [39] to calculate the support counts of a pattern's superpatterns in a one-way scan. Based on it, two novel algorithms, MAPB (Mining sequentiAl Pattern using incomplete Nettree with Breadth-first search) and MAPD (Mining sequentiAl Pattern using incomplete Nettree with Depth-first search), solve the above problem with low memory requirements. Moreover, a heuristic algorithm called MAPBOK (MAPB for top-K) is proposed to handle the top-K frequent patterns of each length [39]. Experimental results on a real-world biological dataset demonstrate the superiority of the proposed algorithms in running time and memory consumption.

2.1.3. Other pattern mining

Yan and Han [40] proposed a method called graph-based Substructure pattern mining (gSpan) for frequent graph-based pattern mining. gSpan first constructs a lexicographic order and then mines frequent substructure patterns without candidate generation by using a depth-first search strategy. gSpan was the first algorithm to apply depth-first search to frequent subgraph mining, and it can also be applied to mining large frequent subgraphs.

For a multi-branch company, making decisions based on a branch's patterns is an important problem. Suganthi and Kamalakannan [41] proposed a new clustering method for exactly synthesizing frequent items and a new method for dealing with exceptional patterns. The first is used to provide detailed information on the associations among different types of items in a market. The second is used to handle a company's transportation costs, manpower and non-profitable items. Experimental results show that the two proposed methods work well when the items in a database are strongly associated.

2.2. Global pattern analysis

Local pattern analysis obtains frequent patterns from local databases. These frequent patterns are then synthesized in global pattern analysis. The key question in global pattern analysis is how to set up the synthesizing models.

In the process of global pattern analysis, in addition to synthesizing the support and confidence of an association rule, the frequency of an association rule is also an important pattern. The frequency of an association rule is the number of data sources that contain this rule. According to a threshold setting, an association rule can be high-frequent, low-frequent, or neither high-frequent nor low-frequent across data sources [45].

Heavy association rules, high-frequent association rules and exceptional association rules are three specific types of association rules for synthesizing global patterns from the local patterns of different data sources [45]. A heavy association rule is an association rule with high support and high confidence in multiple data sources. A high-frequent association rule is one extracted from many data sources. An exceptional association rule is a heavy association rule that is low-frequent. A high-frequent association rule is not necessarily a heavy association rule. Based on these definitions, two algorithms are proposed by Adhikari and Rao [45], which synthesize heavy association rules in multiple databases and indicate whether a heavy association rule is high-frequent or exceptional.

In practice, high-frequent association rules are important for decision making. A high-frequent association rule is supported by most of the databases and has a larger chance of becoming a useful rule in the union of all databases. A weighting model is proposed by Wu and Zhang [19] for synthesizing high-frequent association rules from multiple databases. In order to extract high-frequent association rules, the proposed method constructs a new association rule selection procedure to enhance the weighting model: high-frequent association rules are taken as relevant rules, while low-frequent rules are taken as irrelevant. Moreover, when mining association rules from unknown data sources, a relative synthesizing model is constructed to synthesize the association rules by clustering local association rules. However, the proposed weighting model has a limitation: it must be built on databases of the same type of data source. For example, if a company with several branches has three types of businesses (supermarkets, banks and investment trusts), all data sources need to be classified into three classes according to the business types; the data sources within each class can then be synthesized using the proposed weighting model.

In order to overcome the limitation of the above weighting model, an improved weighting model is proposed by Ramkumar and Srinivasan [43] for synthesizing high-frequency rules. A data source's weight is calculated from its transaction population (i.e., the number of itemsets in that data source). Based on the weights of the data sources and the supports in the local databases, the proposed methods can calculate the global support and global confidence. Experimental results show that the global support and confidence calculated by the improved weighting method are much closer to the mono-mining results (i.e., the results of putting all databases into one big database and mining the patterns).

Global pattern analysis aims to find synthesized rules from local patterns in multiple databases that closely match the mono-mining results. Some models, such as the weighting model and the minimum supporting principle, are based on linear combination. However, when the extracted patterns do not satisfy a minimum support threshold value, they cannot be used in the pattern synthesizing process. In some cases, this requirement may discard useful information/patterns that are meaningful for making the right decision. Thus, an improved probabilistic model based on the maximum entropy method is proposed for the synthesizing process by Khiat et al. [44]. Both the clique method (a grid-based spatial clustering algorithm) and the bucket elimination technique [44] are used to reduce the complexity of the maximum entropy method. The experiments are based on the frequent itemsets discovered by the Apriori algorithm, and the results show the efficiency of the proposed synthesizing method [44].

Zhang et al. [42] propose a method for synthesizing local patterns in multiple databases, which is especially suitable for finding global exceptional patterns. It considers that different data sources should have different minimal supports because they comprise different databases. Therefore, it computes a pattern's importance by measuring the distance between the pattern's support and the corresponding database's minimal support. However, the proposed method does not consider that different data sources may have different effects in the synthesizing process, which may occur in reality [42].

Different from many other approaches, which do not consider the relationships within the data of one database, a nonlinear method is proposed by Zhang et al. [20] to find the nonlinear relations among the data in a database. The proposed method is called Kernel Estimation for Mining Global Patterns (KEMGP). It uses kernel estimation (a mapping method using a Gaussian kernel) to capture nonlinear relation patterns when synthesizing local patterns into global patterns. KEMGP can better characterize the fact that different databases make different contributions to the global pattern.
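A minimal sketch of the transaction-population weighting idea attributed to Ramkumar and Srinivasan [43]: each data source is weighted by its share of the total transaction population, and a rule's global support and confidence are the weighted averages of its local values. The branch counts and local measures below are made-up numbers, and averaging confidence with the same weights is a simplification for illustration:

```python
def synthesize_global(local_stats):
    """Synthesize global support/confidence for one rule from per-source
    (n_transactions, local_support, local_confidence) triples, weighting
    each source by its share of the total transaction population."""
    total = sum(n for n, _, _ in local_stats)
    g_sup = sum(n / total * sup for n, sup, _ in local_stats)
    g_conf = sum(n / total * conf for n, _, conf in local_stats)
    return g_sup, g_conf

# Three branch databases of one company (illustrative numbers only).
branches = [(1000, 0.10, 0.60), (4000, 0.25, 0.50), (5000, 0.02, 0.40)]
g_sup, g_conf = synthesize_global(branches)
# g_sup  = 0.1*0.10 + 0.4*0.25 + 0.5*0.02 = 0.12
# g_conf = 0.1*0.60 + 0.4*0.50 + 0.5*0.40 = 0.46
```

Because the weights are proportional to transaction counts, the synthesized support coincides with mono-mining support whenever each local support is an exact count ratio, which is the sense in which the weighted estimate tracks the mono-mining result.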

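The heavy/high-frequent/exceptional taxonomy of Adhikari and Rao [45], described in Section 2.2, can be sketched as a simple classifier over a rule's per-source statistics. The threshold values below are illustrative assumptions, not values from the paper:

```python
def classify_rule(local_stats, min_sup, min_conf, freq_hi, freq_lo):
    """Label a rule from its per-source (support, confidence) pairs,
    following the definitions in Adhikari and Rao [45]: a heavy rule has
    high support and confidence wherever it appears; its frequency is the
    number of sources reporting it; an exceptional rule is heavy but
    low-frequent. Thresholds here are illustrative only."""
    freq = len(local_stats)
    heavy = all(s >= min_sup and c >= min_conf for s, c in local_stats)
    if heavy and freq <= freq_lo:
        return "exceptional"            # heavy, yet seen in few sources
    if heavy and freq >= freq_hi:
        return "heavy, high-frequent"
    if freq >= freq_hi:
        return "high-frequent"
    return "neither"

# A rule reported by only 2 of 10 branch databases, but very strong there:
label = classify_rule([(0.4, 0.8), (0.5, 0.9)],
                      min_sup=0.3, min_conf=0.7, freq_hi=6, freq_lo=2)
# -> "exceptional"
```

This also makes the paper's remark concrete: a rule reported by many sources ("high-frequent") need not be heavy, since its local supports or confidences may be low everywhere.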
2.3. Others

Besides the above, other techniques such as compressed sensing [46] and sparse learning [47] are also widely used in pattern analysis. Since a signal can be expressed as a Fourier series or a Taylor series, a new input signal can be reconstructed from several known signals.

Compressed sensing theory [46] holds that, as long as an input signal is compressible or sparse in some transform domain, the transformed high-dimensional input signal can be projected onto a low-dimensional space using an observation matrix that is unrelated to the transform basis. The input signal can then be reconstructed, with high probability, by solving an optimization problem over a small number of projections.

The purpose of sparse signal representation is to represent an input signal with as few known signals (i.e., atoms in a given overcomplete dictionary) as possible. A more concise representation of the input signal is obtained, so that the information contained in the input signal can be used in further processing such as compression and encoding. In pattern analysis, sparse representation is used as follows.

In the process of global and local learning, minimizing both the global and local learning cost functions is very helpful for utilizing both the local and global discriminative information of a target domain [48]. Based on this premise, a novel Multi-source Adaptation learning method with Global and Local Regularization (MAGLR) is proposed by Tao et al. [48] for visual classification tasks by exploiting multi-source adaptation regularization joint kernel sparse representation (ARJKSR). ARJKSR uses the target and source domains to reconstruct a dataset by sparse representation. Based on a mixed Laplacian regularization (i.e., global sparsity-preserving Laplacian regularization and local learning Laplacian regularization) semi-supervised learning paradigm, MAGLR can obtain a label prediction matrix for the unlabeled instances of the target domain. The proposed method considers the correlations among different source domains and applies sparse representation to multiple source domains with local and global regularizations simultaneously. However, how to select the optimal parameters in the multi-source domain adaptation setting has not been solved and remains an open problem.

A multiple measurement vector (MMV) model is used to learn the properties of the distributions of the sources in the recovery

In order to analyze the effect of database grouping on multi-database mining, an approach of non-local pattern analysis (NLPA), which combines a database grouping algorithm and PFT, is proposed in [51]. Non-local pattern analysis is more attractive than local pattern analysis when the frequency of data mining is lower, which helps estimate a smaller number of patterns when synthesizing the global pattern. Experimental results show that the disadvantage of non-local pattern analysis is that patterns are reported only from the last few groups, which may worsen the experimental error. This is due to the absence of a pattern that has been assumed to exist but is not reported. Thus, the depth of a pattern is defined to address this drawback: a pattern becomes useless if its depth is low.

3. Key methods for data source classification and clustering

In real-world applications, large amounts of data from different sources are normally stored in different geographical locations. In order to reduce data mining costs, these data can be classified or clustered before data mining. Based on the outcomes of the clustering or classification, the data sources relevant to a given data source mining application can be identified easily. This step is referred to as data source selection.

3.1. Data source classification

Data source classification [52] is a preprocessing step for multiple data source mining, which classifies given data sources by comparing their features and measuring the relevance between different data sources using a classifier [53]. Normally, the performance of the classifier is measured in three aspects: (i) accuracy, (ii) computational complexity, and (iii) the simplicity of the model descriptions.

Wu et al. [53] propose an application-independent multiple data source classification method called BestClassification. The method has two phases. In the first phase, it divides all data sources using the proposed GreedyClass algorithm. The data sources are divided into several smaller sets according to their similarities/distances (the similarity of two databases is measured by the number of common itemsets that both databases have di-
processing and improve the recovery of the sources in [49]. By vided by the total itemsets of both datasets) when a threshold is
using a mixture of Gaussians prior, which is inspired by existing assigned. In the second phase, it finds a best classification for these
Sparse Bayesian Learning approaches, the proposed method is used data sources by using the proposed BestClassification algorithm.
for simultaneous sparse approximation. The proposed method has If there is no such the best classification for these data sources,
been tested on a variety of synthetic, simulated and real data with a trivial best classification is produced. The best classification in
good performance. The proposed method assumes that there is no this paper means that all data sources can be classified properly.
prior knowledge about the sources besides the number of compo- However, since the time complexity of GreedyClass is O(n4 ), when
nents in the Mixture of Gaussians distribution. In some cases, im- n (the number of databases) is very large, the time complexity
proved performance can be achieved by taking further knowledge will become very large. At the same time, BestClassification does
of the prior into account. not always obtain the best classification from the multiple data
In practice, global patterns in multiple databases are synthe- sources.
sized according to the independent characteristics of local pat- An improved data classification algorithm for multiple data
terns. The pipelined feedback technique (PFT) [50] is one of the source classification is proposed by Li et al. [54]. The new algo-
most effective techniques in local pattern analysis (LPA) for this rithm, named as CompleteClass, has less time complexity compar-
task. Before PFT is proposed, the Single Database Mining Technique ing with GreedyClass in [53] since CompleteClass sets a smaller
(SDMT) was used to mine local databases in multiple databases. similarity threshold and decreases the number of data sources in
However, SDMT has a limitation related to sizes of databases: (i) a class. The complexity of CompleteClass is O(n3 ), where n is the
when each of the local databases is small, SDMT can mine the number of databases. BestCompleteClass, an improve version of
union of all databases; (ii) when at least one of the local databases BestClassification, can always achieve the complete classification
is large, SDMT can mine every local database, but fail to mine the (i.e., the best classification) for multiple data sources. Theoretical
union of all local databases; (iii) when at least one of the local analysis and experimental results verify the effectiveness of the al-
databases is very large, SDMT fails to mine every local database gorithm. However, the running time of this method is still long
[50]. In PFT, once a pattern is extracted from a local database, then when dealing with a large amount of data.
it will be extracted from the remaining databases. PFT improves Based on these two classification algorithms, a new multiple
synthesized patterns as well as local patterns analysis significantly. data source classification method is presented by Dingrong et al.
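As an illustrative sketch of the itemset-overlap similarity that these classification algorithms build on (the function names and the greedy threshold grouping below are simplified assumptions, not the published GreedyClass implementation):

```python
def db_similarity(itemsets_a, itemsets_b):
    """Similarity of two databases: shared itemsets divided by all itemsets."""
    a, b = set(itemsets_a), set(itemsets_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

def greedy_group(databases, threshold):
    """Greedily place each database into the first group whose members are all
    at least `threshold`-similar to it; otherwise open a new group."""
    groups = []
    for name, items in databases.items():
        for group in groups:
            if all(db_similarity(items, databases[m]) >= threshold for m in group):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups

# Toy databases represented by their frequent itemsets (hypothetical data)
dbs = {
    "d1": [("bread", "milk"), ("beer",)],
    "d2": [("bread", "milk"), ("tea",)],
    "d3": [("cpu", "ram")],
}
print(greedy_group(dbs, 0.3))  # → [['d1', 'd2'], ['d3']]
```

Here d1 and d2 share one of three distinct itemsets (similarity 1/3), so they fall into one class at threshold 0.3, while d3 is unrelated and forms its own class.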
Based on these two classification algorithms, a new multiple data source classification method is presented by Dingrong et al. [55], which is called application-independent database classification. It can achieve higher intra-class cohesion and lower inter-class coupling compared with BestClassification and BestCompleteClass. In order to search for the best complete classification, there are three phases in this new algorithm. In the first phase, it computes the similarity between each pair of data sources. In the second phase, it chooses several databases from all data sources and classifies them into two classes based on their similarities. In the final phase, it classifies the other databases from all data sources into the classes produced in the second phase. It then inspects whether the classification satisfies the given condition; if it does, the classification ends, otherwise the algorithm is called recursively for further classification. To prove the efficiency of the new algorithm, Dingrong et al. compare the new method with the methods in [54] and [53]. The experimental results show that the new method needs less time to achieve the best complete classification for any kind of multiple data sources.

3.2. Data source clustering

Data source clustering is another data preprocessing step for multiple data source mining. Different from data source classification, it is a kind of unsupervised learning. In other words, data clustering is to cluster data according to the similarity between data without knowing the classes they belong to in advance.

A simple method for multiple data source clustering to achieve high cohesion and low coupling is proposed by Tang and Zhang [56]. Similar to the method proposed in [53], it clusters data sources based on the similarity between each pair of data sources. The proposed method first constructs a multi-objective optimization problem, then uses a hierarchical clustering algorithm, which clusters multiple data sources based on the similarity between data sources. The hierarchical clustering algorithm can avoid instability and reduce time complexity in finding the optimal cluster structure. The multi-objective optimization problem is to find a set of decision variables that satisfy the constraints as well as all objective functions and function values for each target (Pareto optimal solutions), and to provide the decision variables to a decision maker. The best cluster is evaluated by the cophenetic correlation coefficient. Compared with the BestClassification algorithm, the experimental results show that this clustering method has stronger stability, lower time complexity and stronger generalization ability.

For big data, Xin et al. [57] provide two novel algorithms for multiple data source clustering: a multi-source cross-domain classification (MSCC) algorithm, which is based on the existing maximum consistency function model and the Newton gradient descent method, and the MSCC dual coordinate descent method, which is suitable for dealing with large datasets.

The MSCC algorithm is proposed to implement efficient cross-domain classification of target domains using a logistic regression model and a proposed consensus measure. After that, in order to use MSCC to deal with big data in the field of multiple data source mining, a fast version of MSCC, MSCC-CDdual, is derived and theoretically analyzed. MSCC-CDdual is based on the CDdual algorithm (dual coordinate descent method) [58], a recent advance in large-scale logistic regression, and can achieve fast speed, high classification accuracy and good domain adaptation for multiple big data source mining.

In [57], a new coherence function model is also proposed, which makes it easy to evolve the MSCC algorithm into a fast algorithm for large samples. MSCC can reduce the error rate of a classifier trained from a single source domain and improve the accuracy of comprehensive classification. The new fast algorithm MSCC-CDdual shows its operating speed advantage from the perspective of high-dimensional data in [57]. Unlike the existing cross-domain algorithms, MSCC is oriented to multiple data source domain learning rather than single data source domains. Thus, it has higher data mining accuracy and data mining speed. However, for multiple high-dimensional big data sources, the classification and mining efficiency of this method still needs to be improved.

4. Key methods for data source fusion

Multiple data source fusion is a process of combining data from multiple data sources to achieve higher accuracy and more specific inferences. Multiple data source fusion has been used in many areas, including image processing (i.e., multi-sensor data fusion).

Multi-sensor data fusion is an intelligent synthesis process of remote sensing image data from multiple sources that generates accurate, complete and reliable estimations and judgments [59]. The process is to collect the information from different sensors first, and then to eliminate the data redundancy and data conflicts that may exist among the multi-sensor information. Multi-sensor data fusion can improve the timeliness and reliability of remote sensing information. This is very useful for decision making, planning, and verification [60]. The main application areas of data fusion include multiple data source image compositing [61], robot and intelligent instrument systems [62], battlefields and unmanned aircraft [63], image analysis and understanding, target detection and tracking [64], and automatic target recognition [65].

There are three types of data fusion methods for remote sensing images: (i) at the pixel level, (ii) at the feature level, and (iii) at the decision level. These fusion levels are ordered from low to high. The process of multi-sensor data fusion has four steps. The first step collects remote sensing image data from multiple data sources. The second step is to do feature extraction. Pixel-, feature- or decision-level fusion is used in the third step. The last step is to do fusion attribute description. Table 1 lists several multiple source data fusion methods at the different fusion levels.

Table 1
Multiple source data fusion methods in different fusion levels.

Pixel-level                | Feature-level    | Decision-level
Algebraic method           | Bayesian         | Knowledge-based fusion
IHS transform              | Dempster–Shafer  | Dempster–Shafer
High-pass filtering        | Entropy method   | Fuzzy set theory
Regression model           | Weighted average | Reliability theory
Best variable substitution | Neural network   | Bayesian
Kalman filter              | Clustering       | Neural network
Wavelet transform          | Voting           | Logical template

4.1. Pixel-level multi-sensor data fusion

Pixel-level fusion is a low level of data fusion. The advantage of pixel-level fusion is that it retains as much information as possible, and this fusion approach has the highest precision. The main pixel-level fusion methods include Principal Component Transformation (PCT) [66], the IHS transform [68], and the wavelet transform [68].

Principal Component Transformation (PCT), which is also called the Karhunen–Loeve transform (K–L transform), is a multidimensional linear transformation based on the statistical features of an image. It has the effect of variance information enrichment and data compression, and can more accurately reveal the remote sensing information within the multi-band data structure [66]. The process of remote sensing data fusion using PCT has four steps [67]. The first step uses the input multi-band remote sensing data for spatial registration and principal component transformation. The second step uses the results of spatial registration and principal component transformation for histogram matching. The third step uses the results of histogram matching to replace the first principal component of the multi-source remote sensing data in the fusion process. The last step generates a multi-band fused image with high spatial resolution by applying the inverse principal component transformation.
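The component-substitution idea behind these four steps can be sketched as follows on synthetic arrays. This is a simplified illustration, not the pipeline of [67]: spatial registration is assumed done, histogram matching is reduced to a mean/variance adjustment, and `pct_fuse` is a hypothetical helper name.

```python
import numpy as np

def pct_fuse(ms, pan):
    """Fuse a low-resolution multispectral stack `ms` (bands, H, W) with a
    panchromatic image `pan` (H, W) by substituting the first principal
    component of `ms` with a statistically adjusted version of `pan`."""
    bands, h, w = ms.shape
    x = ms.reshape(bands, -1).astype(float)
    mean = x.mean(axis=1, keepdims=True)
    xc = x - mean
    # Principal component transformation of the band vectors
    cov = np.cov(xc)
    vals, vecs = np.linalg.eigh(cov)
    vecs = vecs[:, np.argsort(vals)[::-1]]   # order by descending variance
    pcs = vecs.T @ xc                        # (bands, H*W) principal components
    # Match pan to the first PC's mean/std (stand-in for histogram matching)
    p = pan.reshape(-1).astype(float)
    p = (p - p.mean()) / (p.std() + 1e-12) * pcs[0].std() + pcs[0].mean()
    pcs[0] = p                               # component substitution
    fused = vecs @ pcs + mean                # inverse transformation
    return fused.reshape(bands, h, w)

rng = np.random.default_rng(0)
ms = rng.random((3, 8, 8))
pan = rng.random((8, 8))
print(pct_fuse(ms, pan).shape)  # → (3, 8, 8)
```

Because the eigenvector matrix is orthogonal, the inverse transformation would reproduce the original bands exactly if no component were substituted; substituting only the first (highest-variance) component is what injects the panchromatic spatial detail.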
Due to the physical constraints between spatial and spectral resolutions, spatial enhancement of poor-resolution multispectral (MS) data is desirable. Thus, a high-resolution panchromatic observation needs to be processed before multispectral image fusion. Two general and formal solutions, based on the undecimated discrete wavelet transform and the generalized Laplacian pyramid, respectively, are proposed by Aiazzi et al. [69]. The undecimated discrete wavelet transform is an octave band-pass representation obtained by omitting all decimators and upsampling the wavelet filter bank of a conventional discrete wavelet transform. The generalized Laplacian pyramid is another oversampled structure, obtained by recursively subtracting an expanded decimated lowpass version from an image. The solutions based on these two methods selectively perform spatial-frequency spectrum substitution from one image to another. Different from other multiscale fusion methods, the advantages of the proposed methods include: (i) context dependency is achieved by thresholding the local correlation coefficient between the images to be merged, which avoids injecting spatial details that are unlikely to occur into the target image; (ii) the proposed solutions avoid critical subsampling, which prevents possible impairments in the fused images due to missing cancellation of aliasing terms. The results presented and discussed on the SPOT data [71] show their effectiveness.

In recent years, wavelet decomposition has become an attractive technique for fusing multisensor images. Normally, the input images are decomposed using an orthogonal wavelet to extract features, which are combined through appropriate fusion rules; the fused image is then reconstructed by applying the inverse wavelet transform. A nonorthogonal (or redundant) wavelet is proposed in [70] as an alternative method for feature extraction. The redundant wavelet decomposition is superior to the standard orthogonal wavelet decomposition, regardless of the fusion rules.

After that, a new alternative method for multispectral and panchromatic image fusion based on a high-pass filtering procedure is proposed by González-Audícana et al. [68]. Multi-resolution wavelet decomposition is used to perform detail feature extraction. In this paper, Intensity-Hue-Saturation (IHS) and Principal Component Analysis (PCA) are used to inject the spatial details of a full-color image into a multispectral one. A higher quality image can be obtained due to the selective inclusion of the spatial details of the full-color image that are missing in the multispectral image. The proposed method improves on the quality of the synthetic images produced by standard IHS, PCA and standard wavelet-based fusion methods. For the two fusion methods proposed in [68], better results are obtained when an undecimated algorithm is used to perform the multi-resolution wavelet decomposition.

4.2. Feature-level multi-sensor data fusion

Feature-level fusion is a medium level of data fusion, which includes three steps. The first step extracts features from all remote sensing image data; the extracted features should be a sufficient representation of the original information, or statistically sufficient. In the second step, according to the feature information, all data from different sources are classified, clustered and synthesized to generate eigenvectors. In the third step, feature-level fusion methods are used to fuse these generated eigenvectors and to make a feature description based on the fused eigenvectors.

A feature-level image fusion method based on segmented regions and neural networks is proposed by Wang et al. [72] to process images from CCD (Charge-coupled Device) devices. Firstly, a source image is divided into a set of common areas as a preprocessing of the entire image. After that, the proposed method selects the corresponding segmentation areas from the source images and extracts features which represent the clarity of the corresponding regions. Then the extracted features are fed into a neural network to determine a clear region with which to reconstruct the final fused image. The experimental results show that this new fusion method is superior to previous methods.

A new scalable feature-level sensor fusion architecture is proposed by Kaempchen et al. [73], which combines the data from a multi-layer laser scanner with a monocular video to increase the functionality of driver assistance systems. In order to enhance the precision of object tracking, the proposed method establishes a new geometric object model for recognizing a large diversity of object shapes. Besides, a new free-form object tracking approach is used for precise velocity estimation of trucks on highways. The proposed method is designed to maximize synergy by combining feature-level measurements and keeping the fusion architecture as general as possible. Experimental results in an actual traffic scene show that the proposed method outperforms the other known reference algorithms.

Canonical Correlation Analysis (CCA) is a dimensionality reduction algorithm commonly used in image processing. When extracting various features from an image, each feature can form a linear space, and CCA can be used to analyze the correlation between these spaces. Based on CCA, a new feature extraction method based on feature fusion is proposed by Sun et al. [74]. It is the first time CCA has been applied to feature fusion and image recognition. The proposed method first extracts two groups of feature vectors with the same pattern, then uses the correlation features of the two groups to form effective discriminant vectors for recognition. The proposed method is not only suitable for information fusion, but also eliminates the redundant information within the features, which offers a new way of applying two-group feature fusion to discrimination. Experimental results on the Concordia University CENPARMI database of handwritten Arabic numerals and the Yale face database show that the recognition rate is much higher than with a single feature or with existing fusion algorithms.

4.3. Decision-level multi-sensor data fusion

Decision-level fusion is cognition-based fusion, which is the highest level of data fusion and the highest level of abstraction. Aiming at the specific requirements of the problem at hand, decision-level fusion utilizes several sub-decisions or features to yield a final or higher-level decision.

Decision-level fusion includes two phases. The first phase is to make a feature description for each data source. The second phase fuses the results of the first phase to get the fused feature description for the target or environment. The advantages of decision-level fusion include strong fault tolerance, good openness, short processing time, low data requirements and strong analytical ability. Among the three fusion levels, the computational complexity of decision-level fusion is the smallest. However, due to the higher requirements for feature extraction, the cost of decision-level fusion is higher than that of the other two levels of fusion. In some cases, decision-level fusion has a strong dependence on the other two fusion levels.
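A minimal sketch of this second, fusing phase, combining per-source decisions by reliability-weighted voting (the source names, class scores and weights below are illustrative assumptions, not data from any cited study):

```python
from collections import defaultdict

def fuse_decisions(decisions, weights):
    """Decision-level fusion by weighted voting: each source casts its class
    scores, scaled by the source's reliability weight; the top class wins."""
    totals = defaultdict(float)
    for source, scores in decisions.items():
        for label, score in scores.items():
            totals[label] += weights[source] * score
    return max(totals, key=totals.get)

# Per-source class confidences (hypothetical sub-decisions)
decisions = {
    "sensor_a": {"benign": 0.7, "malignant": 0.3},
    "sensor_b": {"benign": 0.4, "malignant": 0.6},
}
weights = {"sensor_a": 0.8, "sensor_b": 0.2}
print(fuse_decisions(decisions, weights))  # → benign
```

The more reliable source dominates the fused decision (benign: 0.7·0.8 + 0.4·0.2 = 0.64 versus malignant: 0.36), which is the sense in which decision-level fusion trades raw data volume for a small number of weighted sub-decisions.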
In practice, a new method for automatic ovarian tumor classification based on decision-level fusion is proposed by Khazendar et al. [75]. Firstly, the proposed method extracts two different types of features (i.e., histograms and local binary patterns) from ultrasound images of an ovary. Then the ovarian tumor is classified according to the features of an ultrasound image using a Support Vector Machine (SVM). At this stage, the proposed method uses a new decision-level fusion method, which combines SVM-based decision scores into a confidence measure to help the final diagnosis decision making. Experimental results show that such confidence-based prediction outcomes are much closer to reality in the diagnosis of ovarian cancers.

The Facial Action Coding System (FACS) is a taxonomy of human facial expressions designed to facilitate human annotation of facial expressions. The atomic facial muscle actions in this system are called Action Units (AUs). Based on a new region-based face representation and a mid-level region-specific decision layer, a new method for AU detection is proposed in [76]. The proposed method uses domain knowledge (i.e., expert knowledge of how FACS is defined) regarding AU-specific facial muscle contractions to define a set of facial regions covering the entire face. Other approaches, however, typically represent a face as a regular grid based only on the locations on a face (a holistic representation), which may not utilize the full facial appearance. The proposed method then uses an AU-specific weighted sum model as a decision-level fusion layer for region-specific probabilistic information combination. It allows each classifier to learn the typical appearance changes for a specific face part and reduces the dimensionality of the problem. Experimental results based on the DISFA [77] and GEMEP-FERA [78] datasets show the superior performance of both the domain-specific region definition and the decision-level fusion.

More recently, a novel multi-sensor fusion framework using a Coupled Hidden Markov Model (CHMM) is proposed by Kumar et al. [79] for Sign Language Recognition (SLR). This framework calibrates two different sensors (Leap Motion [80] and Kinect [81]) with different depth information and frames per second (fps) to form a multi-sensor data framework for capturing dynamic sign gestures. Leap Motion is kept below a signer's hand to sense the movements of the signer's hand and fingers in the horizontal direction. Kinect is kept in front of the signer to acquire the movements of the signer's hand and fingers in the vertical direction. Experimental results prove that the new CHMM fusion framework is better than single sensor-based SLR camera systems.

5. Conclusion and future work

With the continuous development of data mining technology, research in multiple data source mining is becoming more imperative and important. It has a wide range of applications in fields such as robotics, automation and intelligent system design. There has been, and will continue to be, growing interest in the research community in developing more advanced data mining methods and architectures. This paper critically reviewed many useful methods to mine meaningful information and discover new knowledge from multiple data sources: (i) pattern analysis, (ii) multiple data source classification, (iii) multiple data source clustering, and (iv) multiple data source fusion. There are still several challenges in these four approaches, which need further research.

Acknowledgments

This work is in part supported by the Marsden Fund, New Zealand, the National Natural Science Foundation of China (U1435214, U1609215, 51507084, 61363037), the Natural Science Foundation of Zhejiang Province, China (LY18F020008), the Youth Innovation Research Team of Sichuan Province, China (2015TD0020), CDUT (KYTD201404), the Primary R&D Plan of Jiangsu Province, China (BE2015213), the China Postdoctoral Science Foundation (2016M591890), the National Social Science Foundation of China (17BJY033), and the Chinese Scholarship Council PhD scholarships.

References

[1] X. Zhu, L. Zhang, Z. Huang, A sparse embedding and least variance encoding approach to hashing, IEEE Trans. Image Process. 23 (9) (2014) 3737–3750.
[2] X. Zhu, Z. Huang, H. Cheng, J. Cui, H.T. Shen, Sparse hashing for fast multimedia search, ACM Trans. Inf. Syst. (TOIS) 31 (2) (2013) 9.
[3] A. Adhikari, P. Ramachandrarao, W. Pedrycz, Mining multiple large databases, in: Developing Multi-Database Mining Applications, 2010, pp. 37–50.
[4] http://www.theaustralian.com.au/national-affairs/treasury/australian-taxation-office-turns-to-facebook-to-catch-cheats/news-story/6d4bca2d0223dcb3924c061fa28bc3f9.
[5] S. Zhang, C. Zhang, Q. Yang, Data preparation for data mining, Appl. Artif. Intell. 17 (5–6) (2003) 375–381.
[6] S. Zhang, Z. Qin, C.X. Ling, S. Sheng, "Missing is useful": missing values in cost-sensitive decision trees, IEEE Trans. Knowl. Data Eng. 17 (12) (2005) 1689–1693.
[7] R. Hu, X. Zhu, D. Cheng, W. He, Y. Yan, J. Song, S. Zhang, Graph self-representation method for unsupervised feature selection, Neurocomputing 220 (2017) 130–137.
[8] X. Zhu, H.I. Suk, S.W. Lee, D. Shen, Subspace regularized sparse multitask learning for multiclass neurodegenerative disease identification, IEEE Trans. Biomed. Eng. 63 (3) (2016) 607–618.
[9] X. Zhu, X. Li, S. Zhang, Block-row sparse multiview multilabel learning for image classification, IEEE Trans. Cybern. 46 (2) (2016) 450–461.
[10] M.W. Bright, A.R. Hurson, S.H. Pakzad, A taxonomy and current issues in multidatabase systems, Computer 25 (3) (1992) 50–60.
[11] S. Zhang, X. Li, M. Zong, X. Zhu, R. Wang, Efficient kNN classification with different numbers of nearest neighbours, IEEE Trans. Neural Networks Learn. Syst. (2017), doi:10.1109/TNNLS.2017.2673241.
[12] S. Zhang, X. Li, M. Zong, X. Zhu, D. Cheng, Learning k for knn classification, ACM Trans. Intell. Syst. Technol. (TIST) 8 (3) (2017) 43.
[13] X. Zhu, X. Li, S. Zhang, C. Ju, X. Wu, Robust joint graph sparse coding for unsupervised spectral feature selection, IEEE Trans. Neural Networks Learn. Syst. 26 (6) (2016) 1263–1275.
[14] S. Zhang, Z. Jin, X. Zhu, Missing data imputation by utilizing information within incomplete instances, J. Syst. Softw. 84 (3) (2011) 452–459.
[15] A. Turinsky, R. Grossman, A framework for finding distributed data mining strategies that are intermediate between centralized strategies and in-place strategies, in: Proceedings of Workshop on Distributed and Parallel Knowledge Discovery at KDD-2000, 2000, pp. 1–7.
[16] X. Wu, C. Zhang, S. Zhang, Database classification for multi-database mining, Inf. Syst. 30 (1) (2005) 71–88.
[17] S. Zhang, C. Zhang, X. Wu, Data mining and multi-database mining, in: Knowledge Discovery in Multiple Databases, 2004, pp. 27–61.
[18] S. Zhang, M.J. Zaki, Mining multiple data sources: local pattern analysis, Data Mining Knowl. Discov. 12 (2) (2006) 121–125.
[19] X. Wu, S. Zhang, Synthesizing high-frequency rules from different data sources, IEEE Trans. Knowl. Data Eng. 15 (2) (2003) 353–367.
[20] S. Zhang, X. You, Z. Jin, X. Wu, Mining globally interesting patterns from multiple databases using kernel estimation, Expert Syst. Appl. 36 (8) (2009) 10863–10869.
[21] X. Yan, C. Zhang, S. Zhang, Toward databases mining: preprocessing collected data, Appl. Artif. Intell. 17 (5–6) (2003) 545–561.
[22] S. Zhang, Q. Chen, Q. Yang, Acquiring knowledge from inconsistent data sources through weighting, Data Knowl. Eng. 69 (8) (2010) 779–799.
[23] S. Zhang, M. Zong, K. Sun, Y. Liu, D. Cheng, Efficient kNN algorithm based on graph sparse reconstruction, in: International Conference on Advanced Data Mining and Applications, 2014, pp. 356–369.
[24] X. Zhu, S. Zhang, Z. Jin, Z. Zhang, Z. Xu, Missing value estimation for mixed-attribute data sets, IEEE Trans. Knowl. Data Eng. 23 (1) (2011) 110–121.
[25] S. Zhang, X. Wu, C. Zhang, Multi-database mining, IEEE Comput. Intell. Bull. 2 (1) (2003) 5–13.
[26] Y. Chen, F. Li, J. Fan, Mining association rules in big data with NGEP, Cluster Comput. 18 (2) (2015) 577–585.
[27] R. Agrawal, R. Srikant, Fast algorithms for mining association rules, in: 20th International Conference Very Large Data Bases, VLDB, 1215, 1994, pp. 487–499.
[28] J. Han, J. Pei, Y. Yin, R. Mao, Mining frequent patterns without candidate generation: a frequent-pattern tree approach, Data Mining Knowl. Discov. 8 (1) (2004) 53–87.
[29] J. Yan, N. Liu, Q. Yang, B. Zhang, Q. Cheng, Z. Chen, Mining adaptive ratio rules from distributed data sources, Data Mining Knowl. Discov. 12 (2–3) (2006) 249–273.
[30] Z. Abdullah, T. Herawan, N. Ahmad, M.M. Deris, Mining significant association rules from educational data using critical relative support approach, Proc. Soc. Behav. Sci. 28 (2011) 97–101.
[31] J. Sahoo, A.K. Das, A. Goswami, An efficient approach for mining association rules from high utility itemsets, Expert Syst. Appl. 42 (13) (2015) 5754–5778.
[32] R. Agrawal, R. Srikant, Mining sequential patterns, in: Proceedings of the Eleventh International Conference on Data Engineering, 1995, pp. 3–14.
[33] J. Han, J. Pei, B. Mortazavi-Asl, H. Pinto, Q. Chen, U. Dayal, M.C. Hsu, Prefixspan: mining sequential patterns efficiently by prefix-projected pattern growth, in: Proceedings of the 17th International Conference on Data Engineering, 2001, pp. 215–224.
[34] H.C. Kum, J.H. Chang, W. Wang, Sequential pattern mining in multi-databases via multiple alignment, Data Mining Knowl. Discov. 12 (2–3) (2006) 151–180.
[35] P. Fournier-Viger, A. Gomariz, M. Campos, R. Thomas, Fast vertical mining of sequential patterns using co-occurrence information, in: Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2014, pp. 40–52.
[36] M.J. Zaki, SPADE: an efficient algorithm for mining frequent sequences, Mach. Learn. 42 (1) (2001) 31–60.
[37] J. Ayres, J. Flannick, J. Gehrke, T. Yiu, Sequential pattern mining using a bitmap representation, in: Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 429–435.
[38] A. Gomariz, M. Campos, R. Marin, B. Goethals, ClaSP: an efficient algorithm for mining frequent closed sequences, in: Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2013, pp. 50–61.
[39] Y. Wu, L. Wang, J. Ren, W. Ding, X. Wu, Mining sequential patterns with periodic wildcard gaps, Appl. Intell. 41 (1) (2014) 99–116.
[40] X. Yan, J. Han, gSpan: graph-based substructure pattern mining, in: Proceedings of IEEE International Conference on Data Mining, 2002, pp. 721–724.
[41] R. Suganthi, P. Kamalakannan, Exceptional patterns with clustering items in multiple databases, Indian J. Sci. Technol. 8 (31) (2015) 1–10.
[42] C. Zhang, M. Liu, W. Nie, S. Zhang, Identifying global exceptional patterns in multi-database mining, IEEE Intell. Inf. Bull. 3 (1) (2004) 19–24.
[43] T. Ramkumar, R. Srinivasan, Modified algorithms for synthesizing high-frequency rules from different data sources, Knowl. Inf. Syst. 17 (3) (2008) 313–334.
[44] S. Khiat, H. Belbachir, S.A. Rahal, Probabilistic models for local patterns analysis, University of Sciences and Technology of Oran, 2015.
[45] A. Adhikari, P.R. Rao, Synthesizing heavy association rules from different real data sources, Pattern Recognit. Lett. 29 (1) (2008) 59–71.
[46] D.L. Donoho, Compressed sensing, IEEE Trans. Inf. Theory 52 (4) (2006) 1289–1306.
[47] B. Yang, S. Li, Multifocus image fusion and restoration with sparse representation, IEEE Trans. Instrum. Meas. 59 (4) (2010) 884–892.
[48] J. Tao, S. Wen, W. Hu, Multi-source adaptation learning with global and local regularization by exploiting joint kernel sparse representation, Knowl. Based Syst. 98 (2016) 76–94.
[49] R. Porter, V. Tadic, A. Achim, Sparse Bayesian learning for non-Gaussian sources, Digit. Signal Process. 45 (2015) 2–12.
[50] A. Adhikari, P. Ramachandrarao, B. Prasad, J. Adhikari, Mining multiple large data sources, Int. Arab J. Inf. Technol. 7 (3) (2010) 241–249.
[51] A. Adhikari, L.C. Jain, S. Ramanna, Analysing effect of database grouping on multi-database mining, IEEE Intell. Inf. Bull. 12 (1) (2011) 25–32.
[52] X. Zhu, Q. Xie, Y. Zhu, X. Liu, S. Zhang, Multi-view multi-sparsity kernel reconstruction for multi-class image classification, Neurocomputing 169 (2015) 43–49.
[53] X. Wu, C. Zhang, S. Zhang, Database classification for multi-database mining, Inf. Syst. 30 (1) (2005) 71–88.
[54] H. Li, X. Hu, Y. Zhang, An improved database classification algorithm for multi-database mining, in: International Workshop on Frontiers in Algorithmics, 2009, pp. 346–357.
[55] D. Yuan, H. Fu, Z. Li, H. Wu, An application-independent database classification method based on high cohesion and low coupling, J. Inf. Comput. Sci. 9 (15) (2012) 4337–4344.
[56] H. Tang, M. Zhang, A simple methodology for database clustering, in: Proceedings of Science (CENet), 2015, pp. 1–7.
[57] G. Xin, S. Wang, M. Xu, A new cross-multidomain classification algorithm and its fast version for large datasets, Acta Autom. Sin. 40 (3) (2014) 531–547.
[58] C.J. Hsieh, K.W. Chang, C.J. Lin, S.S. Keerthi, S. Sundararajan, A dual coordinate descent method for large-scale linear SVM, in: Proceedings of the 25th International Conference on Machine Learning, 2008, pp. 408–415.
[59] B. Khaleghi, A. Khamis, F.O. Karray, S.N. Razavi, Multisensor data fusion: a review of the state-of-the-art, Inf. Fusion 14 (1) (2013) 28–44.
[60] R.C. Luo, M.H. Lin, R.S. Scherp, Dynamic multi-sensor data fusion system for intelligent robots, IEEE J. Robot. Autom. 4 (4) (1988) 386–396.
[61] Y. Zhang, S. Prasad, Locality preserving composite kernel feature extraction for multi-source geospatial image analysis, IEEE J. Selected Top. Appl. Earth Observ. Remote Sens. 8 (3) (2015) 1385–1392.
[62] R.C. Luo, M.G. Kay, Multisensor integration and fusion in intelligent systems, IEEE Trans. Syst. Man Cybern. 19 (5) (1989) 901–931.
[63] M.P. Đurišić, Z. Tafa, G. Dimić, V. Milutinović, A survey of military applications of wireless sensor networks, in: 2012 Mediterranean Conference on Embedded Computing (MECO), 2012, pp. 196–199.
[64] Y. Bar-Shalom, P.K. Willett, X. Tian, Tracking and Data Fusion, YBS Publishing, 2011.
[65] J. Llinas, D.L. Hall, An introduction to multi-sensor data fusion, in: Proceedings of the 1998 IEEE International Symposium on Circuits and Systems (ISCAS'98), 6, 1998, pp. 537–540.
[66] A. Dang, X. Wang, X. Chen, ERDAS IMAGINE Remote Sensing Image Processing Method, Tsinghua University Press, 2003, pp. 1–6.
[67] G. Wei, S. Yang, Multisource RS data fusion based on principal component analysis, J. Lanzhou Jiaotong Univ. (Nat. Sci.) 24 (3) (2005) 57–60.
[68] M. González-Audícana, J.L. Saleta, R.G. Catalán, R. García, Fusion of multispectral and panchromatic images using improved IHS and PCA mergers based on wavelet decomposition, IEEE Trans. Geosci. Remote Sens. 42 (6) (2004) 1291–1299.
[69] B. Aiazzi, L. Alparone, S. Baronti, A. Garzelli, Context-driven fusion of high spatial and spectral resolution images based on oversampled multiresolution analysis, IEEE Trans. Geosci. Remote Sens. 40 (10) (2002) 2300–2312.
[70] Y. Chibani, A. Houacine, Redundant versus orthogonal wavelet decomposition for multisensor image fusion, Pattern Recognit. 36 (4) (2003) 879–887.
[71] P. Terretaz, Comparison of different methods to merge SPOT P and XS data: evaluation in an urban area, in: Proceedings of the 17th Symposium of EARSeL, Future Trends in Remote Sensing, Lyngby, Denmark, 1997, pp. 17–20.
[72] R. Wang, F. Bu, H. Jin, L. Li, A feature-level image fusion algorithm based on neural networks, in: Proceedings of the First International Conference on Bioinformatics and Biomedical Engineering, 2007, pp. 821–824.
[73] N. Kaempchen, M. Buehler, K. Dietmayer, Feature-level fusion for free-form object tracking using laserscanner and video, in: Proceedings IEEE Intelligent Vehicles Symposium, 2005, pp. 453–458.
[74] Q.S. Sun, S.G. Zeng, Y. Liu, P.A. Heng, D.S. Xia, A new method of feature fusion and its application in image recognition, Pattern Recognit. 38 (12) (2005) 2437–2448.
[75] S. Khazendar, H. Al-Assam, et al., Automated classification of static ultrasound images of ovarian tumours based on decision level fusion, in: Computer Science and Electronic Engineering Conference (CEEC), 2014, pp. 148–153.
[76] B. Jiang, B. Martinez, M.F. Valstar, M. Pantic, Decision level fusion of domain specific regions for facial action recognition, in: 2014 22nd International Conference on Pattern Recognition (ICPR), 2014, pp. 1776–1781.
[77] S.M. Mavadati, M.H. Mahoor, K. Bartlett, P. Trinh, Automatic detection of non-posed facial action units, in: 2012 19th IEEE International Conference on Image Processing (ICIP), 2012, pp. 1817–1820.
[78] M.F. Valstar, M. Mehu, B. Jiang, M. Pantic, K. Scherer, Meta-analysis of the first facial expression recognition challenge, IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 42 (4) (2012) 966–979.
[79] P. Kumar, H. Gauba, P.P. Roy, D.P. Dogra, Coupled HMM-based multi-sensor data fusion for sign language recognition, Pattern Recognit. Lett. 86 (2017) 1–8.
[80] https://www.leapmotion.com/.
[81] https://developer.microsoft.com/en-us/windows/kinect.