2011 Data Mining

DATA MINING 2011 IEEE Specifications

1) A Dual Framework and Algorithms for Targeted Online Data Delivery

Roitman, H.; Gal, A.; Raschid, L.

Page(s): 5 - 21

Digital Object Identifier : 10.1109/TKDE.2010.15

Abstract

A variety of emerging online data delivery applications challenge existing techniques for data delivery
to human users, applications, or middleware that are accessing data from multiple autonomous servers.
In this paper, we develop a framework for formalizing and comparing pull-based solutions and present
dual optimization approaches. The first approach, most commonly used nowadays, maximizes user
utility under the strict setting of meeting a priori constraints on the usage of system resources. ...

2) Classification Using Streaming Random Forests

Abdulsalam, Hanady; Skillicorn, David B.; Martin, Patrick

Page(s): 22 - 36

Abstract

We consider the problem of data stream classification, where the data arrive in a conceptually infinite
stream, and the opportunity to examine each record is brief. We introduce a stream classification
algorithm that is online, running in amortized O(1) time, able to handle intermittent arrival of
labeled records, and able to adjust its parameters to respond to changing class boundaries (“concept
drift”) in the data stream. In addition, when blocks of labeled data are short...

3) Coupling Logical Analysis of Data and Shadow Clustering for Partially Defined Positive Boolean
Function Reconstruction

Muselli, Marco; Ferrari, Enrico

Page(s): 37 - 50

Digital Object Identifier : 10.1109/TKDE.2009.206

Abstract

The problem of reconstructing the and-or expression of a partially defined positive Boolean function
(pdpBf) is solved by adopting a novel algorithm, denoted by LSC, which combines the advantages of two
efficient techniques, Logical Analysis of Data (LAD) and Shadow Clustering (SC). The kernel of the
approach followed by LAD consists in a breadth-first enumeration of all the prime implicants whose
degree is not greater than a fixed maximum d. In contrast, SC adopts an effective heuristic procedu...

4) Data Leakage Detection


Papadimitriou, P.; Garcia-Molina, H.

Page(s): 51 - 63

Digital Object Identifier : 10.1109/TKDE.2010.100

Abstract

We study the following problem: A data distributor has given sensitive data to a set of supposedly
trusted agents (third parties). Some of the data are leaked and found in an unauthorized place (e.g., on
the web or somebody's laptop). The distributor must assess the likelihood that the leaked data came
from one or more agents, as opposed to having been independently gathered by other means. We
propose data allocation strategies (across the agents) that improve the probability of identifying leak...

5) Decision Trees for Uncertain Data

Tsang, Smith; Kao, Ben; Yip, Kevin Y.; Ho, Wai-Shing; Lee, Sau Dan

Page(s): 64 - 78

Digital Object Identifier : 10.1109/TKDE.2009.175

Abstract

Traditional decision tree classifiers work with data whose values are known and precise. We extend
such classifiers to handle data with uncertain information. Value uncertainty arises in many applications
during the data collection process. Example sources of uncertainty include measurement/quantization
errors, data staleness, and multiple repeated measurements. With uncertainty, the value of a data item
is often represented not by one single value, but by multiple values forming a probability d...
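
As a rough illustration of the idea in this abstract, the sketch below assumes an uncertain attribute value is stored as a handful of (value, probability) samples. It contrasts collapsing the samples to a single expected value with letting the item contribute fractional weight to both sides of a split threshold. The names `expected_value`, `split_weights`, and `reading` are hypothetical, and this is not claimed to be the authors' algorithm.

```python
from typing import List, Tuple

# Hypothetical representation: an uncertain value as (value, probability) pairs
# whose probabilities sum to 1.
Samples = List[Tuple[float, float]]


def expected_value(samples: Samples) -> float:
    """Collapse the uncertain value to a single number (loses the spread)."""
    return sum(value * prob for value, prob in samples)


def split_weights(samples: Samples, threshold: float) -> Tuple[float, float]:
    """Return the probability mass falling on each side of `value <= threshold`."""
    left = sum(prob for value, prob in samples if value <= threshold)
    return left, 1.0 - left


# Illustrative data only: three repeated measurements of one attribute.
reading: Samples = [(9.0, 0.25), (10.0, 0.5), (12.0, 0.25)]

print(expected_value(reading))        # single-value summary
print(split_weights(reading, 10.0))   # fractional membership at threshold 10.0
```

With the expected value alone, the item would fall entirely on one side of the threshold; keeping the samples lets it count partly on each side.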

6) Efficient Periodicity Mining in Time Series Databases Using Suffix Trees

Rasheed, F.; Alshalalfa, M.; Alhajj, R.

Page(s): 79 - 94

Digital Object Identifier : 10.1109/TKDE.2010.76

Abstract


Periodic pattern mining or periodicity detection has a number of applications, such as prediction,
forecasting, detection of unusual activities, etc. The problem is not trivial because the data to be
analyzed are mostly noisy and different periodicity types (namely symbol, sequence, and segment) are
to be investigated. Accordingly, we argue that there is a need for a comprehensive approach capable of
analyzing the whole time series or in a subsection of it to effectively handle different types o...
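
To make "symbol periodicity" concrete, here is a minimal sketch that scores how often a symbol recurs at a fixed period in a discretized series. It is a naive linear scan, not the suffix-tree method the paper's title refers to; the function name, its parameters, and the sample series are assumptions made for illustration.

```python
def symbol_periodicity_confidence(series: str, symbol: str,
                                  period: int, start: int = 0) -> float:
    """Fraction of positions start, start+period, ... that hold `symbol`."""
    if period <= 0:
        raise ValueError("period must be positive")
    positions = range(start, len(series), period)
    if not positions:  # start lies beyond the end of the series
        return 0.0
    hits = sum(1 for i in positions if series[i] == symbol)
    return hits / len(positions)


# Illustrative series: 'a' recurs every 3 symbols; 'c' does too, with one noisy slot.
series = "abcabdabcabc"
print(symbol_periodicity_confidence(series, "a", period=3, start=0))
print(symbol_periodicity_confidence(series, "c", period=3, start=2))
```

A confidence below 1.0 reflects the noise the abstract mentions: the pattern is mostly periodic but not present at every expected position.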

7) Exploring Application-Level Semantics for Data Compression

Hsiao-Ping Tsai; De-Nian Yang; Ming-Syan Chen

Page(s): 95 - 109

Digital Object Identifier : 10.1109/TKDE.2010.30

Abstract

Natural phenomena show that many creatures form large social groups and move in regular patterns.
However, previous works focus on finding the movement patterns of each single object or all objects. In
this paper, we first propose an efficient distributed mining algorithm to jointly identify a group of
moving objects and discover their movement patterns in wireless sensor networks. Afterward, we
propose a compression algorithm, called 2P2D, which exploits the obtained group movement patterns
to ...

8) Missing Value Estimation for Mixed-Attribute Data Sets

Zhu, Xiaofeng; Zhang, Shichao; Jin, Zhi; Zhang, Zili; Xu, Zhuoming

Page(s): 110 - 121

Digital Object Identifier : 10.1109/TKDE.2010.99

Abstract

Missing data imputation is a key issue in learning from incomplete data. Various techniques have been
developed with great successes on dealing with missing values in data sets with homogeneous attributes
(their independent attributes are all either continuous or discrete). This paper studies a new setting of
missing data imputation, i.e., imputing missing data in data sets with heterogeneous attributes (their
independent attributes are of different types), referred to as imputing mixed-attribut...

9) Privacy-Preserving OLAP: An Information-Theoretic Approach


Nan Zhang; Wei Zhao

Page(s): 122 - 138

Digital Object Identifier : 10.1109/TKDE.2010.25

Abstract

We address issues related to the protection of private information in Online Analytical Processing
(OLAP) systems, where a major privacy concern is the adversarial inference of private information
from OLAP query answers. Most previous work on privacy-preserving OLAP focuses on a single
aggregate function and/or addresses only exact disclosure, which eliminates from consideration an
important class of privacy breaches where partial information, but not exact values, of private data is
disclosed ...
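
The distinction between exact and partial disclosure can be illustrated with a small sketch. It assumes, hypothetically, that an adversary learns the sum of several private values and knows each value lies in a public range; the adversary cannot recover any single value exactly but can still narrow its range. The function `infer_bounds` and the numbers are illustrative and do not reproduce the paper's information-theoretic measure.

```python
from typing import Tuple


def infer_bounds(total: int, n_unknown: int, lo: int, hi: int) -> Tuple[int, int]:
    """Bounds on one unknown value, given that `n_unknown` values in [lo, hi]
    sum to `total`."""
    others_min = (n_unknown - 1) * lo   # smallest the other values can sum to
    others_max = (n_unknown - 1) * hi   # largest the other values can sum to
    return max(lo, total - others_max), min(hi, total - others_min)


# Illustrative scenario: a SUM query over three cells returns 250, every cell is
# known to lie in [0, 100], and the adversary already knows one cell is 90.
remaining_total = 250 - 90
print(infer_bounds(remaining_total, n_unknown=2, lo=0, hi=100))
```

Before the query, each unknown cell could be anywhere in the public range; afterwards the range is narrower, even though no exact value is revealed.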