A Survey On Incremental Feature Extraction: D2 Makoto Miwa Chikayama & Taura Lab
Incremental Feature Extraction
D2 Makoto Miwa
Chikayama & Taura Lab.
Table of Contents
Introduction
Dimension Reduction
Feature Extraction
Feature Selection
Incremental Feature Extraction
Discussion
Summary
Introduction
Large and high-dimensional data
Web documents, etc…
A large amount of resources is needed for
Information Retrieval
Classification tasks
Data preservation, etc.
Dimension Reduction
Dimension Reduction
[Figure: scatter plot of samples by Weight (kg), labeled overweight or underweight]
Dimension Reduction
preserves information on classification of overweight and
underweight as much as possible
makes classification easier
reduces data size (2 features → 1 feature)
Dimension Reduction
Feature Extraction (FE)
Generates a new feature
ex. preserves weight / height (a combination of Weight (kg) and Height (cm))
Feature Selection (FS)
Selects existing features
ex. preserves Weight (kg) and drops Height (cm)
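The FE / FS contrast above can be sketched with toy numbers (illustrative NumPy; the samples are made up, not from the survey's datasets):

```python
import numpy as np

# Toy data: columns are [Height (cm), Weight (kg)] -- made-up samples.
X = np.array([[170.0, 80.0],
              [160.0, 45.0],
              [180.0, 95.0],
              [175.0, 55.0]])

# Feature Extraction (FE): generate a NEW feature, weight / height.
extracted = X[:, 1] / X[:, 0]   # one combined feature per sample

# Feature Selection (FS): keep an EXISTING feature, weight only.
selected = X[:, 1]

print(extracted)   # four ratios, one per sample
print(selected)    # [80. 45. 95. 55.]
```

Both reduce two features to one; FE mixes the inputs into a new quantity, while FS merely drops a column.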
Problem Setting
[Table: classes and samples for each of the 4 datasets (2 face image sets & 2 document sets)]
Feature Selection
Selects features according to some criterion
Information Gain (IG)
Chi-square value (CHI)
Orthogonal Centroid Feature Selection (OCFS)
Feature Selection by CHI
Chi-square value
represents the strength of the correlation between classes and a feature
$\chi^2(t) = \sum_i P(c_i)\,\chi^2(t, c_i)$
$P(c_i)$ : prior probability of class $c_i$
Feature Selection by CHI
Select the top-ranked features by chi-square value
The main computational time is spent on the
calculation of
time complexity :
space complexity:
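The selection step above can be sketched as follows, assuming binary (term present / absent) features and the class-prior-weighted chi-square variant; this is an illustrative sketch, and names like `chi2_score` and `select_top_k` are mine, not the survey's:

```python
import numpy as np

def chi2_score(X, y):
    """Class-prior-weighted chi-square score for each binary feature."""
    X = (X > 0).astype(float)             # binarize: feature present / absent
    n, d = X.shape
    scores = np.zeros(d)
    for c in np.unique(y):
        in_c = (y == c)
        prior = in_c.mean()               # P(c): prior probability of class c
        A = X[in_c].sum(axis=0)           # present & in class c
        B = X[~in_c].sum(axis=0)          # present & not in class c
        C = in_c.sum() - A                # absent & in class c
        D = (~in_c).sum() - B             # absent & not in class c
        num = n * (A * D - C * B) ** 2
        den = (A + C) * (B + D) * (A + B) * (C + D)
        scores += prior * np.where(den > 0, num / np.maximum(den, 1), 0.0)
    return scores

def select_top_k(X, y, k):
    """Indices of the k features with the largest chi-square scores."""
    return np.sort(np.argsort(chi2_score(X, y))[::-1][:k])
```

A feature that splits the classes perfectly gets a high score; a feature distributed evenly across classes scores near zero, so ranking by `chi2_score` and keeping the top k implements the selection rule above.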
Feature Selection (Accuracy)
[Chart: classification accuracy; 789,670 documents]
Feature Selection (Time)
[Chart: feature selection time]
FE vs FS
[Table: time and space complexity of PCA, LDA, OC, and CHI]
In most cases, FS has lower complexity than FE
FE vs FS
FE needs matrix computation
High computational and spatial cost
FE finds an optimal solution
FS doesn’t need matrix computation
Fast
FS can treat very high dimensional data
FS finds a nearly optimal solution
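The matrix-computation cost attributed to FE above can be seen in a minimal PCA sketch (illustrative code with random data, not from the survey): building and eigendecomposing a d × d covariance matrix dominates as d grows, while selection merely indexes columns:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 200, 50, 2               # samples, input dims, reduced dims
X = rng.standard_normal((n, d))

# FE (here: plain PCA) needs a d x d matrix and its eigendecomposition --
# the part that becomes prohibitive for very high-dimensional data.
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (n - 1)          # d x d covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)
W = eigvecs[:, -k:]                # top-k eigenvectors = projection matrix
Z = Xc @ W                         # extracted features, shape (n, k)

# FS, by contrast, only indexes columns -- no matrix computation at all.
Z_fs = X[:, :k]                    # pretend columns 0..k-1 were selected
```

The eigendecomposition costs O(d³) time and O(d²) space here, which is exactly why FS scales to dimensionalities where FE does not.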
Table of Contents
Introduction
Dimension Reduction
Feature Extraction
Feature Selection
Incremental Feature Extraction
Summary
Incremental Feature Extraction
Feature extraction cannot process large data all at once
Data are not always available all at once
Some data may arrive later
$\lambda_i$ : Lagrange multipliers
$w_i$ : $i$-th estimated eigenvector
(column of projection matrix $W$)
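These definitions fit the usual constrained trace maximization behind eigenvector-based feature extraction; a generic sketch of that derivation (the scatter matrix $S$ and the exact objective depend on the method):

```latex
\max_{W}\ \operatorname{tr}\!\left(W^{\top} S W\right)
\quad \text{s.t.}\quad w_i^{\top} w_i = 1,
\qquad
\mathcal{L} = \operatorname{tr}\!\left(W^{\top} S W\right)
  - \sum_i \lambda_i \left(w_i^{\top} w_i - 1\right),
\qquad
\frac{\partial \mathcal{L}}{\partial w_i} = 0
\ \Rightarrow\ S\, w_i = \lambda_i w_i .
```

So the Lagrange multipliers $\lambda_i$ are eigenvalues of $S$, and the columns $w_i$ of the projection matrix are the corresponding eigenvectors.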
Incremental Orthogonal Centroid (3) [Yan et al. 2006]
IOC directly updates the eigenvectors
(= projection matrix) incrementally
Fast, although the estimates are not exact in the early stages
time complexity (per update) :
space complexity
OC
time complexity :
space complexity :
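The incremental bookkeeping such methods share can be sketched as an online update of the global mean and per-class centroids, the statistics the Orthogonal Centroid family is built on; this is an illustrative sketch with a hypothetical `OnlineCentroids` class, not the exact IOC eigenvector update of Yan et al.:

```python
import numpy as np

class OnlineCentroids:
    """Maintain the global mean and per-class centroids incrementally,
    one sample at a time, without storing past data."""

    def __init__(self, d):
        self.n = 0
        self.mean = np.zeros(d)
        self.class_n = {}      # samples seen per class
        self.centroid = {}     # running centroid per class

    def update(self, x, label):
        # Running-mean recurrence: m_new = m + (x - m) / n
        self.n += 1
        self.mean += (x - self.mean) / self.n
        cn = self.class_n.get(label, 0) + 1
        c = self.centroid.get(label, np.zeros_like(x))
        self.centroid[label] = c + (x - c) / cn
        self.class_n[label] = cn
```

After any number of updates these equal the batch means, so the projection can be re-estimated at any point in the stream without revisiting old samples.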
In most cases
Incremental OC
[Chart comparing IPCA, IMMC, IOC, CHI, and OCFS]