Aima Data Mining


Data Mining System

A typical data-mining system consists of:
--a data-mining engine
--a repository that persists the data-
mining artifacts, such as the models,
created in the process.
The actual data is obtained via a
database connection, or via a file-
system API.
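
As a rough illustration of obtaining data through a database connection, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name `customers` and its columns are made up for the example and do not come from any particular data-mining product.

```python
import sqlite3

# Hypothetical in-memory database standing in for the real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (age INTEGER, income REAL, bought INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(25, 30.0, 0), (50, 90.0, 1)],
)

# The mining engine would read its input rows through the connection.
rows = conn.execute(
    "SELECT age, income, bought FROM customers ORDER BY age"
).fetchall()
conn.close()

print(rows)  # [(25, 30.0, 0), (50, 90.0, 1)]
```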
Building a data-mining model
1. Decide what you want to learn.
2. Select and prepare your data.
3. Choose mining tasks and configure
the mining algorithms.
4. Build your data-mining model.
5. Test and refine the models.
6. Report findings or predict future
outcomes.
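
The six steps above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the toy data set, the helper names `fit_centroids` and `predict`, and the choice of a nearest-centroid classifier are all invented for the example and are not part of the original slides.

```python
# Hypothetical walk through the six steps, standard library only.

# 1. Decide what to learn: will a customer buy ("yes" / "no")?
# 2. Select and prepare data: (age, income in thousands) -> label.
rows = [
    ((25, 30), "no"), ((30, 35), "no"), ((35, 40), "no"), ((28, 32), "no"),
    ((45, 80), "yes"), ((50, 90), "yes"), ((55, 85), "yes"), ((48, 95), "yes"),
]
train = rows[:3] + rows[4:7]   # six rows used to build the model
test = [rows[3], rows[7]]      # two held-out rows used to test it

# 3. Choose the task (classification) and an algorithm (nearest centroid).
def fit_centroids(data):
    """Return {label: mean feature vector} for the training rows."""
    sums, counts = {}, {}
    for features, label in data:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(model, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist(model[label]))

# 4. Build the model.
model = fit_centroids(train)

# 5. Test the model on the held-out rows.
correct = sum(predict(model, f) == label for f, label in test)
accuracy = correct / len(test)

# 6. Report findings and predict a new outcome.
print(f"accuracy on held-out rows: {accuracy:.2f}")
print("prediction for (52, 88):", predict(model, (52, 88)))
```

With real data, step 5 would normally send you back to steps 2 through 4 several times before the results are worth reporting.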
Data Mining Process

Figure 2. Data mining steps.
Using the data model and results
Once you've created a model, you
can test that model, and then even
apply the model to additional data.
Building, testing, and applying the
model to additional data is an
iterative process that, ideally, yields
increasingly accurate models.
Those models can then be saved in
the mining object repository (MOR)
and used either to explain data or
to predict the outcome of new data
in relation to your data-mining
objective.
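
Saving a model and later applying it to new data might look like the sketch below. The `ModelRepository` class is a hypothetical in-memory stand-in for a model repository, and the threshold "model" is a toy; neither reflects the API of a real data-mining system.

```python
import pickle

class ModelRepository:
    """Hypothetical in-memory stand-in for a model repository."""

    def __init__(self):
        self._store = {}

    def save(self, name, model):
        # Serialize so the stored copy is independent of the live object.
        self._store[name] = pickle.dumps(model)

    def load(self, name):
        return pickle.loads(self._store[name])

def build_threshold_model(values, labels):
    """Toy model: midpoint between the means of the two classes."""
    pos = [v for v, y in zip(values, labels) if y]
    neg = [v for v, y in zip(values, labels) if not y]
    return {"threshold": (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2}

def apply_model(model, value):
    """Predict True when the value lies above the learned threshold."""
    return value > model["threshold"]

repo = ModelRepository()
model = build_threshold_model([1, 2, 8, 9], [False, False, True, True])
repo.save("example-v1", model)

# Later: load the saved model and apply it to additional data.
restored = repo.load("example-v1")
print(restored["threshold"])     # 5.0
print(apply_model(restored, 7))  # True
```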
Data Mining
Knowledge-Discovery in Databases
(KDD)
Searching large volumes of data for
patterns.
The nontrivial extraction of implicit,
previously unknown, and potentially
useful information from data.
The science of extracting useful
information from large data sets or
databases.
Uses computational techniques from
statistics, machine learning, and
pattern recognition.
Descriptive Statistics
Collect data
Classify data
Summarize data
Present data
Make inferences to draw
conclusions
--Point and interval estimation
--Hypothesis testing
--Prediction
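
Point and interval estimation can be illustrated with Python's statistics module. The sample values are invented, and the interval uses a normal quantile for simplicity; for a sample this small, a t quantile would be the more appropriate choice.

```python
from statistics import NormalDist, mean, stdev

# Made-up sample of measurements.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0]

# Point estimate of the population mean.
point = mean(sample)

# Approximate 95% interval: point estimate +/- z * standard error.
z = NormalDist().inv_cdf(0.975)
half_width = z * stdev(sample) / len(sample) ** 0.5
interval = (point - half_width, point + half_width)

print(f"point estimate: {point:.3f}")
print(f"approx. 95% interval: ({interval[0]:.3f}, {interval[1]:.3f})")
```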
Machine Learning
Concerned with the development of
techniques which allow computers
to "learn".
Concerned with the algorithmic
complexity of computational
implementations.
Many inference problems turn out
to be NP-hard or harder.
Common Machine Learning
Algorithm Types
Supervised learning: prior
knowledge
Unsupervised learning: statistical
regularity of the patterns
Semi-supervised learning
Reinforcement learning
Transduction
Learning to learn
Pattern Recognition
The act of taking in raw data and taking
an action based on the category of the
data.
Aims to classify data patterns based on
prior knowledge or on statistical information.
Based on availability of a training set:
supervised and unsupervised learning.
Two approaches: statistical (decision
theory) and syntactic (structural).
Supervised Techniques
Classification:
--k-Nearest Neighbors
--Naive Bayes
--Classification Trees
--Discriminant Analysis
--Logistic Regression
--Neural Nets
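
As one concrete instance of the classification techniques listed above, here is a minimal k-nearest-neighbors sketch. The training points, the labels "A" and "B", and the function name `knn_classify` are assumptions made for illustration.

```python
from collections import Counter
import math

def knn_classify(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up labelled training set: (features, class label).
train = [
    ((1.0, 1.0), "A"), ((1.5, 2.0), "A"), ((2.0, 1.5), "A"),
    ((6.0, 6.0), "B"), ((6.5, 7.0), "B"), ((7.0, 6.5), "B"),
]

print(knn_classify(train, (1.8, 1.6)))  # A
print(knn_classify(train, (6.4, 6.2)))  # B
```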
Supervised Techniques
Prediction (Estimation):
--Regression
--Regression Trees
--k-Nearest Neighbors
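
For prediction of a numeric value, simple linear regression by ordinary least squares is the most basic case. The sketch below fits a line to invented data that lies exactly on y = 2x + 1; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]  # made-up data lying exactly on y = 2x + 1

slope, intercept = fit_line(xs, ys)
print(slope, intercept)        # 2.0 1.0
print(slope * 6 + intercept)   # 13.0 (prediction for x = 6)
```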
Unsupervised Techniques
Cluster Analysis
Principal Components
Association Rules
Collaborative Filtering
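
Association rules are usually judged by support and confidence. The sketch below computes both for a tiny, invented set of shopping transactions; the function names are illustrative only.

```python
# Made-up market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"bread", "milk"}))                   # 0.5
print(round(confidence({"bread"}, {"milk"}), 3))    # 0.667
```

Here the rule bread -> milk holds in two of the three transactions that contain bread, so its confidence is about 0.667.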
