Download as pdf or txt
Download as pdf or txt
You are on page 1of 14

 UNDER THE GUIDANCE OF DR.

AMIT PAUL
BY TAMAGHNO CHAUDHURI, CSE, 3RD YEAR

DATA MINING &


CLASSIFIERS
UNDERSTANDING CLASSIFIERS 
THE PROBLEM WE ARE
TRYING TO RESOLVE
THE IDEA IS TO IMPROVE BETTER UNDERSTANDING AND CATEGORICAL SEGMENTATION
OF BIG DATA AND ITS CLASSIFICATION BASED ON THE CONCEPT OF DATA MINING
CLASSIFICATION TECHNIQUE.

USING THE CONCEPT OF CLASSIFIERS , DECISION TREES & MACHINE LEARNING


ALGORITHMS WITH A TOUCH OF DEEP LEARNING TECHNIQUES.

MORE ACCURATE RECOGNITION OF OBJECTS AND IMPROVED IDEA OF OBJECT


RECOGNITION.

Bijou Solutions, Inc. | 2020


WHAT IS DATA SET ?
A data set is a collection of related data, collected from a single source
having several applications.

In the computer and Internet arena, a data set is a group of numbers, or bytes, often
displayed in a table with the columns categorizing the data into subsets.

TYPES OF DATA SETS :

1.Sequential
2.Partitioned
3.Visual Storage Access Method (VSAM)

Data sets are organized according to their quantity and the frequency and method by
which they will be accessed. The format of the individual data sets also depends on
the intended use of the information Bijou Solutions, Inc. | 2020
WHAT
IS
DATA MINING ??
KNOWLEDGE DISCOVERY FROM DATA (KDD)

We live in a world where vast amounts of data are collected daily.


Data mining provides a way of discovering and studying knowledge from that data.

Natural evolution of Information Technology. We're are living in the data age.

Data mining turns a large collection of data into knowledge, based on which a
machine can recognize a given data set based on predefined data analytics.

Alternate terms to Data Mining :


Knowledge mining from data, knowledge extraction, data/pattern analysis, data
archaeology, and data dredging. Bijou Solutions, Inc. | 2020
DATA MINING
FUNCTIONALITIES 
CONCEPTS or CLASS DESCRIPTION

ASSOCIATION

CLUSTERING

CLASSIFICATION

Bijou Solutions, Inc. | 2020


CLASSIFICATION
Classification is a data mining task/functionality of predicting the value of a
categorical variable (target/class).

Working on a data set, the data analysis task is classification, where a model
or classifier is constructed to predict class (categorical) labels based on one
or more numerical and/or categorical variables (predictors, attributes,
features).

Data classification is a two-step process, consisting of a learning step and a


classification step.
THE Where a classification model is constructed based on
our understanding of data.

LEARNING
STEP

Bijou Solutions, Inc. | 2020


THE Where the pre-constructed model from the learning
step is used to predict class labels for given data.
CLASSIFICATION
STEP

Bijou Solutions, Inc. | 2020


THE DECISION A Decision tree is a flowchart like tree structure:
TREE Each internal node denotes a test on an attribute.

Each branch represents an outcome of the test. 

Each leaf node (terminal node) holds a class label

Bijou Solutions, Inc. | 2020


How decision tree works : (a simple diagram)
Decision Tree Algorithm Pseudocode:

Place the best attribute of the data-set at the root of the tree.

Split the training set into subsets.


Subsets should be made in such a way that each subset
contains data with the same value for an attribute.

Repeat step 1 and step 2 on each subset until you find leaf
nodes in all the branches of the tree.

Bijou Solutions, Inc. | 2020


The information table shall be given besides a
well-equipped machine and the primary
objective is to solve / categorize the problem.

There are two stages involved :


1. Training (Object feature value as class level)
2. Testing (based on the given trained data set
instructions)

For example:
We're actually studying a maths example sum
with the formulae in hand and then solving an
exercise problem.
Based on class level , the decision is taken,
the formula for this is derived from the
information table.

TRAINING

How: 
While the decision is taken, the decision tree
is constructed. Every node/level is one
decision.
How will we attend
our objective?
As soon as the training data-set changes,
the model changes, but the algorithm never
changes.

Bijou Solutions, Inc. | 2020


TESTING To check the accuracy of the model, the specifications,
and to get required information about the model

ACCURACY SPECIFICATIONS SENSITIVITY FALSE ERROR


POSITIVE
RATE

THESE ARE CONCLUDED (EXTRACTED) FROM THE CONCLUSION MATRIX.

For instance : Fort wo-class classifiers we use a one-of-a-kind matrix :


analogous to black OR white, positive OR negative.

You might also like