
Assignment 5 (mini project) R(2) C(4) V(2) T(2) Total(10) Sign

5.1 Title: - Mini project on classification: Consider a labeled dataset belonging to an application domain. Apply suitable data preprocessing steps such as handling of null values, data reduction, and discretization. For prediction of class labels of given data instances, build classifier models using different techniques (minimum 3), analyze the confusion matrix and compare these models. Also apply cross-validation while preparing the training and testing datasets.
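
The evaluation part of the task (cross-validation plus a confusion matrix) can be sketched outside RapidMiner in plain Python. This is only an illustrative sketch: the function names, the interleaved fold assignment, and the majority-class baseline are assumptions made for the example, not the operators used in the project.

```python
from collections import Counter

def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k interleaved folds of near-equal size."""
    return [list(range(i, n_rows, k)) for i in range(k)]

def confusion_matrix(actual, predicted):
    """Count how often each (actual label, predicted label) pair occurs."""
    return Counter(zip(actual, predicted))

def accuracy(matrix):
    """Fraction of predictions where actual and predicted labels agree."""
    correct = sum(n for (a, p), n in matrix.items() if a == p)
    total = sum(matrix.values())
    return correct / total if total else 0.0

def cross_validate(rows, labels, train_fn, k=5):
    """Train on k-1 folds, test on the held-out fold, and sum the confusion matrices.

    train_fn(train_rows, train_labels) must return a predict(row) function.
    """
    total = Counter()
    for test_idx in k_fold_indices(len(rows), k):
        held_out = set(test_idx)
        train_rows = [r for i, r in enumerate(rows) if i not in held_out]
        train_labels = [l for i, l in enumerate(labels) if i not in held_out]
        predict = train_fn(train_rows, train_labels)
        actual = [labels[i] for i in test_idx]
        predicted = [predict(rows[i]) for i in test_idx]
        total += confusion_matrix(actual, predicted)
    return total

def majority_class(train_rows, train_labels):
    """Hypothetical baseline: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda row: most_common
```

Any of the three required classifiers could be plugged in as `train_fn`, so the models are compared on the same folds and the same summed confusion matrix.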

5.2 Software Requirements: RapidMiner


5.3 Hardware Requirement: PIV, 2 GB RAM, 500 GB HDD, Lenovo A13-4089 Model.
Theory:-
Project view

Figure 5.1 Project View


5.4 Project Output:-
1) Database output

Figure 5.2 Database

2) Pie Diagram: On Name

Figure 5.3 Pie Diagram

3) Bars: On Sex

Figure 5.4 Chart


Operations performed on the project

1) Simple Distribution (Naive Bayes):- It shows the label distribution. It represents the probability density of the TF-IDF of the words.

Figure 5.5 Simple Distribution Report
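
The label distribution mentioned above can be read as the relative frequency of each class label, which is also what a Naive Bayes model uses as its class priors. A minimal sketch, assuming the labels are available as a plain list; the function name is hypothetical and this is not RapidMiner's implementation.

```python
from collections import Counter

def label_distribution(labels):
    """Return the relative frequency of each class label (the class priors)."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}
```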

2) Decision Tree :- This Operator generates a decision tree model, which can be
used for classification and regression.
Figure 5.6 Decision tree (using 4 attributes)

Figure 5.7 Decision tree using 2 attributes


3) Join:- This Operator joins two Example Sets using one or more Attributes of the
input Example Sets as key attributes.

Figure 5.8 Join
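
The join described above corresponds to an inner join on a key attribute. The sketch below assumes each Example Set is a list of dictionaries and that one shared attribute serves as the key; the function name and the choice that left-hand values win on name clashes are assumptions for illustration.

```python
def inner_join(left, right, key):
    """Inner join of two lists of dicts on a shared key attribute."""
    # Index the right-hand rows by key value for fast lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        # Rows without a matching key on the right are dropped (inner join).
        for match in index.get(row[key], []):
            merged = dict(match)
            merged.update(row)  # left-hand values win on attribute name clashes
            joined.append(merged)
    return joined
```

For example, joining made-up passenger rows with made-up ticket rows on `"id"` keeps only the ids present in both inputs.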

4) Set Role:- This Operator is used to change the role of one or more Attributes.
Figure 5.9 Set Role

5) Discretize:- This operator discretizes the selected numerical attributes into a user-specified number of bins. Bins of equal range are automatically generated; the number of values in different bins may vary.
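
Equal-range binning as described above can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes the attribute has no missing values; it shows why bins of equal width can end up with different numbers of values.

```python
def discretize_equal_width(values, n_bins):
    """Map each numeric value to a bin index 0..n_bins-1 using bins of equal range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    bins = []
    for v in values:
        idx = int((v - lo) / width)
        bins.append(min(idx, n_bins - 1))  # the maximum value belongs to the last bin
    return bins
```

With the illustrative values `[0, 10, 20, 30, 40, 100]` and four bins, each bin spans a range of 25, so three values land in the first bin, two in the second, none in the third, and one in the last.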

6) K-NN:- This Operator generates a k-Nearest Neighbor model, which is used for classification or regression.
Figure 5.10 KNN
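
The idea behind k-NN classification can be illustrated as follows: find the k training rows closest to the query and take a majority vote over their labels. This sketch assumes purely numeric attributes and Euclidean distance, and the function name is hypothetical; RapidMiner's operator offers other distance measures and settings not shown here.

```python
import math
from collections import Counter

def knn_predict(train_rows, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest rows."""
    # Pair every training row's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(row, query), label)
        for row, label in zip(train_rows, train_labels)
    )
    # Vote among the k closest neighbors.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```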

7) Replace:- This operator replaces parts of the values of selected nominal attributes matching a specified regular expression by a specified replacement.
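
A regular-expression replacement on a nominal attribute can be sketched with Python's `re` module. The sketch assumes rows are dictionaries; the function name, the attribute name, and the sample pattern in the usage note are illustrative assumptions, not values taken from the project.

```python
import re

def replace_in_attribute(rows, attribute, pattern, replacement):
    """Return copies of the rows with regex matches in one attribute replaced."""
    regex = re.compile(pattern)
    return [
        {**row, attribute: regex.sub(replacement, row[attribute])}
        for row in rows
    ]
```

For instance, calling it with the pattern `r"Mrs?\.\s*"` and an empty replacement would strip the titles "Mr." and "Mrs." from a name attribute while leaving the original rows unchanged.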

5.5 Conclusion: Using the RapidMiner tool, we performed ETL operations on the Titanic data set, analyzed the data, and generated different results such as a pie chart, statistics, and decision trees.
