Implementation of the Data in RapidMiner

Analysis and Discussion

Dataset
The data in this study come from a Kaggle dataset entitled Book Data. The dataset
consists of 460 records, containing book titles, publisher names, and publication years.
Each record is described by these attributes, or features.

Fig 4.1 Book Data

The following is a list of the attributes in this study and their explanations:

1. Book title
2. Publisher name
3. Publisher year (numeric, from 2007 to 2020)

Implementation of the data in RapidMiner

To implement classification in RapidMiner, the first thing to do is to import the data.

The data that must be imported are the training data prepared in Excel and the test data,
which contains all 460 records from the database.

When importing the data, make sure that each table column is assigned the correct
attribute type; if the wrong type is assigned, the final result will also be wrong.
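As an illustration only, the same import step can be sketched outside RapidMiner with Python and pandas; the file names below are assumptions, not the actual files used in this study.

import pandas as pd

# Hypothetical file names; in RapidMiner this step is done with the import wizard.
training_data = pd.read_excel("book_training.xlsx")   # training data prepared in Excel
testing_data = pd.read_csv("book_data.csv")           # full data set, 460 records

# Check that every column was imported with the intended type,
# since a wrongly typed column leads to wrong results later on.
print(training_data.dtypes)
print(len(testing_data))  # expected: 460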

In pattern recognition, information retrieval and classification, precision is the share
of relevant samples among the samples that are retrieved, while recall is the share of the
total number of relevant samples that are actually retrieved. Both precision and recall are
therefore based on an understanding and a measure of relevance; in this study a cross-validation
scheme is used to check the accuracy of the model and to expose results of low reliability.
Fig 4.2 Cross Validation of the Book Data
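Written out with TP, FP and FN denoting true positives, false positives and false negatives, the two measures are:

Precision = TP / (TP + FP)   (share of retrieved samples that are relevant)
Recall    = TP / (TP + FN)   (share of relevant samples that are retrieved)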

There are several attribute types that we will use, namely:

a) Binominal: used for attributes that have only two possible values, such as
YES/NO or 0/1.
b) Polynominal: used for attributes that have more than two possible values.
c) Real: used for numeric attributes that contain decimals.
d) Integer: used for numeric attributes that contain whole numbers only, without decimals.

Fig 4.3 Attributes of the Book Data

For each attribute, the following types are used for both the training data and the testing
data (an equivalent assignment in Python is sketched after this list):
1. Book title: Polynominal
2. Publisher name: Polynominal
3. Publisher year: Integer
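As a rough analogue of this type assignment, a pandas sketch is given below; the column names are assumptions, and pandas categories stand in for RapidMiner's polynominal type.

import pandas as pd

# Assumed column names; adjust to the actual headers in the Kaggle file.
book_data = pd.read_csv("book_data.csv")
book_data = book_data.astype({
    "book_title": "category",      # polynominal
    "publisher_name": "category",  # polynominal
    "publisher_year": "int64",     # integer (2007 to 2020)
})
print(book_data.dtypes)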

At the book data processing stage, the accuracy for each publisher, book title, and year of
publication is shown. Cross validation gives an accurate picture of which book titles are
most in demand and which are least in demand from year to year.

Modeling is a stage that directly involves data mining techniques, namely selecting a data
mining technique and determining how accurate the resulting model is.
Fig 4.4 Performance of the Book Data Cross Validation
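For illustration, the cross-validated performance described above can be approximated outside RapidMiner with scikit-learn. The sketch below is only an analogue of the process behind Fig 4.4: the file and column names are assumptions, and it assumes the publisher name is used as the class label.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_validate

# Assumption: the publisher name is the class label, predicted from the
# (integer-encoded) book title and the publication year.
book_data = pd.read_csv("book_data.csv")
X = pd.DataFrame({
    "book_title": pd.factorize(book_data["book_title"])[0],
    "publisher_year": book_data["publisher_year"],
})
y = book_data["publisher_name"]

# 10-fold cross validation of a decision tree, mirroring a Cross Validation
# operator wrapped around a Decision Tree learner in RapidMiner.
tree = DecisionTreeClassifier(random_state=42)
results = cross_validate(tree, X, y, cv=10,
                         scoring=["accuracy", "precision_macro", "recall_macro"])
print("mean accuracy :", results["test_accuracy"].mean())
print("mean precision:", results["test_precision_macro"].mean())
print("mean recall   :", results["test_recall_macro"].mean())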

The modeling stage applies the decision tree technique to describe each book title based on
the year it was published, the publisher's name, and where the book was published.

Fig 4.5 The Decision Tree Form of the Book Data
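For comparison, a fitted decision tree of this kind can also be inspected in scikit-learn; the sketch below uses the same assumed columns and label as above and prints a textual form of the tree, analogous to the tree shown in Fig 4.5.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

# Same assumed columns and label as in the previous sketch.
book_data = pd.read_csv("book_data.csv")
X = pd.DataFrame({
    "book_title": pd.factorize(book_data["book_title"])[0],
    "publisher_year": book_data["publisher_year"],
})
y = book_data["publisher_name"]

# Fit a depth-limited tree and print its structure.
tree = DecisionTreeClassifier(max_depth=4, random_state=42).fit(X, y)
print(export_text(tree, feature_names=["book_title", "publisher_year"]))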

The decision tree also details the existing book data by describing how many books each
publisher has published, according to the book titles listed.

Fig 4.6 Specification of the Decision Tree
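The same per-publisher count can also be read directly from the data; the short pandas sketch below assumes the same column names as before.

import pandas as pd

# Count how many books each publisher has in the data set.
book_data = pd.read_csv("book_data.csv")
books_per_publisher = book_data.groupby("publisher_name")["book_title"].count()
print(books_per_publisher.sort_values(ascending=False))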


Process implementation using RapidMiner
Open RapidMiner, then in the repository panel create a new repository and add the book data
to it. Select the Retrieve operator and load the book data into it in the RapidMiner design
(process) area, then drag the Decision Tree operator into the process and connect the two
operators. Next, add the Apply Model and Performance operators to the design area, feed the
testing data set into the Apply Model operator, and connect Apply Model to the Performance
operator, as shown in Figure 4.7.
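As a rough equivalent of this Retrieve, Decision Tree, Apply Model and Performance chain, the scikit-learn sketch below trains on the training data, applies the model to the test data, and reports performance; the file names, column names, and the choice of the publisher name as the label are all assumptions.

import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import OrdinalEncoder
from sklearn.metrics import accuracy_score, classification_report

# Hypothetical file and column names, consistent with the earlier sketches.
train = pd.read_excel("book_training.xlsx")
test = pd.read_csv("book_data.csv")        # all 460 records
features = ["book_title", "publisher_year"]
label = "publisher_name"

# Share one encoding between training and test data so equal titles get equal codes.
encoder = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
X_train = encoder.fit_transform(train[features])
X_test = encoder.transform(test[features])

# Decision Tree operator -> fit; Apply Model operator -> predict;
# Performance operator -> accuracy, precision and recall.
tree = DecisionTreeClassifier(random_state=42).fit(X_train, train[label])
predictions = tree.predict(X_test)
print("accuracy:", accuracy_score(test[label], predictions))
print(classification_report(test[label], predictions))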

Fig 4.7 RapidMiner Operators
