
DECISION TREE AND RANDOM FOREST AND CLASSIFYING DATA SETS

[Author Name(s), First M. Last, Omit Titles and Degrees]

[Institutional Affiliation(s)]

Author Note

[Include any grant/funding information and a complete correspondence address.]



Abstract

In this research, we first give a brief background on and introduction to Decision Tree and Random Forest, discuss their advantages and disadvantages, and examine the differences between the two. We then evaluate the classification outcomes of Decision Tree and Random Forest on twenty diverse datasets from the UCI repository [1], ranging from 148 to 20,000 instances each. We contrast the classification results produced by the Random Forest and J48 Decision Tree approaches and discuss the advantages and disadvantages of applying these models to large and small data sets. The classification results demonstrate that the Decision Tree performs well on small data sets, whereas Random Forest performs better on data sets with the same number of attributes but many more instances. The results also show that, as the number of instances increases, the percentage of correctly classified instances increases for Random Forest.

Keywords: Decision Tree, Random Forest, Classification, Bagging



DECISION TREE AND RANDOM FOREST

Background

Decision trees date back to the earliest days of written records. This history exemplifies one of the main advantages of trees: highly interpretable outcomes in a simple, tree-like presentation, which improves comprehension and eases the communication of results. Decision trees, also known as classification trees or regression trees, have their computational roots in models of biological and mental processes. The complementary growth of statistical decision trees and machine learning trees is driven by this common heritage.

Ho (1995) proposed a technique to overcome the problems posed by the complexity of decision tree classifiers built by traditional means. Such classifiers are limited in their ability to grow in complexity without compromising their ability to generalize accurately to new input. The suggested approach makes use of oblique decision trees, which are useful for enhancing training set accuracy. The method's main step is to construct numerous trees in randomly chosen subspaces of the feature space. Because the trees generalize their classifications in complementary ways, their combined classification can be monotonically improved.

In 1997, Amit and Geman introduced a method for shape identification based on the joint induction of shape features and tree classifiers. They concluded that no classifier based on the complete feature set could be evaluated, since it was impossible to know beforehand which characteristics were relevant given the almost unlimited number of features. Standard decision tree construction based on a fixed-length feature vector was not possible because of the quantity and kind of features. An alternative strategy is to create numerous trees while considering only a small random sample of attributes at each node, allowing their complexity to grow with tree depth. Terminal nodes hold estimates of the associated posterior distribution over shape classes. An image can be classified by passing it down each tree and aggregating the outputs.

In another study, Ho (1998) [2] offered a solution to the conflict between overfitting and achieving maximum accuracy. This was accomplished by building a decision-tree-based classifier that maintained maximum accuracy on training data while increasing generalization accuracy as the classifier grew in complexity. The classifier was made up of a number of trees constructed systematically by pseudo-randomly selecting subsets of the feature vector's components, that is, trees built in randomly selected subspaces. When evaluated empirically on publicly accessible data sets, the subspace approach demonstrated its superiority over single-tree classifiers and other forest construction techniques. These ideas were later integrated into the Random Forest ensemble method, which combines existing approaches to create a set of decision trees with carefully controlled variation.

Random Forest is an ensemble learning technique for classification and regression. To create a collection of decision trees with controlled variation, Breiman (2001) [3] developed a technique that combines his bagging sampling methodology (Breiman, 1996a) with the random feature selection introduced independently by Ho (1995), Ho (1998), and Amit and Geman (1997). Each decision tree in the ensemble is created via bagging, using a sample drawn with replacement from the training dataset. Statistically, such a sample is expected to include roughly 64% of the instances at least once. The remaining cases (about 36%) are referred to as out-of-bag instances, whereas the examples in the sample are known as in-bag instances. To identify the class label of an unlabeled instance, each tree in the ensemble serves as a base classifier. Majority voting, which assigns one vote to each classifier's predicted class label, classifies the instance according to the class label that receives the most votes [17]; this is discussed further in the introduction.
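As a minimal illustration of the bagging and majority-voting ideas just described (our own Python sketch, not the authors' experimental code; sample sizes and votes are hypothetical), the following checks the expected in-bag/out-of-bag proportions and applies a majority vote:

import numpy as np

rng = np.random.default_rng(0)
n = 10_000                                  # hypothetical training-set size
sample = rng.integers(0, n, size=n)         # draw n instances with replacement (bagging)
in_bag = np.unique(sample)                  # instances drawn at least once
print(f"in-bag fraction: {len(in_bag) / n:.2%}")          # roughly 63-64%
print(f"out-of-bag fraction: {1 - len(in_bag) / n:.2%}")  # roughly 36-37%

# Majority voting over hypothetical per-tree predictions for one instance:
votes = np.array([1, 0, 1, 1, 0])           # class labels predicted by five trees
predicted_class = np.bincount(votes).argmax()  # class with the most votes wins
print("ensemble prediction:", predicted_class)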

Introduction

The Decision Tree method [4] is used in several domains. Decision trees are employed in a variety of applications, including statistical data comparison, text classification, and text extraction. Libraries can use the Decision Tree method to classify books into several groups according to their type. It may be used in hospitals to diagnose disorders such as tumors, cancer, heart problems, and hepatitis. It is used by businesses, hospitals, schools, colleges, and universities to keep track of their records, and it can also be used for stock market statistics.

Decision Tree algorithms are attractive [5] because they offer classification rules that are understandable to humans. They also have certain flaws, one of which is the need to sort all numerical attributes when the tree decides to split a node. Such sorting becomes expensive in terms of running time and memory space, especially when Decision Trees are fitted to large data sets, i.e., data sets containing many instances. Breiman [3] introduced the concept of random forests in 2001. Random forests outperform existing classifiers such as support vector machines, neural networks, and discriminant analysis, while also addressing the overfitting issue.

Methods that employ an ensemble of different classifiers and use randomization to provide variety, such as bagging or random subspaces [6, 7], have proven particularly effective. They employ randomization throughout the induction phase to provide diversity and create classifiers that differ from one another. Because of their effectiveness in discriminative classification, Random Forests have drawn a lot of interest in machine learning [8].



Lepetit et al. [9, 10] introduced Random Forests to the computer vision community. Their work in this area served as the basis for studies using Random Forests in areas including class recognition [11, 12], bi-layer video segmentation [13], image classification [14], and person identification [15]. The Random Forest also naturally supports a wide range of visual cues, such as color, shape, texture, and depth. Random Forests are therefore regarded as efficient general-purpose vision tools.

According to the definition given in [3], Random Forest is a general scheme of classifier combination that makes use of L tree-structured base classifiers {h(X, Θn), n = 1, 2, ..., L}, where X stands for the input data and {Θn} is a set of independent and identically distributed random vectors. Data from the available data are randomly chosen for each Decision Tree. For instance, a Random Forest may be created by randomly picking a feature subset for each Decision Tree (as in Random Subspaces) or a subset of the training data for each Decision Tree (the concept of Bagging). The features of a Random Forest are chosen at random for each decision split. Picking features at random decreases the correlation across trees, which increases the accuracy of predictions and results in higher efficiency.

In addition to maintaining the benefits of Decision Trees, Random Forest frequently outperforms Decision Trees thanks to its use of random subsets of variables, bagging on samples as previously mentioned, its voting system [17], and its decision-making process. The Random Forest can accommodate missing values and can handle continuous, categorical, and binary data, making it suitable for high-dimensional data modelling. There is no need to prune the trees, because Random Forest is robust enough to handle overfitting issues thanks to the bootstrapping and ensemble approach. Random Forest is effective, understandable, and non-parametric for a variety of dataset types [18], in addition to having excellent prediction accuracy. Compared to other prominent machine learning techniques, Random Forest offers a highly distinctive combination of model interpretability and prediction accuracy. Because ensemble techniques and random sampling are used, accurate predictions and superior generalization are achieved.

Bagging improves generalization by lowering variance and thus the overall generalization error, whereas the boosting strategy accomplishes a reduction in bias [19].

Three key characteristics of Random Forest have drawn attention [17] (a short illustration of variable importance follows this list):

• Reliable prediction results for a range of applications

• A trained model can calculate the relative importance of each feature

• A trained model can determine how close samples are to one another
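The variable-importance property can be illustrated with a small, hypothetical example in scikit-learn (an analogue of, not the WEKA setup used in this paper; the synthetic data and parameter values are our own choices):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with 8 features, only 3 of which are informative.
X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           random_state=0)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# The fitted model exposes a relative importance score per feature.
for i, importance in enumerate(forest.feature_importances_):
    print(f"feature {i}: importance {importance:.3f}")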

The classification performance of the Decision Tree (J48) and the Random Forest for big and small datasets is discussed later in the paper. The goal of this comparison is to provide a baseline that will be helpful for classification situations and will also aid in choosing the right model.

Advantages and Disadvantages of Decision Trees

ADVANTAGES:

1. Preparing the data for decision trees during pre-processing is easier than it is for other algorithms.

2. Data normalization is not necessary for a decision tree.

3. Data scaling is not necessary when using a decision tree.

4. The process of creating a decision tree is not materially affected by missing values in the data.

5. A decision tree model is straightforward enough that technical teams and stakeholders can easily understand it.

DISADVANTAGES:

1. A small change in the data can have a large impact on the structure of the decision tree, making it unstable.

2. Compared to other algorithms, a decision tree's calculations may become far more complicated.

3. It takes more time to train a decision tree model.

4. Because of this intricacy and the length of time required, decision tree training is relatively costly.

5. Regression and the prediction of continuous values cannot be done well with the Decision Tree method.

Advantages and Disadvantages of Random Forest

ADVANTAGES:

1. It provides variable importance, which assists in identifying the variables that have the greatest influence.

2. Overfitting is a common problem with machine learning models; random forest classifiers are much less prone to it.

3. It may be applied as both a classification and a regression model.

4. It handles missing (null) values.

5. When a class in the data is less frequent than other classes, it can automatically balance the data set.

6. The approach is appropriate for challenging tasks since it handles many variables efficiently.

DISADVANTAGES:

1. The biggest drawback of random forest is that it might become too sluggish and inefficient for real-time predictions when there are a lot of trees.

2. Random forest is not a descriptive tool; it is a predictive modelling tool.

Difference between Decision Tree and Random Forest

A key distinction between decision trees and the random forest algorithm is that a decision tree is a graph that shows all potential outcomes of a decision using a branching technique, whereas the random forest method produces a set of decision trees that work together to deliver the output.

Because the random forest approach is so accurate, and because modern computers and systems can handle big, previously unmanageable datasets, machine learning engineers and data scientists frequently employ it in practice.



A drawback of the random forest algorithm is that if your dataset is excessively large and your computer's processing capacity is insufficient, you cannot inspect the final model, and such models can take a long time to build.

A basic decision tree has the advantage of being simple to understand. While building the decision tree, we know which variable, and which value of that variable, is used to split the data, so we can immediately see how the conclusion is reached. The models created by the random forest method, on the other hand, are more complex, since they combine many decision trees. When creating a random forest model, we must decide how many trees to generate and how many variables to consider at each node. In general, adding more trees increases performance and predictive stability while decreasing computation speed. For regression problems, the final answer is the average over all the trees: in a random forest regression model, the prediction of each tree is the mean of the samples in its terminal node, and these tree-level predictions are then averaged across the forest. Unlike linear regression, it cannot estimate values beyond the range observed in the training data, because its predictions are built from previously observed values.

More trees are needed for more precise predictions, which slows the model down. If there were a technique to build several trees and average their responses, you would most likely obtain an answer very close to the correct one; that is precisely what the random forest does. In this section, we examined the distinctions between the decision tree and the random forest algorithms. A decision tree is a graph structure that uses branching to lay out all conceivable outcomes. The random forest method, in contrast, combines decision trees and bases its output on their collective choices. A decision tree's key benefit is that it can adapt quickly to the dataset and that the final model can be viewed and understood step by step.

Mathematics Involved in Decision Tree and Random Forest

Decision Tree

Decision trees use mathematics during the learning process. To begin, we need to identify a tree structure and decision rules for each node using a dataset D = (X, y). Each node divides the dataset into two or more disjoint subsets, each denoted D^(l,i), where l stands for the layer number and i for the subset number. If every label in a subset belongs to the same class, the subset is said to be PURE, the node is labeled as a leaf node, and this branch of the tree terminates. If not, the separation criterion is applied once again.

In some more complicated datasets, reaching a stage in which every leaf node is pure may require extremely deep decision trees that overfit the dataset. Because of this, we often stop before we arrive at pure nodes and instead develop more sophisticated termination strategies in which the nodes may be impure but the error is accounted for and measured.

A tree is referred to as MONOTHETIC if just one attribute is considered at each node and POLYTHETIC if more than one is taken into consideration. Simpler trees are typically preferred since they are easier to read and use. Most programming languages allow you to constrain a decision tree to produce either monothetic or polythetic trees, although it is nearly always better to start with the simpler form and only increase complexity if absolutely necessary. To grow these trees, we choose splits that reduce an impurity-related measure.

Impurity Measures

Entropy impurity, or information impurity, is calculated with the formula below:

Equation 1: Entropy(S) = - Σ_i p_i log2(p_i), where p_i is the proportion of class i at the node.

This equation basically tells us how predictable each node in our tree is. Ultimately, we want our nodes to be predictable, and we do this by making sure a node contains a sizable proportion of a single class.

Equation 2: Other Decision Tree impurity measures: Gini(S) = 1 - Σ_i p_i^2 and Misclassification(S) = 1 - max_i p_i.
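These standard impurity measures can be computed directly from the class proportions at a node. The following is a minimal Python sketch (our own illustration, with hypothetical labels, not code from the study):

import numpy as np

def class_proportions(labels):
    """Return the proportion p_i of each observed class at a node."""
    _, counts = np.unique(labels, return_counts=True)
    return counts / counts.sum()

def entropy_impurity(labels):
    p = class_proportions(labels)
    return -np.sum(p * np.log2(p))

def gini_impurity(labels):
    p = class_proportions(labels)
    return 1.0 - np.sum(p ** 2)

def misclassification_impurity(labels):
    p = class_proportions(labels)
    return 1.0 - p.max()

node = ["yes", "yes", "yes", "no"]        # hypothetical labels at one node
print(entropy_impurity(node))             # about 0.811 bits
print(gini_impurity(node))                # 0.375
print(misclassification_impurity(node))   # 0.25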

Random Forest

Gini Index

The Gini index is a criterion for splitting data; it evaluates the impurity or purity of the data and is used in CART (Classification and Regression Tree) algorithms such as the Decision Tree. It generates a binary split, which is then used by the CART algorithm. The attribute with the lowest Gini index is preferred as the root node. The formula for calculating the Gini index is:

Equation 3: Gini = 1 - Σ_j p_j^2, where p_j is the proportion of class j.

Information Gain

Information gain is calculated from the entropy of the data set and the entropy of each attribute, and it tells us how much information a feature gives us about the class.

Equation 4: Gain(S, A) = Entropy(S) - Σ_v (|S_v| / |S|) Entropy(S_v), where S_v is the subset of S taking value v of attribute A.

Entropy measures how much unpredictability or impurity there is in the provided data. Information gain is used to select the splitting attribute at each node, starting from the root of the decision tree.

Formula to calculate information gain:

Information Gain = Entropy(S) - [(Weighted Avg) × Entropy(each feature)]
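The weighted-average form of this formula can be illustrated with a short Python sketch (our own, with hypothetical data; it reuses the entropy_impurity helper defined in the earlier impurity sketch):

def information_gain(parent_labels, child_label_groups):
    """Entropy(S) minus the weighted average entropy of the child subsets."""
    n = len(parent_labels)
    weighted_child_entropy = sum(
        len(child) / n * entropy_impurity(child) for child in child_label_groups
    )
    return entropy_impurity(parent_labels) - weighted_child_entropy

parent = ["yes", "yes", "yes", "no", "no", "no"]
split = [["yes", "yes", "yes"], ["no", "no", "no"]]   # a perfectly separating split
print(information_gain(parent, split))                # 1.0 bit gained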

Regression Problems

The Random Forest algorithm is also used to solve regression problems, where the mean squared error (MSE) is used to decide how the data branches at each node.

Equation 5: MSE = (1/N) Σ_i (y_i - ŷ_i)^2, where y_i is the observed value and ŷ_i the value predicted at the node.
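As a small illustration of how this criterion scores a regression split (a hypothetical example of ours, not taken from the paper), the reduction in MSE achieved by a candidate split can be computed as follows:

import numpy as np

def mse(y):
    """Mean squared error of a node that predicts the mean of its targets."""
    y = np.asarray(y, dtype=float)
    return np.mean((y - y.mean()) ** 2)

parent = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]   # hypothetical target values at a node
left, right = parent[:3], parent[3:]      # one candidate split
n = len(parent)
weighted_child_mse = len(left) / n * mse(left) + len(right) / n * mse(right)
print(mse(parent) - weighted_child_mse)   # reduction in MSE achieved by the split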

Classification in Decision Tree and Random Forest

A Decision Tree represents a supervised classification strategy [20]. The concept was inspired by the typical tree structure, which consists of a root, nodes (locations where branches divide), branches, and leaves. Similarly, a Decision Tree is built from nodes, drawn as circles, and the segments that link the nodes, which stand for the branches. A decision tree is typically drawn with the root at the top and descends downward. The node from which the tree begins is the root node; the node at which a chain comes to an end is a "leaf" node. Each internal node, i.e., each node that is not a leaf node, can extend two or more branches. A node represents a specific attribute, while the branches represent ranges of values. These value ranges serve as dividing lines between the sets of values of the specified attribute. The tree structure is shown in Figure 1.

Figure 1: Tree Structure

The values of the attributes of the provided data are used to group the data in the Decision Tree. The Decision Tree is created from pre-classified data. The attributes that split the data into the most appropriate classes are chosen for classification, and the data items are divided according to the values of these attributes. This technique is applied recursively to every divided subset of the data items. The procedure finishes as soon as every data item in the current subset belongs to the same class.
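A minimal analogue of this recursive partitioning can be sketched with scikit-learn's DecisionTreeClassifier; this is an illustration under our own choices (the paper itself uses the WEKA J48 implementation described next, and the Iris data here merely stand in for the UCI sets used later):

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
# Entropy-based splitting, grown until leaves are pure or cannot be split further.
tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(tree, max_depth=2))   # root node, internal splits, and leaves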

We employ WEKA's J48 implementation of Decision Trees (open-source software). WEKA lets us examine data and, in addition, implements techniques for regression, data pre-processing, clustering, classification, and visualization. More than sixty algorithms are accessible in WEKA. An overview of a few Decision Tree-based algorithms is provided below.

REPTree

The splitting criterion for the decision/regression tree in REPTree is information gain, and the pruning method is reduced-error pruning. For numeric attributes, it sorts values only once. Like C4.5, it uses the approach of fractional instances to manage missing values. REPTree is a fast Decision Tree learner.

Random Tree

A random tree is drawn from a set of possible trees that use K random attributes at each node. In this context, "at random" means that each tree in the set has an equal probability of being sampled; in other words, the distribution over trees is uniform. Random trees can be generated efficiently, and combining several such random trees typically results in accurate models. There has been substantial study of random trees in the field of machine learning in recent years.

J48

The C4.5 algorithm, created by Ross Quinlan [21], is used to produce Decision Trees. In the WEKA data mining tool, decision trees are generated using J48, an open-source Java implementation of C4.5 [22]. This is a typical Decision Tree algorithm. Decision Tree induction is one of the classification techniques used in data mining: a model is inductively trained from the pre-classified data set by the classification algorithm. Each data item is characterized by the values of its attributes or features, and classification can be thought of as a mapping from a set of features to a particular class.



Random Forests

Leo Breiman created Random Forest [3], a collection of unpruned classification or regression trees built from random samples of the training data. The features chosen throughout the induction procedure are selected at random, and the predictions of the ensemble are combined (majority vote for classification, average for regression). Each tree is grown as follows:

• If the training set contains N cases, N cases are sampled at random with replacement. This sample is the training set for growing the tree.

• A number m of variables is specified such that, for M input variables, m << M; at each node, m variables are selected at random out of the M, and the best split on these m is used to split the node. The value of m is held constant while the forest is grown.

• Each tree is grown to the largest extent possible. No pruning is done.

Generally speaking, Random Forest performs significantly better than single-tree classifiers like C4.5. Its generalization error rate compares favorably to AdaBoost's, and it is more robust to noise.
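The growing procedure above maps naturally onto scikit-learn's RandomForestClassifier; the following is a hedged sketch of that correspondence with our own synthetic data and parameter choices (an analogue of, not a substitute for, the WEKA Random Forest used in the experiments):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,      # number of trees grown on bootstrap samples
    bootstrap=True,        # sample N cases with replacement for each tree
    max_features="sqrt",   # m << M features considered at each node
    max_depth=None,        # grow each tree fully; no pruning
    oob_score=True,        # evaluate on the out-of-bag instances
    random_state=0,
).fit(X, y)

print("out-of-bag accuracy:", forest.oob_score_)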

Classification Performance of the Experiment

The classification performance of the Decision Tree (J48) and the Random Forest for big and small datasets is the main focus of this section. The goal of this comparison is to provide a baseline that will be helpful for categorization situations and will also aid in choosing the right model.

Data Sets

We used the following datasets for classification problems from the UCI Machine Learning repository [1]. In the breast cancer data, some features are linear, whereas a few are nominal. Each dataset's comprehensive description, properties, and source can be found in the UCI repository. The twenty datasets we utilized for our research and comparison are listed in Table 1 along with their names, numbers of instances, and numbers of attributes. The distributions of the data variables in two of the sampled data sets are shown in Figures 2 and 3. Figure 2 displays the Lymphography dataset, which has 148 instances, 19 attributes, and four classes. Figure 3 depicts the Sonar dataset, which has 208 instances, 61 attributes, and two classes.

Table 1: Datasets

Name            Instances   Attributes
Lymph                 148           19
Autos                 205           26
Sonar                 208           61
Heart-h               270           14
Breast cancer         286           10
Heart-c               303           14
Ionosphere            351           35
Colic                 368           23
Colic.org             368           28
Primary tumor         399           18
Balance Scale         625           25
Soybean               683           36
Credit a              690           16
Breast W              699           10
Vehicle               846           19
Vowel                 990           14
Credit g             1000           21
Segment              2310           20
Waveform             5000           41
Letter              20000           17

Figure 2: Lymphography Dataset

Figure 3: Sonar Dataset



The J48 and the Random Forest employ distinct parameter settings and variables:

• binarySplits: whether binary splits are used when constructing the tree.

• confidenceFactor: controls tree pruning; lower values produce greater pruning.

• debug: if set to true, more information is shown on the console.

• seed: the random number seed used to randomize the data when reduced-error pruning is employed.

• unpruned: shows whether or not pruning is applied.

• minNumObj: the minimum number of instances per leaf.

• saveInstanceData: whether to preserve data for visualization.

• numFolds: how much data is used for pruning.

• reducedErrorPruning: whether reduced-error pruning is used in place of C4.5 pruning.

• subtreeRaising: whether sub-tree raising is employed when pruning.

• useLaplace: whether counts at leaves are smoothed based on Laplace.

• maxDepth: the maximum depth of the trees; zero represents unlimited depth.

• numFeatures: the number of features used during random selection.

• numTrees: the number of trees to be created.

• seed: the random number used as the seed value.
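For readers working outside WEKA, some of these options have rough scikit-learn counterparts; the mapping below is our own approximation (scikit-learn has no confidence-factor or reduced-error-pruning option, so the correspondence is only partial):

from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

j48_like = DecisionTreeClassifier(
    criterion="entropy",   # information-gain-style splitting, as in C4.5/J48
    min_samples_leaf=2,    # roughly minNumObj (minimum instances per leaf)
    max_depth=None,        # roughly maxDepth = 0 (unlimited depth)
    random_state=1,        # roughly the seed option
)

rf_like = RandomForestClassifier(
    n_estimators=100,      # roughly numTrees
    max_features="sqrt",   # roughly numFeatures (features tried per split)
    random_state=1,        # roughly the seed option
)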

Results and Discussion

We contrasted the Decision Tree and Random Forest classification outcomes. By employing 10-fold cross-validation, which repeats the procedure ten times using 9/10 of the data for training the algorithm and the remaining data for testing, we were able to prevent the overfitting issue. The instances correctly and incorrectly classified by the Random Forest and Decision Tree (J48) classifiers are presented in Table 2. The name of the related dataset, the number of instances, and the number of attributes are displayed in columns 2, 3, and 4, respectively. The classification results demonstrate that the Decision Tree performs well on small datasets, i.e., those with fewer examples, whereas Random Forest performs better for the same number of attributes on large datasets, i.e., those with more instances. For example, comparing the two breast cancer data sets, as the number of instances rose from 286 to 699, the percentage of instances correctly classified by Random Forest climbed from 69.23 percent to 96.13 percent.
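For readers who wish to reproduce this style of comparison outside WEKA, the 10-fold cross-validation protocol described above can be sketched in scikit-learn as follows (our own illustration on synthetic data, not the paper's code or its UCI results):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

for name, model in [("Decision Tree", DecisionTreeClassifier(random_state=0)),
                    ("Random Forest", RandomForestClassifier(n_estimators=100,
                                                             random_state=0))]:
    scores = cross_val_score(model, X, y, cv=10)   # 9/10 train, 1/10 test, 10 times
    print(f"{name}: {scores.mean():.2%} correctly classified")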


Table 2: Comparison of the Random Forest and the Decision Tree (J48) results.

Serial  Data Set        No. of     No. of      Random Forest            Decision Tree (J48)
No.                     instances  attributes  Correctly   Incorrectly  Correctly   Incorrectly
                                               classified  classified   classified  classified
1       Lymph               148        19        81.08%      18.91%       77.02%      22.97%
2       Autos               205        26        83.41%      16.58%       80.95%      18.04%
3       Sonar               208        61        80.77%      19.23%       71.15%      28.84%
4       Heart-h             270        14        77.89%      22.10%       80.95%      19.04%
5       Breast cancer       286        10        69.23%      30.76%       75.52%      24.47%
6       Heart-c             303        14        81.51%      18.48%       77.56%      22.44%
7       Ionosphere          351        35        92.88%       7.12%       91.45%       8.54%
8       Colic               368        23        86.14%      13.85%       85.32%      14.67%
9       Colic.org           368        28        68.47%      31.52%       66.30%      33.69%
10      Primary tumor       399        18        42.48%      57.52%       39.82%      60.17%
11      Balance Scale       625        25        80.48%      19.52%       76.64%      23.36%
12      Soybean             683        36        91.65%       8.34%       91.50%       8.49%
13      Credit a            690        16        85.07%      14.92%       86.09%      13.91%
14      Breast W            699        10        96.13%       3.68%       94.56%       5.43%
15      Vehicle             846        19        77.06%      22.93%       72.45%      27.54%
16      Vowel               990        14        96.06%       3.03%       81.51%      18.48%
17      Credit g           1000        21        72.50%      27.50%       70.50%      29.50%
18      Segment            2310        20        97.66%       2.33%       96.92%       3.07%
19      Waveform           5000        41        81.94%      18.06%       75.30%      24.70%
20      Letter            20000        17        94.71%       5.29%       87.98%      12.02%



References

[1] Asuncion, A., & Newman, D. (2007). UCI Machine Learning Repository. Retrieved July 26, 2022, from https://archive.ics.uci.edu/ml/index.php

[2] Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 832-844. doi:10.1109/34.709601

[3] Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. doi:10.1023/a:1010933404324

[4] Mitchell, T. M. (1997). Machine learning. New York: McGraw-Hill.

[5] Ben-Haim, Y., & Tom-Tov, E. (2010). A streaming parallel decision tree algorithm.

[6] Breiman, L. (1996). Bagging predictors. Machine Learning, 24(2), 123-140. doi:10.1023/a:1018054314350

[7] Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 832-844. doi:10.1109/34.709601

[8] Amit, Y., & Geman, D. (1997). Shape quantization and recognition with randomized trees. Neural Computation, 9(7), 1545-1588. doi:10.1162/neco.1997.9.7.1545

[9] Lepetit, V., & Fua, P. (2006). Keypoint recognition using randomized trees. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(9), 1465-1479. doi:10.1109/tpami.2006.188

[10] Ozuysal, M., Fua, P., & Lepetit, V. (2007). Fast keypoint recognition in ten lines of code. 2007 IEEE Conference on Computer Vision and Pattern Recognition. doi:10.1109/cvpr.2007.383123

[11] Winn, J., & Criminisi, A. (2006). Object class recognition at a glance. CVPR, video track.

[12] Shotton, J., Johnson, M., & Cipolla, R. (2008). Semantic texton forests for image categorization and segmentation. 2008 IEEE Conference on Computer Vision and Pattern Recognition. doi:10.1109/cvpr.2008.4587503

[13] Yin, P., Criminisi, A., Winn, J., & Essa, I. (2007). Tree-based classifiers for bilayer video segmentation. 2007 IEEE Conference on Computer Vision and Pattern Recognition. doi:10.1109/cvpr.2007.383008

[14] Bosch, A., Zisserman, A., & Munoz, X. (2007). Image classification using random forests and ferns. 2007 IEEE 11th International Conference on Computer Vision. doi:10.1109/iccv.2007.4409066

[15] Apostoloff, N., & Zisserman, A. (2007). Who are you? Real-time person identification. Proceedings of the British Machine Vision Conference 2007. doi:10.5244/c.21.48

[16] Horning, N. (n.d.). Introduction to decision trees and random forests [PDF document]. Retrieved July 28, 2022, from https://fdocuments.net/document/introduction-to-decision-trees-and-random-introduction-to-decision-trees-and-random.html?page=1

[17] Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32. doi:10.1023/a:1010933404324

[18] Random Forest for bioinformatics - Carnegie Mellon University. (n.d.). Retrieved July 28, 2022, from https://www.cs.cmu.edu/~qyj/papersA08/11-rfbook.pdf

[19] Yang, P., Hwa Yang, Y., Zhou, B. B., & Zomaya, A. Y. (2010). A review of ensemble methods in bioinformatics. Current Bioinformatics, 5(4), 296-308. doi:10.2174/157489310794072508

[20] Zhao, Y., & Zhang, Y. (2008). Comparison of decision tree methods for finding active objects. Advances in Space Research, 41(12), 1955-1959. doi:10.1016/j.asr.2007.07.020

[22] C4.5 algorithm. (2022, February 10). In Wikipedia. Retrieved July 28, 2022, from http://en.wikipedia.org/wiki/C4.5_algorithm
