ArbolesRF UNI


Untitled

July 3, 2020

We import the libraries


[1]: import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

We import the data
[2]: df = pd.read_csv('kyphosis.csv')

[3]: df.head()

[3]:   Kyphosis  Age  Number  Start
     0   absent   71       3      5
     1   absent  158       3     14
     2  present  128       4      5
     3   absent    2       5      1
     4   absent    1       4     15

We will just look at a simple pairplot for this small dataset.
[4]: sns.pairplot(df,hue='Kyphosis',palette='Set1')

[4]: <seaborn.axisgrid.PairGrid at 0x1850068cd48>

Let's split the data into a training set and a test set!
[5]: from sklearn.model_selection import train_test_split

[6]: X = df.drop('Kyphosis',axis=1)
y = df['Kyphosis']

[7]: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30)
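Note that no `random_state` is passed above, so each run produces a different split. A minimal sketch on hypothetical stand-in data (not the kyphosis set) showing how fixing the seed makes the split reproducible:

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the kyphosis features
X = pd.DataFrame({'Age': np.arange(10), 'Number': np.arange(10), 'Start': np.arange(10)})
y = pd.Series(['absent', 'present'] * 5)

# With a fixed random_state, two calls return identical splits
a = train_test_split(X, y, test_size=0.30, random_state=42)
b = train_test_split(X, y, test_size=0.30, random_state=42)
assert a[0].equals(b[0])  # same training features both times
```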

0.1 Decision Trees


We'll start by training a single decision tree.
[9]: from sklearn.tree import DecisionTreeClassifier

[14]: dtree = DecisionTreeClassifier(class_weight=None, criterion='gini',
                                     max_depth=None, max_features=None,
                                     max_leaf_nodes=None, min_samples_leaf=1,
                                     min_samples_split=2,
                                     min_weight_fraction_leaf=0.0,
                                     random_state=None, splitter='best')

[15]: dtree.fit(X_train,y_train)

[15]: DecisionTreeClassifier()
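Once fitted, the tree's learned splits can be inspected as text. A minimal sketch using `sklearn.tree.export_text` (available in scikit-learn >= 0.21) on the built-in iris data as a stand-in, since `kyphosis.csv` may not be at hand:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
# Shallow tree so the printed rules stay readable
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)
print(export_text(tree, feature_names=list(iris.feature_names)))
```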

0.2 Prediction and Evaluation


Let's evaluate our decision tree.
[16]: predictions = dtree.predict(X_test)

[17]: from sklearn.metrics import classification_report,confusion_matrix

[18]: print(confusion_matrix(y_test,predictions))

[[17 3]
[ 3 2]]
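The notebook imports `classification_report` but never prints it; it summarizes precision, recall and F1 per class. As a sketch, since the exact `y_test`/`predictions` arrays are not reproduced here, label vectors are rebuilt to match the confusion-matrix counts above ([[17 3] [3 2]]):

```python
from sklearn.metrics import classification_report

# Rebuild labels matching the counts above: 20 true 'absent', 5 true 'present'
y_true = ['absent'] * 20 + ['present'] * 5
# absent-true: 17 correct, 3 wrong; present-true: 3 wrong, 2 correct
y_pred = ['absent'] * 17 + ['present'] * 3 + ['absent'] * 3 + ['present'] * 2
print(classification_report(y_true, y_pred))
```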

0.3 Random Forests


Now let's compare the decision tree model with a random forest.
[37]: from sklearn.ensemble import RandomForestClassifier
rfc = RandomForestClassifier(bootstrap=True, class_weight=None,
                             criterion='gini', max_depth=None,
                             max_features='auto', max_leaf_nodes=None,
                             min_samples_leaf=1, min_samples_split=2,
                             min_weight_fraction_leaf=0.0, n_estimators=130,
                             n_jobs=1, oob_score=False, random_state=None,
                             verbose=0, warm_start=False)

[38]: rfc.fit(X_train, y_train)

[38]: RandomForestClassifier(n_estimators=130, n_jobs=1)

[39]: rfc_pred = rfc.predict(X_test)

[41]: print(confusion_matrix(y_test,rfc_pred))

[[20 0]
[ 5 0]]
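A fitted random forest also exposes per-feature importances through `feature_importances_`, which is a useful next step after comparing confusion matrices. A sketch on synthetic stand-in data (in the notebook itself you would inspect `rfc.feature_importances_` against `X.columns`):

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in with three features, mirroring Age / Number / Start
Xs, ys = make_classification(n_samples=200, n_features=3, n_informative=2,
                             n_redundant=0, random_state=0)
rf = RandomForestClassifier(n_estimators=130, random_state=0).fit(Xs, ys)

# Importances sum to 1; sort to see the most influential feature first
importances = pd.Series(rf.feature_importances_,
                        index=['Age', 'Number', 'Start']).sort_values(ascending=False)
print(importances)
```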
