
Python scikit-learn decision tree

`

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt

d = {"Outlook" : ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast", "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
     "Temperature" : ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool", "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"],
     "Humidity" : ["High", "High", "High", "High", "Normal", "Normal", "Normal", "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
     "Wind" : ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
     "Played football(yes/no)" : ["No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]}
dataframe = pd.DataFrame(d)
# Encode every categorical column as integers. LabelEncoder assigns codes in
# sorted label order (for Outlook: Overcast=0, Rain=1, Sunny=2).
# Keep one fitted encoder per column so the codes can be mapped back later.
encoders = {}
for column in dataframe.columns:
    encoders[column] = LabelEncoder()
    dataframe[column] = encoders[column].fit_transform(dataframe[column])

x = dataframe[["Outlook", "Temperature", "Humidity", "Wind"]]
y = dataframe["Played football(yes/no)"]

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

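# Use a separate name for the classifier instead of reusing `d` (the data dict).
clf = DecisionTreeClassifier()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)

cf = confusion_matrix(y_test, y_pred)

# In the plot, samples that satisfy "feature <= threshold" follow the left branch.
# Passing a plain list avoids a type check in some scikit-learn versions.
plot_tree(clf, filled=True, feature_names=list(x.columns))
plt.show()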
`

Figure: the decision tree drawn by plot_tree

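In the plot, samples whose Outlook is not less than 0.5 go to the right branch, and the other nodes work the same way. But Outlook is not numeric data, so how am I supposed to work out whether Outlook is less than or greater than 0.5?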

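After the transformation with LabelEncoder, every feature is numeric, so the tree makes its decisions on the encoded numeric value of each feature (including Outlook). LabelEncoder assigns integer codes in sorted label order, so here Overcast = 0, Rain = 1 and Sunny = 2, and the test Outlook <= 0.5 separates Overcast from Rain and Sunny.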
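To check which integer each label received, you can inspect the fitted encoder's `classes_` attribute. The following is a minimal sketch that uses only the Outlook column from the question; the names `encoder` and `mapping` are illustrative and not part of the original code.

```python
from sklearn.preprocessing import LabelEncoder

outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
           "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"]

encoder = LabelEncoder()
codes = encoder.fit_transform(outlook)

# classes_ holds the original labels in sorted order; each label's index is its code.
mapping = {str(label): code for code, label in enumerate(encoder.classes_)}
print(mapping)  # {'Overcast': 0, 'Rain': 1, 'Sunny': 2}
```

With this mapping, a node labelled `Outlook <= 0.5` can be read as "Outlook is Overcast".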

In short, when you use a decision tree from sklearn, all features and all splits are based on numeric values.
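Because the splits are numeric thresholds on the encoded values, a threshold can be translated back into the original categories. The sketch below assumes the column was encoded with `LabelEncoder`; `split_sides` is a hypothetical helper written for this illustration, not a scikit-learn function.

```python
from sklearn.preprocessing import LabelEncoder


def split_sides(encoder, threshold):
    """Return (left, right) label lists for a split of the form `code <= threshold`."""
    left = [str(c) for i, c in enumerate(encoder.classes_) if i <= threshold]
    right = [str(c) for i, c in enumerate(encoder.classes_) if i > threshold]
    return left, right


encoder = LabelEncoder().fit(["Sunny", "Overcast", "Rain"])

# Labels whose code is <= 0.5 go left, the rest go right.
left, right = split_sides(encoder, 0.5)
print(left, right)  # ['Overcast'] ['Rain', 'Sunny']
```

Note that the order Overcast < Rain < Sunny comes only from alphabetical sorting, so a threshold such as 1.5 groups Overcast with Rain for no reason other than their spelling.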

