Python scikit-learn decision tree

Question

`

import pandas as pd
import numpy as np
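# Import only the names that are used instead of wildcard imports
from sklearn.tree import DecisionTreeClassifier, plot_tree
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from sklearn.preprocessing import LabelEncoder
import matplotlib.pyplot as plt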

d = {"Outlook" : ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast", "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
     "Temperature" : ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool", "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"],
     "Humidity" : ["High", "High", "High", "High", "Normal", "Normal", "Normal", "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
     "Wind" : ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
     "Played football(yes/no)" : ["No", "No", "Yes", "Yes", "Yes", "No", "Yes", "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]}
dataframe = pd.DataFrame(d)
lb = LabelEncoder()
dataframe["Outlook"] = lb.fit_transform(dataframe["Outlook"])

lz = LabelEncoder()
dataframe["Temperature"] = lz.fit_transform(dataframe["Temperature"])

la = LabelEncoder()
dataframe["Humidity"] = la.fit_transform(dataframe["Humidity"])

lc = LabelEncoder()
dataframe["Wind"] = lc.fit_transform(dataframe["Wind"])

lh = LabelEncoder()
dataframe["Played football(yes/no)"] = lh.fit_transform(dataframe["Played football(yes/no)"])

x = dataframe[["Outlook", "Temperature", "Humidity", "Wind"]]
y = dataframe["Played football(yes/no)"]

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)

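# Use a separate name for the model so it does not shadow the data dict `d`
clf = DecisionTreeClassifier()
clf.fit(x_train, y_train)
y_pred = clf.predict(x_test)

cf = confusion_matrix(y_test, y_pred)

# Pass the feature names as a plain list
z = plot_tree(clf, filled=True, feature_names=list(x.columns))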
plt.show()
`

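In the plotted tree, samples whose Outlook is not less than 0.5 go to the right branch, and the other features are split the same way. But since the data is not numeric, how do I work out whether Outlook is less than or greater than 0.5?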

Answer 1

After the transformation (the label encoder), all features are numeric, so each decision uses the numeric value of the feature (including Outlook).
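To see which category sits on which side of 0.5, it helps to print the mapping the encoder produced. The following is a minimal sketch, not part of the original answer; it relies on `LabelEncoder` storing the labels in sorted order in `classes_`, so the position of a label in that array is its integer code. The variable names `encoder`, `mapping`, and `left_labels` are illustrative.

```python
from sklearn.preprocessing import LabelEncoder

# Same Outlook column as in the question
outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
           "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"]

encoder = LabelEncoder()
codes = encoder.fit_transform(outlook)

# classes_ holds the labels in sorted order; the index of a label is its code
mapping = {str(label): code for code, label in enumerate(encoder.classes_)}
print(mapping)      # {'Overcast': 0, 'Rain': 1, 'Sunny': 2}

# A split "Outlook <= 0.5" therefore sends only code 0 down the left branch
left_labels = [label for label, code in mapping.items() if code <= 0.5]
print(left_labels)  # ['Overcast']
```

So for this data, "Outlook <= 0.5" reads as "Outlook is Overcast", and everything else (Rain, Sunny) goes to the right branch.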

To sum up, when you use a decision tree from sklearn, all features and splits are based on numeric values.
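The numeric thresholds can also be read straight from the fitted tree and translated back into category names. The sketch below is an illustration under a few assumptions that are not in the original post: it fits on all 14 rows instead of a train/test split, fixes `random_state=0` for repeatability, and uses the hypothetical names `encoders` and `splits`. It uses `export_text` and the `tree_` attribute of the fitted classifier.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier, export_text

data = pd.DataFrame({
    "Outlook": ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
                "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"],
    "Temperature": ["Hot", "Hot", "Hot", "Mild", "Cool", "Cool", "Cool",
                    "Mild", "Cool", "Mild", "Mild", "Mild", "Hot", "Mild"],
    "Humidity": ["High", "High", "High", "High", "Normal", "Normal", "Normal",
                 "High", "Normal", "Normal", "Normal", "High", "Normal", "High"],
    "Wind": ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong",
             "Weak", "Weak", "Weak", "Strong", "Strong", "Weak", "Strong"],
    "Played football(yes/no)": ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
                                "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"],
})
features = ["Outlook", "Temperature", "Humidity", "Wind"]

# One encoder per column, kept so the codes can be mapped back to labels
encoders = {name: LabelEncoder().fit(data[name]) for name in features}
X = pd.DataFrame({name: encoders[name].transform(data[name]) for name in features})
y = LabelEncoder().fit_transform(data["Played football(yes/no)"])

clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# Text version of the plotted tree, showing the numeric thresholds
rules = export_text(clf, feature_names=features)
print(rules)

# Translate every internal node's threshold back into category names
tree = clf.tree_
splits = []
for node in range(tree.node_count):
    if tree.children_left[node] == -1:  # -1 marks a leaf, which has no split
        continue
    name = features[tree.feature[node]]
    threshold = float(tree.threshold[node])
    classes = encoders[name].classes_
    left = [str(c) for code, c in enumerate(classes) if code <= threshold]
    right = [str(c) for code, c in enumerate(classes) if code > threshold]
    splits.append((name, threshold, left, right))
    print(f"{name} <= {threshold}: left={left}, right={right}")
```

Because the integer codes follow alphabetical order rather than any meaningful ranking, a threshold only groups categories that happen to be adjacent in that order.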

Answer 1
0 2021-01-19 07:18:10
