How do I find the significant column in a dataset?

Question

I want to know how to find out which column in a dataset is the significant one. For example, if sepal length, sepal width, petal length, petal width, and species are the columns in a dataset, which of the five is the significant column?

Answer 1

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn import datasets

iris = datasets.load_iris()

# merge the feature data and the target into one dataframe
data = pd.DataFrame(iris.data, columns=iris.feature_names)
data['Target'] = iris.target

# pairwise Pearson correlation between all columns
correlation_values = data.corr()

# plot the correlation matrix as a heatmap
corr_heatmap = sns.heatmap(correlation_values, xticklabels=data.columns, yticklabels=data.columns)
plt.show()

This code outputs a correlation heatmap of the four features and the target.

The heatmap shows that the other features in the iris dataset (sepal length, petal length, and petal width) are highly correlated with each other, so the most significant feature, in the sense of having the most distinctive nature, is sepal width.
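The same reading can be checked numerically instead of by eye. The snippet below is only a rough sketch, not part of the original answer: it assumes "distinctive" means a low mean absolute correlation with the other features, and the names `mean_abs_corr`, `target_corr`, and `summary` are made up for illustration. It also prints each feature's correlation with `Target`, since "significant" could instead be taken to mean "most related to the species", which is a different criterion and may rank the columns differently.

```python
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()
features = list(iris.feature_names)

# Same dataframe as in the answer: four features plus the numeric target
data = pd.DataFrame(iris.data, columns=features)
data["Target"] = iris.target

corr = data.corr()

# Absolute feature-to-feature correlations (target excluded)
feature_corr = corr.loc[features, features].abs()

# Mean absolute correlation with the *other* features:
# drop the diagonal (always 1.0) before averaging
mean_abs_corr = (feature_corr.sum(axis=1) - 1.0) / (len(features) - 1)

# Absolute correlation of each feature with the target column
target_corr = corr.loc[features, "Target"].abs()

summary = pd.DataFrame(
    {
        "mean_abs_corr_with_other_features": mean_abs_corr,
        "abs_corr_with_target": target_corr,
    }
).sort_values("mean_abs_corr_with_other_features")

print(summary.round(3))
```

The first row of `summary` is the feature that is least correlated with the rest, which is the criterion used in the answer above. Note that the target here is an integer class label, so its Pearson correlation is only a crude indicator of how informative a feature is.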

Solution 1
0 · Accepted · 2020-12-10 18:51:30