How to plot a density plot by label (categorical variable) for each numeric column?
I have been using the mines and rocks data ( http://archive.ics.uci.edu/ml/datasets/connectionist+bench+(sonar,+mines+vs.+rocks) ) to do EDA. The code below plots a density plot for each numeric column.
Is there a way to plot the same chart for each numeric variable in the data set, but with two lines in each density plot, one for each value of the label in the last column (M or R)? That way we can see which variables have different distributions for M vs. R.
import pandas as pd
import matplotlib.pyplot as plt

# import file
file = 'https://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/sonar/sonar.all-data'
mr_df = pd.read_table(file, sep=',', header=None)
# one density subplot per numeric column (the 60 features fit in an 8x8 grid)
mr_df.plot(kind='density', subplots=True, layout=(8,8), sharex=False, legend=False, fontsize=1, figsize=(12,12))
plt.savefig('density plot.png')
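
One possible approach, as a minimal sketch rather than a definitive answer: group the frame by the label column and draw each group's density onto the same axis, one axis per numeric column. The helper name `plot_density_by_label` and its parameters (`label_col`, `ncols`, `figsize`) are made up for illustration. It assumes the label column is non-numeric (as the M/R column is) and, like `kind='density'` in the code above, it relies on SciPy being installed.

```python
import math

import matplotlib.pyplot as plt
import pandas as pd


def plot_density_by_label(df, label_col, ncols=8, figsize=(16, 16)):
    """Draw one density subplot per numeric column, with one curve per label value.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Every numeric column except the label gets its own subplot.
    numeric_cols = [c for c in df.select_dtypes("number").columns if c != label_col]
    nrows = math.ceil(len(numeric_cols) / ncols)
    fig, axes = plt.subplots(nrows, ncols, figsize=figsize, squeeze=False)
    flat_axes = axes.ravel()

    for ax, col in zip(flat_axes, numeric_cols):
        # Iterating a grouped column yields (label, Series) pairs;
        # each Series becomes one KDE curve on the shared axis.
        for label, values in df.groupby(label_col)[col]:
            values.plot(kind="density", ax=ax, label=str(label))
        ax.set_title(str(col), fontsize=8)
        ax.set_ylabel("")
        ax.tick_params(labelsize=6)

    # Hide grid cells that have no column to show.
    for ax in flat_axes[len(numeric_cols):]:
        ax.set_visible(False)

    # One shared legend, taken from the first subplot, instead of one per axis.
    handles, labels = flat_axes[0].get_legend_handles_labels()
    fig.legend(handles, labels, loc="upper right")
    fig.tight_layout()
    return fig
```

With the data frame from the question, where `header=None` makes the label column `60`, a call such as `plot_density_by_label(mr_df, label_col=60)` followed by `plt.savefig('density by label.png')` should give an 8x8 grid with an M curve and an R curve in each panel. Columns where the two curves separate clearly are the ones whose distribution differs by label.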