pandas DataFrame: visualize the percentage of different values of a categorical column within different bins of a numerical column

I have a pandas DataFrame with two columns, col1 and class. class is binary. I want to plot a histogram and visualize the percentage of each class value within the different bins of col1. Here are my attempts:

1- Two histograms, one for each value of the class column:

df.col1[df['class'] == 0].hist()

[Figure: histogram of col1 for rows with class == 0]

df.col1[df['class'] == 1].hist()

[Figure: histogram of col1 for rows with class == 1]

2- Both values of class together in one chart:

df.groupby('class').col1.hist(alpha=0.9)

[Figure: overlaid histograms of col1 for both class values]

As you can see from the first two graphs, rows with class==1 are rare compared to rows with class==0, so when we put them together (third graph) their effect is invisible (look at the tiny orange areas in the chart). One solution is to use the percentage of each class value within each bin. I tried this:

df.groupby('class').col1.transform(lambda x: x/sum(x)).hist(alpha=0.9)

[Figure: histogram of the transformed col1 values, grouped by class]

That evidently did not work. I'm looking for a way to visualize the percentage of each class value within the different bins.

Since the number of items per class is highly unbalanced, the two histograms cannot share a y-axis in any readable way. If both classes have a similar distribution over the values, you can use seaborn's distplot to normalize the data:

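import seaborn as sns

# One Series of col1 values per distinct class value
uniques = df['class'].unique()
targets = [df.col1[df['class'] == val] for val in uniques]

# distplot draws a density-normalized histogram with a KDE curve;
# it is deprecated since seaborn 0.11 in favor of histplot/displot
for target in targets:
    sns.distplot(target, rug=True)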
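If what you need is literally the share of each class inside each bin of col1, one possible approach (a different technique from distplot above) is to bin col1 with pd.cut and row-normalize a cross-tabulation. The following is only a minimal sketch: the data is synthetic, the helper name class_share_by_bin is hypothetical, and 10 equal-width bins is an arbitrary choice.

```python
import numpy as np
import pandas as pd


def class_share_by_bin(df, value_col="col1", class_col="class", bins=10):
    """Return the percentage of each class (columns) per value bin (rows)."""
    # Assign each row to an equal-width interval of value_col.
    binned = pd.cut(df[value_col], bins=bins)
    # normalize="index" divides each row by its total, so a bin's shares sum to 1.
    return pd.crosstab(binned, df[class_col], normalize="index") * 100


# Synthetic stand-in for the real data (assumption): about 5% of rows are class 1.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "col1": rng.normal(loc=50, scale=10, size=1000),
    "class": rng.choice([0, 1], size=1000, p=[0.95, 0.05]),
})

pct = class_share_by_bin(df)

# Stacked bars: every bar is 100% tall, split by class share within that bin.
ax = pct.plot(kind="bar", stacked=True, width=0.9)
ax.set_xlabel("col1 bin")
ax.set_ylabel("share of rows in bin (%)")
```

Because every bar has the same height, the rare class stays visible even in bins where it has only a handful of rows. The trade-off is that the chart no longer shows how many rows each bin contains. Outside a notebook, call matplotlib.pyplot.show() to display the figure.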
