Seaborn normalized bar plot

Question

I have a dataframe with two columns containing True and False values and one column containing gender: Male or Female.

I'm trying to count the number of True values in each column for each gender, normalized by the number of rows for that gender.

What I have done so far is normalize my data against the whole dataframe df_up. But how do I normalize each group separately against the number of rows for each gender?

percentage = lambda x: sum(x) / len(df_up)
ax6 = sns.barplot(x="value", y="variable", hue="Gender", data=melted_fan, estimator=percentage, ci=None, palette=palette)

Answer 1

I am guessing this is what you did:

import seaborn as sns
import numpy as np
import pandas as pd
df = pd.DataFrame({'Gender':np.random.choice(["Female","Male"],100),
                  'star_wars_fan':np.random.choice([True,False],100),
                   'star_trek_fan':np.random.choice([True,False],100)
                  })

melted_fan = df.groupby('Gender').agg(sum).reset_index().melt(id_vars="Gender")
melted_fan

    Gender  variable    value
0   Female  star_wars_fan   29.0
1   Male    star_wars_fan   16.0
2   Female  star_trek_fan   26.0
3   Male    star_trek_fan   29.0

sns.barplot(x="value", y="variable", hue="Gender", 
                  data=melted_fan, ci=None)

Unfortunately, sns.barplot splits the data into subgroups and applies the estimator to each subgroup separately, so it is hard to do this normalization inside the estimator. An easier way is to calculate the percentage before plotting:

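# Percentage of each variable's total contributed by each gender.
# transform keeps the original row index, so the result aligns with melted_fan.
melted_fan['perc'] = melted_fan.groupby('variable')['value'].transform(lambda x: 100 * x / x.sum())
# Plot the computed percentage rather than the raw count.
sns.barplot(x="perc", y="variable", hue="Gender",
            data=melted_fan, ci=None)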
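The question asks for the count of True values divided by the number of rows for each gender. A minimal sketch of one way to get that, assuming synthetic data shaped like the example above (the seed, the `frac` and `melted_frac` names, and the `fraction` column name are illustrative, not from the original post):

```python
import numpy as np
import pandas as pd

# Synthetic data in the same shape as the example above (assumed, not the asker's real data).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Gender": rng.choice(["Female", "Male"], 100),
    "star_wars_fan": rng.choice([True, False], 100),
    "star_trek_fan": rng.choice([True, False], 100),
})

# The mean of a boolean column is the fraction of True values,
# so a per-gender mean is already normalized by that gender's row count.
frac = df.groupby("Gender")[["star_wars_fan", "star_trek_fan"]].mean()

# Long format: one row per (Gender, variable) pair, ready for a grouped bar plot.
melted_frac = frac.reset_index().melt(id_vars="Gender", value_name="fraction")
print(melted_frac)
```

Passing `melted_frac` to `sns.barplot(x="fraction", y="variable", hue="Gender", data=melted_frac)` should then draw one bar per gender for each column, with each bar scaled to that gender's own size.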

Answer 2

This kind of bar plot can be constructed via pandas plotting:

import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
import pandas as pd
import numpy as np

N = 1000
# Use the builtin bool; the np.bool alias is not available in every NumPy release.
df = pd.DataFrame({'Star Wars': np.random.randint(0, 2, N, dtype=bool),
                   'Star Trek': np.random.randint(0, 2, N, dtype=bool),
                   'Gender': np.random.choice(['Male', 'Female'], N, p=[0.6, 0.4])
                   })
ax = df.groupby(['Gender'])[['Star Wars', 'Star Trek']].agg('mean').transpose().plot(kind='barh')
ax.xaxis.set_major_formatter(PercentFormatter(1))
plt.show()

Answer 3

The easiest way I found is the pd.crosstab function, which can calculate the fractions. Then you can easily make a stacked bar plot.

Something like this:

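# `data` is assumed to be a DataFrame with 'release_year' and 'type' columns.
# normalize="index" converts each row's counts into fractions of that row's total.
cross_tab_prop = pd.crosstab(index=data['release_year'],
                             columns=data['type'],
                             normalize="index")

# Stacked bars: each bar sums to 1 because the rows are normalized.
cross_tab_prop.plot(kind='bar',
                    stacked=True,
                    colormap='tab10',
                    figsize=(10, 6))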
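To see what the row normalization does on data closer to the question, here is a small sketch. The `toy` frame, its values, and the "yes"/"no" labels are hypothetical and chosen only so the fractions are easy to check by hand:

```python
import pandas as pd

# Hypothetical toy data; the values are illustrative only.
toy = pd.DataFrame({
    "Gender": ["Female", "Female", "Female", "Male", "Male"],
    "star_wars_fan": ["yes", "yes", "no", "yes", "no"],
})

# normalize="index" divides each row by its row total,
# so the fractions for every gender sum to 1.
prop = pd.crosstab(index=toy["Gender"],
                   columns=toy["star_wars_fan"],
                   normalize="index")
print(prop)
```

Calling `prop.plot(kind="bar", stacked=True)` on the result gives a 100% stacked bar per gender.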

It's very well explained here:

https://towardsdatascience.com/100-stacked-charts-in-python-6ca3e1962d2b

3 answers

Answer 1
2 2020-05-21 21:10:01

Answer 2
1 accepted 2020-05-21 21:12:47

Answer 3
0 2022-08-11 09:57:00
