
100% Stacked and grouped barplots using matplotlib

Sorry if this question is a duplicate; I was not able to find a solution.

I have a data frame:

| sample_ids | perc_A | perc_B | perc_C |
|------------|--------|--------|--------|
| sample 1   | 0.75   | 0.18182| 0.42222|
| sample 2   | 0.66667| 0.24747| 0.15823|
| sample 3   | 0.70213| 0.28176| 0.17925|

With this, I would like to plot a 100% stacked and grouped bar chart (as shown below; a similar image taken from GitHub).

[image]


Detailed explanation based on the provided figure:
Let's say sample 1 is Apples. For bar A, 75% will be in dark purple (legend: True_perc_a) while 25% will be in light purple (legend: False_perc_a). For bar B, 18.18% will be in dark green (legend: True_perc_b) while 81.82% will be in light green (legend: False_perc_b). For bar C, 42.22% will be in dark yellow (legend: True_perc_c) while 57.78% will be in light yellow (legend: False_perc_c). The same conditions apply to sample 2 and sample 3.

I was able to process the data to get the true and false percentages. For example:

df['perc_A'] = (df['perc_A']*100).round(2)
df['perc_F_A'] = (100 - df['perc_A']).round(2)

However, I am having difficulty plotting the figure.

Because each bar always totals 100%, we can simply set every "False" value to 1 (that is, 100% once scaled) and draw it as a full-height bar behind the "True" bar. The part of the full-height bar that stays visible is then the "False" share. To do this, melt the dataframe on the sample_ids column, name the resulting columns perc and percent, and multiply the values by 100 to turn them into percentages. Next, select the "False" rows by keeping the rows whose perc value contains an F, and plot them with Seaborn so that hue can be set to the perc name. Set the palette to whatever colors you want and set alpha to 0.5 so the "False" bars look lighter than the "True" bars. Finally, plot the "True" rows afterwards, which places them in front of the "False" bars, and you have your stacked bar plot:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (20,10)

df = pd.DataFrame({'sample_ids':['sample1', 'sample2', 'sample3'], 
                   'perc_A':[0.75,0.66667,0.70213],
                   'perc_B':[0.18182,0.24747,0.28176],
                   'perc_C':[0.4222,0.15823,0.17925]})

# Every "False" bar is drawn at full height (1 becomes 100 after scaling)
df[['perc_F_A', 'perc_F_B', 'perc_F_C']] = 1

# Long format: one row per (sample, perc column) pair
meltedDF = df.melt(id_vars=['sample_ids'], var_name='perc', value_name='percent')
meltedDF['percent']=meltedDF['percent']*100

# Draw the semi-transparent full-height "False" bars first (background)
sns.barplot(data=meltedDF[meltedDF.perc.str.contains('F')], x='sample_ids', y='percent', hue='perc', palette=['blue','green','red'], alpha=0.5)
# Draw the opaque "True" bars second so they sit in front
sns.barplot(data=meltedDF[~meltedDF.perc.str.contains('F')], x='sample_ids', y='percent', hue='perc', palette=['blue','green','red'])
plt.show()

Graph:

[image]
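Since the question asks about matplotlib specifically, the same chart can also be sketched without Seaborn. The version below is not the approach used in the answer above: instead of overlaying a full-height bar, it stacks the real "False" share on top of the "True" share with the `bottom=` argument of `Axes.bar`. The function name `plot_stacked_grouped`, the colors, the bar width, and the `True_`/`False_` legend labels are illustrative assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "sample_ids": ["sample1", "sample2", "sample3"],
    "perc_A": [0.75, 0.66667, 0.70213],
    "perc_B": [0.18182, 0.24747, 0.28176],
    "perc_C": [0.4222, 0.15823, 0.17925],
})

def plot_stacked_grouped(df, cols=("perc_A", "perc_B", "perc_C"),
                         colors=("purple", "green", "gold"), width=0.25):
    """Hypothetical helper: one 100% stacked bar per column, grouped by sample."""
    fig, ax = plt.subplots(figsize=(10, 5))
    x = np.arange(len(df))  # one group position per sample
    for i, (col, color) in enumerate(zip(cols, colors)):
        true_pct = df[col].to_numpy() * 100
        false_pct = 100 - true_pct
        # Shift each column's bars so the group is centred on its tick.
        pos = x + (i - (len(cols) - 1) / 2) * width
        ax.bar(pos, true_pct, width, color=color, label=f"True_{col}")
        # bottom= stacks the lighter "False" share on top of the "True" share.
        ax.bar(pos, false_pct, width, bottom=true_pct, color=color,
               alpha=0.4, label=f"False_{col}")
    ax.set_xticks(x)
    ax.set_xticklabels(df["sample_ids"])
    ax.set_ylabel("percent")
    ax.set_ylim(0, 100)
    ax.legend(bbox_to_anchor=(1.02, 1), loc="upper left")
    fig.tight_layout()
    return ax
```

Calling `plot_stacked_grouped(df)` followed by `plt.show()` displays the figure. Because each segment is drawn at its real height, the legend gets one "True" and one "False" entry per column, which is closer to the legend described in the question.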

As an aside, if you do want the real "False" values, a better way to generate them is:

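df = pd.DataFrame({'sample_ids':['sample1', 'sample2', 'sample3'], 
                   'perc_A':[0.75,0.66667,0.70213],
                   'perc_B':[0.18182,0.24747,0.28176],
                   'perc_C':[0.4222,0.15823,0.17925]})

# Each "False" share is the complement of the matching "True" share.
# .to_numpy() assigns by position, so the differing column names do not matter.
df[['perc_F_A', 'perc_F_B', 'perc_F_C']] = 1 - df[['perc_A', 'perc_B', 'perc_C']].to_numpy()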

Output:


   sample_ids   perc_A   perc_B     perc_C   perc_F_A   perc_F_B    perc_F_C
0   sample1     0.75000  0.18182    0.42220  0.25000    0.81818  0.57780
1   sample2     0.66667  0.24747    0.15823  0.33333    0.75253  0.84177
2   sample3     0.70213  0.28176    0.17925  0.29787    0.71824  0.82075
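
If the frame has more than three percentage columns, the complement step can be generalized. The following is a minimal sketch, assuming every "True" column starts with `perc_` and that the `perc_F_` prefix is reserved for the complements; the helper name `add_false_columns` is hypothetical.

```python
import pandas as pd

def add_false_columns(df, prefix="perc_", false_prefix="perc_F_"):
    """Return a copy of df with one complement column per `prefix` column."""
    out = df.copy()
    # Skip columns that already hold complements.
    true_cols = [c for c in out.columns
                 if c.startswith(prefix) and not c.startswith(false_prefix)]
    for col in true_cols:
        # perc_A -> perc_F_A, holding 1 - perc_A
        out[false_prefix + col[len(prefix):]] = 1 - out[col]
    return out

df = pd.DataFrame({
    "sample_ids": ["sample1", "sample2", "sample3"],
    "perc_A": [0.75, 0.66667, 0.70213],
    "perc_B": [0.18182, 0.24747, 0.28176],
    "perc_C": [0.4222, 0.15823, 0.17925],
})
result = add_false_columns(df)
```

Here `result` has the same columns as the output above, and each `perc_X` plus its `perc_F_X` sums to 1.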
