
How do I add a second axis to a matplotlib/seaborn bar chart and have the secondary points align with the correct bars?

I wrote a (newbie) Python function (below) to draw a bar chart broken out by a primary and, optionally, a secondary dimension. For example, the image below charts the percentage of people of each gender who have attained a specific level of education.

Question: how do I overlay on each bar the median household size for that subgroup, e.g. place a point signifying the value '3' on the College/Female bar? None of the examples I have seen accurately overlays the point on the correct bar.

I'm extremely new to this, so thank you very much for your help!

import pandas as pd
import seaborn as sns

df = pd.DataFrame({'Student'       : ['Alice', 'Bob', 'Chris',  'Dave',    'Edna',    'Frank'],
                   'Education'     : ['HS',    'HS',  'HS',     'College', 'College', 'HS'   ],
                   'Household Size': [4,        4,     3,        3,         3,         6     ],
                   'Gender'        : ['F',     'M',   'M',      'M',       'F',       'M'    ]})


def MakePercentageFrequencyTable(dataFrame, primaryDimension, secondaryDimension=None, extraAggregatedField=None):
    lod = dataFrame.groupby(secondaryDimension) if secondaryDimension is not None else dataFrame

    # Percentage of each primaryDimension value (within each secondaryDimension group, if given)
    primaryDimensionPercent = lod[primaryDimension].value_counts(normalize=True) \
                         .rename('percentage') \
                         .mul(100) \
                         .reset_index(drop=False)

    if secondaryDimension is not None:
        primaryDimensionPercent = primaryDimensionPercent.sort_values(secondaryDimension)
        g = sns.catplot(x="percentage", y=secondaryDimension, hue=primaryDimension, kind='bar', data=primaryDimensionPercent)
    else:
        # The label column is named 'index' in pandas < 2.0 and after primaryDimension in pandas >= 2.0
        sns.catplot(x="percentage", y=primaryDimensionPercent.columns[0], kind='bar', data=primaryDimensionPercent)

MakePercentageFrequencyTable(dataFrame=df, primaryDimension='Education', secondaryDimension='Gender')

# Question: I want to send in extraAggregatedField='Household Size' when I call the function such that 
# it creates a secondary 'Household Size' axis at the top of the figure
# and aggregates/integrates the 'Household Size' column such that the following points are plotted
# against the secondary axis and positioned over the given bars:
#
# Female/College => 3
# Female/High School => 4
# Male/College => 3
# Male/High School => 4

Picture of what I have been able to achieve so far

You will have to use the axes-level functions sns.barplot() and sns.stripplot() rather than catplot(), which is a figure-level function: it creates a new figure and a FacetGrid, so it cannot draw onto an existing Axes or onto a twin of one.
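To illustrate the mechanism, here is a minimal sketch in plain matplotlib. The bar positions, percentages, and household sizes are made-up values for illustration, not the output of the function in the question. `ax.twiny()` returns a second Axes that shares the y-axis with the first but has its own x-axis at the top, so a point drawn at the same y position as a bar lands on that bar.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Two groups (centred on y = 0 and y = 1) with two dodged bars each.
# All values below are illustrative assumptions.
bar_positions = [-0.2, 0.2, 0.8, 1.2]
percentages = [50, 50, 25, 75]
ax.barh(bar_positions, percentages, height=0.4)
ax.set_xlabel("percentage")

# twiny() adds a second x-axis (drawn at the top) that shares the y-axis.
ax2 = ax.twiny()
household_sizes = [3, 4, 3, 4]
ax2.scatter(household_sizes, bar_positions, color="k", zorder=3)
ax2.set_xlabel("Household Size")
ax2.set_xlim(0, 5)
```

Because the y-axis is shared, changing the y-limits on either Axes moves the bars and the points together.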

Applied to the function from the question, that looks something like this:

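import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

df = pd.DataFrame({'Student'       : ['Alice', 'Bob', 'Chris',  'Dave',    'Edna',    'Frank'],
                   'Education'     : ['HS',    'HS',  'HS',     'College', 'College', 'HS'   ],
                   'Household Size': [4,        4,     3,        3,         3,         6     ],
                   'Gender'        : ['F',     'M',   'M',      'M',       'F',       'M'    ]})


def MakePercentageFrequencyTable(dataFrame, primaryDimension, secondaryDimension=None, extraAggregatedField=None, ax=None):
    ax = plt.gca() if ax is None else ax
    lod = dataFrame.groupby(secondaryDimension) if secondaryDimension is not None else dataFrame

    # Percentage of each primaryDimension value (within each secondaryDimension group, if given)
    primaryDimensionPercent = lod[primaryDimension].value_counts(normalize=True) \
                         .rename('percentage') \
                         .mul(100) \
                         .reset_index(drop=False)

    # Use the same hue order for bars and points so each point is dodged onto its own bar
    hueOrder = sorted(dataFrame[primaryDimension].unique())

    if secondaryDimension is not None:
        order = sorted(dataFrame[secondaryDimension].unique())
        ax = sns.barplot(x="percentage", y=secondaryDimension, hue=primaryDimension, data=primaryDimensionPercent,
                         order=order, hue_order=hueOrder, ax=ax)
    else:
        # The label column is named 'index' in pandas < 2.0 and after primaryDimension in pandas >= 2.0
        ax = sns.barplot(x="percentage", y=primaryDimensionPercent.columns[0], data=primaryDimensionPercent, ax=ax)

    if extraAggregatedField is not None and secondaryDimension is not None:
        # twiny() adds a second x-axis at the top that shares the (categorical) y-axis with the bars
        ax2 = ax.twiny()
        # Aggregate only the requested column; the question asks for the median per subgroup
        extraDimension = dataFrame.groupby([primaryDimension, secondaryDimension])[extraAggregatedField] \
                                  .median().reset_index(drop=False)
        # jitter=False keeps each point centred on its bar instead of randomly offset
        ax2 = sns.stripplot(data=extraDimension, x=extraAggregatedField, y=secondaryDimension, hue=primaryDimension,
                            order=order, hue_order=hueOrder, ax=ax2, dodge=True, jitter=False,
                            edgecolor='k', linewidth=1, size=10)


plt.figure()
MakePercentageFrequencyTable(dataFrame=df, primaryDimension='Education', secondaryDimension='Gender', extraAggregatedField='Household Size')
plt.show()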
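As a sanity check on where the points should fall, the per-subgroup values can be computed directly with pandas. This is a minimal sketch that rebuilds the same `df` as in the question; the variable name `medians` is only illustrative. The values listed in the question (3, 4, 3, 4) correspond to the median. The mean for the Male/HS subgroup would instead be 13/3, roughly 4.33.

```python
import pandas as pd

df = pd.DataFrame({'Student'       : ['Alice', 'Bob', 'Chris', 'Dave', 'Edna', 'Frank'],
                   'Education'     : ['HS', 'HS', 'HS', 'College', 'College', 'HS'],
                   'Household Size': [4, 4, 3, 3, 3, 6],
                   'Gender'        : ['F', 'M', 'M', 'M', 'F', 'M']})

# Median household size for each Gender/Education subgroup,
# indexed by (Gender, Education).
medians = df.groupby(['Gender', 'Education'])['Household Size'].median()
print(medians)
```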


Note: technical posts on this site are licensed under CC BY-SA 4.0. If you reproduce this content, please cite this site's URL or the original address.
