Matplotlib: how to create a stacked bar plot from a pandas data frame?

Starting from the following:

import pandas as pd

df = pd.DataFrame({'Item': ['A','A','A','B','B','C','C','C','C'],
    'Name': ['Tom','John','Paul','Tom','Frank','Tom', 'John', 'Richard', 'James'],
    'Total': [3,3,3,2,2,4,4,4,4]})
print(df)
  Item     Name  Total
0    A      Tom      3
1    A     John      3
2    A     Paul      3
3    B      Tom      2
4    B    Frank      2
5    C      Tom      4
6    C     John      4
7    C  Richard      4
8    C    James      4

# many-to-many self-merge on column Item
df1 = pd.merge(df, df, on=['Item'])

# drop self-pairs, i.e. rows where Name_x == Name_y
df1 = df1[~(df1['Name_x'] == df1['Name_y'])]
#print(df1)

# collect the co-occurring names into one list per person
df1 = df1.groupby('Name_x')['Name_y'].apply(lambda x: x.tolist()).reset_index()
print(df1)
    Name_x                                     Name_y
0    Frank                                      [Tom]
1    James                       [Tom, John, Richard]
2     John           [Tom, Paul, Tom, Richard, James]
3     Paul                                [Tom, John]
4  Richard                         [Tom, John, James]
5      Tom  [John, Paul, Frank, John, Richard, James]
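
For reference, the Name / People / times frame shown next can be derived from these lists by counting how often each name occurs. This is only a sketch of one way to do it: the use of `collections.Counter` and the variable names `counted` and `result` are assumptions, not part of the original post.

```python
from collections import Counter

import pandas as pd

# The Name_x / Name_y lists exactly as printed above.
df1 = pd.DataFrame({
    'Name_x': ['Frank', 'James', 'John', 'Paul', 'Richard', 'Tom'],
    'Name_y': [
        ['Tom'],
        ['Tom', 'John', 'Richard'],
        ['Tom', 'Paul', 'Tom', 'Richard', 'James'],
        ['Tom', 'John'],
        ['Tom', 'John', 'James'],
        ['John', 'Paul', 'Frank', 'John', 'Richard', 'James'],
    ],
})

# For every row, count the occurrences of each name and sort by name.
counted = [sorted(Counter(names).items()) for names in df1['Name_y']]

# Split the (name, count) pairs into two parallel list columns.
result = pd.DataFrame({
    'Name': df1['Name_x'],
    'People': [[name for name, _ in pairs] for pairs in counted],
    'times': [[n for _, n in pairs] for pairs in counted],
})
print(result)
```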

I have a dataframe like the following:

print(df)
      Name                               People            times
0    Frank                                [Tom]              [1]
1    James                 [John, Richard, Tom]        [1, 1, 1]
2     John          [James, Paul, Richard, Tom]     [1, 1, 1, 2]
3     Paul                          [John, Tom]           [1, 1]
4  Richard                   [James, John, Tom]        [1, 1, 1]
5      Tom  [Frank, James, John, Paul, Richard]  [1, 1, 2, 1, 1]

I want to create a stacked bar plot with one bar per Name, where each entry in People is a segment of the bar and the matching entry in times is its height.

I want to do something like this:

sub_df = df.groupby(['Name','People'])['times'].sum().unstack()
sub_df.plot(kind='bar',stacked=True)

but it raises:

TypeError: unhashable type: 'numpy.ndarray'
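
The error comes from using list-like cells as group keys: groupby needs hashable keys, and mutable containers such as lists and NumPy arrays are not hashable, whereas tuples of strings are. A minimal sketch of the difference in plain Python (the variable names are illustrative only):

```python
people = ['John', 'Tom']

# A list is mutable, so Python refuses to hash it.
try:
    hash(people)
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

# A tuple holding the same strings is immutable and hashes fine.
try:
    hash(tuple(people))
    tuple_is_hashable = True
except TypeError:
    tuple_is_hashable = False

print(list_is_hashable, tuple_is_hashable)
```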

Convert each list to a hashable tuple first, then use apply (the more flexible counterpart of agg) after groupby:

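# tuples are hashable, so they can be used as group keys
df1['People'] = df1['Name_y'].apply(tuple)
# how often each distinct name occurs in the list
df1['Times'] = df1['Name_y'].apply(lambda x: [x.count(name) for name in set(x)])
# each (Name_x, People) group has one row; sum its counts
s = df1.groupby(['Name_x','People']).apply(lambda x: sum(x.iloc[0]['Times']))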

Then you get the following:

Name_x   People                                   
Frank    (Tom,)                                       1
James    (Tom, John, Richard)                         3
John     (Tom, Paul, Tom, Richard, James)             5
Paul     (Tom, John)                                  2
Richard  (Tom, John, James)                           3
Tom      (John, Paul, Frank, John, Richard, James)    6
dtype: int64

And you can plot it as you like:

s.plot(kind='bar', stacked=True)
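
Note that `s` is a Series holding one total per Name, so the plot above shows a single undivided bar for each person. If the goal is one bar per Name with a separate segment per person in People, one alternative is to skip the list columns and cross-tabulate the merged pairs directly. The following is a sketch under that assumption, not the answer's method; `pairs`, `table` and `plot_stacked` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({
    'Item': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'Name': ['Tom', 'John', 'Paul', 'Tom', 'Frank',
             'Tom', 'John', 'Richard', 'James'],
})

# Pair every name with every other name that shares the same Item.
pairs = pd.merge(df, df, on='Item')
pairs = pairs[pairs['Name_x'] != pairs['Name_y']]

# Rows: Name, columns: People, values: how often the two co-occur.
table = pd.crosstab(pairs['Name_x'], pairs['Name_y'])
print(table)


def plot_stacked(table):
    """Draw one bar per row of `table`, stacked by column (needs matplotlib)."""
    ax = table.plot(kind='bar', stacked=True)
    ax.set_xlabel('Name')
    ax.set_ylabel('times')
    return ax
```

Calling `plot_stacked(table)` followed by `matplotlib.pyplot.show()` should then display the chart; each bar's total height equals the corresponding value in `s`.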
