Matplotlib: how to create stacked bar plot from pandas data frame?
Starting from the following:
import pandas as pd

df = pd.DataFrame({'Item': ['A','A','A','B','B','C','C','C','C'],
                   'Name': ['Tom','John','Paul','Tom','Frank','Tom','John','Richard','James'],
                   'Total': [3,3,3,2,2,4,4,4,4]})
print(df[['Item', 'Name']])
Item Name
0 A Tom
1 A John
2 A Paul
3 B Tom
4 B Frank
5 C Tom
6 C John
7 C Richard
8 C James
# many-to-many self-merge on column Item
df1 = pd.merge(df, df, on=['Item'])
# drop self-pairs, i.e. rows where Name_x == Name_y
df1 = df1[~(df1['Name_x'] == df1['Name_y'])]
# print(df1)
# collect the partner names of each Name_x into a list
df1 = df1.groupby('Name_x')['Name_y'].apply(lambda x: x.tolist()).reset_index()
print(df1)
Name_x Name_y
0 Frank [Tom]
1 James [Tom, John, Richard]
2 John [Tom, Paul, Tom, Richard, James]
3 Paul [Tom, John]
4 Richard [Tom, John, James]
5 Tom [John, Paul, Frank, John, Richard, James]
I have a data frame like the following:
print(df)
Name People times
0 Frank [Tom] [1]
1 James [John, Richard, Tom] [1, 1, 1]
2 John [James, Paul, Richard, Tom] [1, 1, 1, 2]
3 Paul [John, Tom] [1, 1]
4 Richard [James, John, Tom] [1, 1, 1]
5 Tom [Frank, James, John, Paul, Richard] [1, 1, 2, 1, 1]
I want to create a stacked bar plot for each Name, with People as the bar segments and times as the values.
I would like to do something like this:
sub_df = df.groupby(['Name','People'])['times'].sum().unstack()
sub_df.plot(kind='bar',stacked=True)
But it returns:
TypeError: unhashable type: 'numpy.ndarray'
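The error arises because groupby has to hash its keys, and each People cell holds a list or array, which is not hashable. One way around it, shown here only as a minimal sketch, is to flatten the list columns into one row per (Name, person) pair before grouping. The frame below is retyped from the printed output above with plain Python lists; long_df and rows are names introduced just for this illustration.

```python
import pandas as pd

# Data retyped from the printed frame above (assumption: cells are plain lists).
df = pd.DataFrame({
    'Name': ['Frank', 'James', 'John', 'Paul', 'Richard', 'Tom'],
    'People': [['Tom'],
               ['John', 'Richard', 'Tom'],
               ['James', 'Paul', 'Richard', 'Tom'],
               ['John', 'Tom'],
               ['James', 'John', 'Tom'],
               ['Frank', 'James', 'John', 'Paul', 'Richard']],
    'times': [[1], [1, 1, 1], [1, 1, 1, 2], [1, 1], [1, 1, 1], [1, 1, 2, 1, 1]],
})

# Flatten: one row per (Name, person, count) instead of list-valued cells.
rows = [
    (name, person, count)
    for name, people, times in zip(df['Name'], df['People'], df['times'])
    for person, count in zip(people, times)
]
long_df = pd.DataFrame(rows, columns=['Name', 'People', 'times'])

# Now every group key is a hashable string, so groupby/unstack works.
sub_df = long_df.groupby(['Name', 'People'])['times'].sum().unstack(fill_value=0)
print(sub_df)
```

With matplotlib installed, sub_df.plot(kind='bar', stacked=True) should then draw one bar per Name with one coloured segment per person.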
After groupby, you have to use the flexible apply variant of aggregation:
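# turn each partner list into a tuple so it is hashable and can serve as a groupby key
df1['People'] = df1['Name_y'].apply(tuple)
# count each distinct partner (order follows set iteration, not the order in People)
df1['Times'] = df1['Name_y'].apply(lambda x: [x.count(name) for name in list(set(x))])
# each (Name_x, People) group has a single row; sum its Times to get the total pairings
s = df1.groupby(['Name_x','People']).apply(lambda x: sum(x.iloc[0]['Times']))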
Then you get the following:
Name_x People
Frank (Tom,) 1
James (Tom, John, Richard) 3
John (Tom, Paul, Tom, Richard, James) 5
Paul (Tom, John) 2
Richard (Tom, John, James) 3
Tom (John, Paul, Frank, John, Richard, James) 6
dtype: int64
You can then plot it however you like:
s.plot(kind='bar', stacked=True)
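Since s holds a single total per Name_x, the plot above shows one plain bar per name. If the goal is one segment per person inside each bar, a possible alternative (not part of the answer above, only a sketch) is to skip the list columns and count pairs directly from the merged frame with pd.crosstab. The names pairs and counts are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Item': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C', 'C'],
    'Name': ['Tom', 'John', 'Paul', 'Tom', 'Frank', 'Tom', 'John', 'Richard', 'James'],
})

# Self-merge on Item, then drop the rows that pair a name with itself.
pairs = pd.merge(df, df, on='Item')
pairs = pairs[pairs['Name_x'] != pairs['Name_y']]

# Rows: Name_x, columns: Name_y, values: number of shared items.
counts = pd.crosstab(pairs['Name_x'], pairs['Name_y'])
print(counts)
```

With matplotlib installed, counts.plot(kind='bar', stacked=True) should give a stacked bar per name, where each segment's height is the number of items shared with that person.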