pandas grouping and visualization
I have to do some analysis using Python 3 and pandas on a dataset; a toy example is shown below:
data
'''
location importance agent count
0 London Low chatbot 2
1 NYC Medium chatbot 1
2 London High human 3
3 London Low human 4
4 NYC High human 1
5 NYC Medium chatbot 2
6 Melbourne Low chatbot 3
7 Melbourne Low human 4
8 Melbourne High human 5
9 NYC High chatbot 5
'''
My aim is to group by location and then count how many rows have Low, Medium or High in the 'importance' column for each location. So far, the code I have come up with is:
data.groupby(['location', 'importance']).aggregate(np.size)
'''
agent count
location importance
London High 1 1
Low 2 2
Melbourne High 1 1
Low 2 2
NYC High 2 2
Medium 2 2
'''
The result of this grouping and count aggregation has the grouping keys as its index:
data.groupby(['location', 'importance']).aggregate(np.size).index
I don't know how to proceed from here. Also, how can I visualize this?
Help?
I think you need DataFrame.pivot_table, with aggfunc='sum' added to aggregate any duplicate location/importance pairs, and then DataFrame.plot:
df = data.pivot_table(index='location', columns='importance', values='count', aggfunc='sum')
df.plot()
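For a version that can be run on its own, here is a minimal sketch that rebuilds the toy frame from the question and applies the same pivot_table call. The fill_value=0 argument, the bar chart, and the helper name plot_totals are my own additions for illustration, not something the question asked for; the matplotlib import is kept inside the helper so the table can be built without it.

```python
import pandas as pd

# Toy data copied from the question.
data = pd.DataFrame({
    "location": ["London", "NYC", "London", "London", "NYC",
                 "NYC", "Melbourne", "Melbourne", "Melbourne", "NYC"],
    "importance": ["Low", "Medium", "High", "Low", "High",
                   "Medium", "Low", "Low", "High", "High"],
    "agent": ["chatbot", "chatbot", "human", "human", "human",
              "chatbot", "chatbot", "human", "human", "chatbot"],
    "count": [2, 1, 3, 4, 1, 2, 3, 4, 5, 5],
})

# One row per location, one column per importance level;
# each cell is the sum of 'count' for that pair.
# fill_value=0 (an assumption) replaces the NaN of missing pairs.
totals = data.pivot_table(index="location", columns="importance",
                          values="count", aggfunc="sum", fill_value=0)
print(totals)


def plot_totals(table):
    """Hypothetical helper: grouped bar chart, one group per location."""
    import matplotlib.pyplot as plt  # imported here so the table code runs without it
    ax = table.plot(kind="bar")
    ax.set_ylabel("sum of count")
    plt.show()
```

Calling plot_totals(totals) should draw one group of bars per location. A bar chart is only a suggestion; the plain df.plot() above draws lines, which can be harder to read when the x-axis is categorical.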
If you need counts of location and importance pairs, use crosstab:
df = pd.crosstab(data['location'], data['importance'])
df.plot()
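As a self-contained sketch of the counting variant: the snippet below rebuilds just the two relevant columns of the toy data, runs crosstab, and shows that the groupby from the question gives the same table once size() is followed by unstack. The column reordering to Low, Medium, High is my own assumption about the preferred display order.

```python
import pandas as pd

# Only the two columns needed for counting, copied from the question's toy data.
data = pd.DataFrame({
    "location": ["London", "NYC", "London", "London", "NYC",
                 "NYC", "Melbourne", "Melbourne", "Melbourne", "NYC"],
    "importance": ["Low", "Medium", "High", "Low", "High",
                   "Medium", "Low", "Low", "High", "High"],
})

# Number of rows for each (location, importance) pair; missing pairs become 0.
counts = pd.crosstab(data["location"], data["importance"])

# Same table starting from the question's groupby: size() counts rows per
# group, unstack() moves the 'importance' index level into columns.
via_groupby = (data.groupby(["location", "importance"])
                   .size()
                   .unstack(fill_value=0))

# Assumed display order; the default column order is alphabetical.
ordered = counts.reindex(columns=["Low", "Medium", "High"])
print(ordered)
```

The ordered table can be passed to plot in the same way as df above, for example ordered.plot(kind="bar").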