How can I use pandas to count values for each date in a dataframe?
I'm very new to the pandas library in Python, and I've been trying to piece together how to take a dataframe like this:
'Date' 'Color'
0 '05-10-2017' 'Red'
1 '05-10-2017' 'Green'
2 '05-10-2017' 'Blue'
3 '05-10-2017' 'Red'
4 '05-10-2017' 'Blue'
5 '05-11-2017' 'Red'
6 '05-11-2017' 'Green'
7 '05-11-2017' 'Red'
8 '05-11-2017' 'Green'
9 '05-11-2017' 'Blue'
10 '05-11-2017' 'Blue'
11 '05-11-2017' 'Red'
12 '05-11-2017' 'Blue'
13 '05-11-2017' 'Blue'
14 '05-12-2017' 'Green'
15 '05-12-2017' 'Blue'
16 '05-12-2017' 'Red'
17 '05-12-2017' 'Blue'
18 '05-12-2017' 'Blue'
and output one that has the unique dates as the index, the colors as column headers, and the value count for each day, like this:
'Date' 'Red' 'Green' 'Blue'
'05-10-2017' 2 1 2
'05-11-2017' 3 2 4
'05-12-2017' 1 1 3
I have been struggling for the past two days, searching through this site and trying to piece together a way to achieve this, and so far I have only been able to generate the index of unique dates. I'm having some trouble with value_counts. I would appreciate it if anybody could show me a method, or point me in the right direction if this has been answered already. I have exhausted my searching capabilities and have finally resolved to ask my first question ever on here. If I'm an idiot, please be gentle.
You can use:
1. groupby + size for aggregating and unstack for reshaping:
df1 = df.groupby(["'Date'","'Color'"]).size().unstack(fill_value=0)
print (df1)
'Color' 'Blue' 'Green' 'Red'
'Date'
'05-10-2017' 2 1 2
'05-11-2017' 4 2 3
'05-12-2017' 3 1 1
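To try this end to end, the sample frame can be rebuilt by hand. The following is only a sketch: it assumes plain column names `Date` and `Color` (without the literal quote characters that appear in the printout above), and the final `reindex` step, which puts the columns in the Red, Green, Blue order the question asks for, is an addition rather than part of the original answer.

```python
import pandas as pd

# Rebuild the 19 sample rows from the question (column names assumed unquoted).
dates = ["05-10-2017"] * 5 + ["05-11-2017"] * 9 + ["05-12-2017"] * 5
colors = [
    "Red", "Green", "Blue", "Red", "Blue",
    "Red", "Green", "Red", "Green", "Blue", "Blue", "Red", "Blue", "Blue",
    "Green", "Blue", "Red", "Blue", "Blue",
]
df = pd.DataFrame({"Date": dates, "Color": colors})

# Count rows per (Date, Color) pair, then move Color into the columns.
counts = df.groupby(["Date", "Color"]).size().unstack(fill_value=0)

# Optional: reorder the columns to match the layout requested in the question.
counts = counts.reindex(columns=["Red", "Green", "Blue"])
print(counts)
```

By default `unstack` leaves the columns sorted alphabetically (Blue, Green, Red), which is why the explicit `reindex` is needed to get a different order.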
2. pivot_table solution:
df1 = df.pivot_table(index="'Date'",columns="'Color'", aggfunc='size')
print (df1)
'Color' 'Blue' 'Green' 'Red'
'Date'
'05-10-2017' 2 1 2
'05-11-2017' 4 2 3
'05-12-2017' 3 1 1
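In the sample data every color occurs on every date, so no cell is empty. When a date/color combination is missing, the corresponding cell would typically come out as NaN unless a fill value is given. A minimal sketch, assuming a small made-up frame with unquoted column names `Date` and `Color`, passes `fill_value=0` to cover that case:

```python
import pandas as pd

# Hypothetical data: "Blue" never occurs on 05-11-2017.
df = pd.DataFrame({
    "Date": ["05-10-2017", "05-10-2017", "05-11-2017"],
    "Color": ["Red", "Blue", "Red"],
})

# fill_value=0 replaces missing Date/Color combinations with a zero count.
table = df.pivot_table(index="Date", columns="Color", aggfunc="size", fill_value=0)
print(table)
```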
3. crosstab solution (slower):
df1 = pd.crosstab(df["'Date'"],df["'Color'"])
print (df1)
'Color' 'Blue' 'Green' 'Red'
'Date'
'05-10-2017' 2 1 2
'05-11-2017' 4 2 3
'05-12-2017' 3 1 1
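Since all three approaches print the same table, a quick sanity check is to compare two of them directly. This is a sketch on a small made-up frame, again assuming unquoted column names `Date` and `Color`; it compares labels and raw values rather than relying on the two results having identical dtypes.

```python
import pandas as pd

# Hypothetical data, small enough to verify by eye.
df = pd.DataFrame({
    "Date": ["05-10-2017", "05-10-2017", "05-11-2017", "05-11-2017", "05-11-2017"],
    "Color": ["Red", "Blue", "Red", "Red", "Green"],
})

by_groupby = df.groupby(["Date", "Color"]).size().unstack(fill_value=0)
by_crosstab = pd.crosstab(df["Date"], df["Color"])

# Compare row labels, column labels, and the counts themselves.
same_labels = (
    list(by_groupby.index) == list(by_crosstab.index)
    and list(by_groupby.columns) == list(by_crosstab.columns)
)
same_values = bool((by_groupby.to_numpy() == by_crosstab.to_numpy()).all())
print(same_labels and same_values)
```

crosstab already fills missing combinations with 0, so no fill argument is needed there.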