Count values across columns, then group by year in pandas
1 2 3 4 year
a h f h 2000
r r f h 2000
h y g h 2001
h i g e 2004
g f g b 2006
g d g v 2006
Is there a way in pandas to sum the frequency of each value by year?
I tried stack() and groupby(), but that didn't work. I'm not sure what to try next. I don't think it's suited to a crosstab.
Use DataFrame.melt to unpivot the columns, then GroupBy.size to count each (year, value) pair:
# keep the original df intact so it can still be used afterwards
out = df.melt(id_vars='year').groupby(['year','value']).size().reset_index(name='count')
print(out)
year value count
0 2000 a 1
1 2000 f 2
2 2000 h 3
3 2000 r 2
4 2001 g 1
5 2001 h 2
6 2001 y 1
7 2004 e 1
8 2004 g 1
9 2004 h 1
10 2004 i 1
11 2006 b 1
12 2006 d 1
13 2006 f 1
14 2006 g 4
15 2006 v 1
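To make the melt step easier to follow, here is a minimal self-contained sketch that rebuilds the sample table from the question and shows the intermediate long form before counting. The names long_df and counts are only illustrative, and the column labels '1' to '4' are assumed to be strings.

```python
import pandas as pd

# Sample data from the question; column labels are assumed to be strings.
df = pd.DataFrame({
    '1': list('arhhgg'),
    '2': list('hryifd'),
    '3': list('ffgggg'),
    '4': list('hhhebv'),
    'year': [2000, 2000, 2001, 2004, 2006, 2006],
})

# melt keeps 'year' as an identifier and turns every other cell into
# its own row, with the cell content in a column named 'value'.
long_df = df.melt(id_vars='year')
print(long_df.head())

# Counting rows per (year, value) pair gives the frequency of each value by year.
counts = (long_df.groupby(['year', 'value'])
                 .size()
                 .reset_index(name='count'))
print(counts)
```

The long form has one row per original cell (6 rows times 4 value columns), so size() only has to count rows in each group.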
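Your stack() and groupby() attempt works with a few changes: move year into the index first, group by that index level, and count with value_counts():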
df1 = (df.set_index('year')
         .stack()
         .groupby(level=0)
         .value_counts()
         .rename_axis(['year','value'])
         .reset_index(name='count'))
print(df1)
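As a rough check that the two approaches agree, the sketch below runs both on the sample data from the question and compares the resulting (year, value, count) triples. The helper as_pairs and the names via_melt and via_stack are hypothetical, introduced only for this comparison. Rows are sorted before comparing because value_counts orders each group by frequency, so the row order can differ from the melt result.

```python
import pandas as pd

# Sample data from the question; column labels are assumed to be strings.
df = pd.DataFrame({
    '1': list('arhhgg'),
    '2': list('hryifd'),
    '3': list('ffgggg'),
    '4': list('hhhebv'),
    'year': [2000, 2000, 2001, 2004, 2006, 2006],
})

# Approach 1: unpivot with melt, then count rows per (year, value).
via_melt = (df.melt(id_vars='year')
              .groupby(['year', 'value'])
              .size()
              .reset_index(name='count'))

# Approach 2: stack the value columns under a 'year' index,
# then count values within each year.
via_stack = (df.set_index('year')
               .stack()
               .groupby(level=0)
               .value_counts()
               .rename_axis(['year', 'value'])
               .reset_index(name='count'))

def as_pairs(frame):
    """Hypothetical helper: order-independent list of (year, value, count)."""
    return sorted(zip(frame['year'], frame['value'], frame['count']))

print(as_pairs(via_melt) == as_pairs(via_stack))
```

If a fixed row order matters, sorting the result with sort_values(['year', 'value']) is one way to make both outputs line up.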