
Count values across columns and then group by year in pandas

1   2   3   4   year
a   h   f   h   2000
r   r   f   h   2000
h   y   g   h   2001
h   i   g   e   2004
g   f   g   b   2006
g   d   g   v   2006

Is there a way in pandas to count the frequency of each value across all columns, grouped by year?

I tried stack() and groupby(), but that didn't work. I'm not sure what to try next. I don't think this is suited to a crosstab.

Use DataFrame.melt to unpivot the value columns into long format, then GroupBy.size to count each (year, value) pair:

# melt gives one row per cell (year, variable, value); size counts rows per (year, value)
counts = df.melt(id_vars='year').groupby(['year','value']).size().reset_index(name='count')
print(counts)
    year value  count
0   2000     a      1
1   2000     f      2
2   2000     h      3
3   2000     r      2
4   2001     g      1
5   2001     h      2
6   2001     y      1
7   2004     e      1
8   2004     g      1
9   2004     h      1
10  2004     i      1
11  2006     b      1
12  2006     d      1
13  2006     f      1
14  2006     g      4
15  2006     v      1
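
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (storing the column labels 1 to 4 as strings is my assumption) and runs the melt approach. The names `long_df`, `counts` and `wide` are only illustrative. The last step uses pd.crosstab on the melted frame, which may be an option if a wide year-by-value table is preferred.

```python
import pandas as pd

# Sample data from the question; column labels "1".."4" as strings is an assumption.
df = pd.DataFrame({
    "1": list("arhhgg"),
    "2": list("hryifd"),
    "3": list("ffgggg"),
    "4": list("hhhebv"),
    "year": [2000, 2000, 2001, 2004, 2006, 2006],
})

# Unpivot: every cell of columns 1-4 becomes one row with its year.
long_df = df.melt(id_vars="year")

# Count rows per (year, value) pair and return a flat frame.
counts = (
    long_df.groupby(["year", "value"])
           .size()
           .reset_index(name="count")
)
print(counts)

# Optional wide layout: years as rows, values as columns, 0 where a value is absent.
wide = pd.crosstab(long_df["year"], long_df["value"])
print(wide)
```

The long result is usually easier to filter or merge, while the crosstab is easier to read at a glance.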

Your stack() solution needs a few changes: set year as the index before stacking, then count the values within each year:

df1 = (df.set_index('year')
         .stack()
         .groupby(level=0)
         .value_counts()
         .rename_axis(['year','value'])
         .reset_index(name='count'))
print (df1)
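
The output of the stack() version is not shown above, so here is a hedged sketch for comparing it with the melt version on the question's sample data. As far as I know, value_counts orders rows by descending count within each year by default, so the row order may differ from the melt result. Sorting by year and value makes the two comparable. The names `by_stack`, `by_stack_sorted` and `by_melt` are just illustrative, and the string column labels are an assumption.

```python
import pandas as pd

# Sample data from the question; column labels "1".."4" as strings is an assumption.
df = pd.DataFrame({
    "1": list("arhhgg"),
    "2": list("hryifd"),
    "3": list("ffgggg"),
    "4": list("hhhebv"),
    "year": [2000, 2000, 2001, 2004, 2006, 2006],
})

# stack() approach: year becomes the index, then values are counted per year.
by_stack = (
    df.set_index("year")
      .stack()
      .groupby(level=0)
      .value_counts()
      .rename_axis(["year", "value"])
      .reset_index(name="count")
)

# Sort so the rows line up with the melt/size result, which is ordered by group key.
by_stack_sorted = by_stack.sort_values(["year", "value"]).reset_index(drop=True)

# melt() approach for comparison.
by_melt = (
    df.melt(id_vars="year")
      .groupby(["year", "value"])
      .size()
      .reset_index(name="count")
)

print(by_stack_sorted)
print(by_melt)
```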
