Filtering for values in Pivot table columns
How do I aggregate (sum) a column over a given time period using a pivot table? For example, in the table below, what code would I write to get the total count of fruits for 2000-2001 and for 2002-2004?
This is what I have so far:
import pandas as pd
import numpy as np
UG = pd.read_csv('fruitslist.csv', index_col=2)
UG = UG.pivot_table(values = 'Count', index = 'Fruits', columns = 'Year', aggfunc=np.sum)
UG.to_csv('fruits.csv')
This returns the count for each fruit in each individual year, but I can't work out how to aggregate by decade (e.g. 1990s, 2000s, 2010s). Sample data:
Fruits Count Year
Apple 4 1995
Orange 5 1996
Orange 6 2001
Guava 8 2003
Banana 6 2010
Guava 8 2011
Peach 7 2012
Guava 9 2013
Thanks in advance!
This might help. Convert the Year column to decades inside a groupby, then aggregate. Integer division by 10 followed by multiplication by 10 floors each year to the start of its decade, so 1995 becomes 1990.
"""
Fruits Count Year
Apple 4 1995
Orange 5 1996
Orange 6 2001
Guava 8 2003
Banana 6 2010
Guava 8 2011
Peach 7 2012
Guava 9 2013
"""
import pandas as pd

# Copy the table above to the clipboard before running this line
df = pd.read_clipboard()
output = df.groupby([
df.Year//10*10,
'Fruits'
]).agg({
'Count' : 'sum'
})
print(output)
             Count
Year Fruits
1990 Apple       4
     Orange      5
2000 Guava       8
     Orange      6
2010 Banana      6
     Guava      17
     Peach       7
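Since the question asks for a pivot table specifically, the same decade key can probably be passed to pivot_table as well. The following is a minimal sketch, not the original poster's code: the Decade column name and the decade_table variable are illustrative, and the data is typed in by hand rather than read from fruitslist.csv.

```python
import pandas as pd

# Sample data from the question, built inline so the example is self-contained
df = pd.DataFrame({
    "Fruits": ["Apple", "Orange", "Orange", "Guava", "Banana", "Guava", "Peach", "Guava"],
    "Count": [4, 5, 6, 8, 6, 8, 7, 9],
    "Year": [1995, 1996, 2001, 2003, 2010, 2011, 2012, 2013],
})

# Illustrative helper column: floor each year to the start of its decade
df["Decade"] = df["Year"] // 10 * 10

# One row per fruit, one column per decade; missing combinations become 0
decade_table = df.pivot_table(
    values="Count",
    index="Fruits",
    columns="Decade",
    aggfunc="sum",
    fill_value=0,
)
print(decade_table)
```

Passing aggfunc as the string "sum" rather than np.sum avoids the deprecation warning that recent pandas versions emit for NumPy callables.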
If you want to group the years by a different interval, say every 2 years, just change the Year key:
print(df.groupby([
df.Year//2*2,
'Fruits'
]).agg({
'Count' : 'sum'
}))
             Count
Year Fruits
1994 Apple       4
1996 Orange      5
2000 Orange      6
2002 Guava       8
2010 Banana      6
     Guava       8
2012 Guava       9
     Peach       7
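The floor-division trick only covers equal-width intervals. For the uneven ranges mentioned in the question (2000-2001 and 2002-2004), one option is to bin the years with pd.cut before grouping. This is a sketch under assumptions: the bin edges, the labels, and the Period column name are chosen here for illustration and do not come from the original answer.

```python
import pandas as pd

# Sample data from the question, built inline so the example is self-contained
df = pd.DataFrame({
    "Fruits": ["Apple", "Orange", "Orange", "Guava", "Banana", "Guava", "Peach", "Guava"],
    "Count": [4, 5, 6, 8, 6, 8, 7, 9],
    "Year": [1995, 1996, 2001, 2003, 2010, 2011, 2012, 2013],
})

# pd.cut bins are right-inclusive by default:
# (1999, 2001] covers 2000-2001 and (2001, 2004] covers 2002-2004
bins = [1999, 2001, 2004]
labels = ["2000-2001", "2002-2004"]
df["Period"] = pd.cut(df["Year"], bins=bins, labels=labels)

# Years outside every bin get NaN and are dropped before summing
period_totals = (
    df.dropna(subset=["Period"])
      .groupby("Period", observed=True)["Count"]
      .sum()
)
print(period_totals)
```

To break the totals down per fruit, group by both "Period" and "Fruits" instead of "Period" alone.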