How to merge several rows into one row based on a column with a specific value in pandas
I have a DataFrame like this:
item_id revenue month year
1 10.0 01 2014
1 5.0 02 2013
1 6.0 04 2013
1 7.0 03 2013
2 2.0 01 2013
2 3.0 03 2013
3 5.0 04 2013
I am trying to get the revenue of each item from January to March 2013, like the following DataFrame:
item_id revenue year
1 12.0 2013
2 5.0 2013
3 0 2013
However, I am not sure how to implement this in pandas. Any help would be appreciated.
You can slice first, then use groupby and reindex to include the items whose total is 0.
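month_start, month_end = 1, 3
year = 2013
# Keep only rows in the requested months and year, sum revenue per item,
# then reindex so items with no matching rows show up with 0.
res = df.loc[df['month'].between(month_start, month_end) & df['year'].eq(year)]\
.groupby('item_id')['revenue'].sum()\
.reindex(df['item_id'].unique()).fillna(0)\
.reset_index(name='revenue').assign(year=year)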
print(res)
item_id revenue year
0 1 12.0 2013
1 2 5.0 2013
2 3 0.0 2013
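For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same slice, groupby, reindex idea. It assumes month and year are stored as integers; the question prints months as 01 and 02, so in the real data they may be strings and would need converting first (for example with astype(int)). The helper name revenue_between is hypothetical, and fill_value is used in place of fillna as a small variation.

```python
import pandas as pd

# Sample data from the question; months are assumed to be integers here.
df = pd.DataFrame({
    "item_id": [1, 1, 1, 1, 2, 2, 3],
    "revenue": [10.0, 5.0, 6.0, 7.0, 2.0, 3.0, 5.0],
    "month": [1, 2, 4, 3, 1, 3, 4],
    "year": [2014, 2013, 2013, 2013, 2013, 2013, 2013],
})


def revenue_between(frame, month_start, month_end, year):
    """Hypothetical helper: total revenue per item for a month range in one year."""
    # Boolean mask for rows inside the requested window.
    mask = frame["month"].between(month_start, month_end) & frame["year"].eq(year)
    totals = (
        frame.loc[mask]
        .groupby("item_id")["revenue"]
        .sum()
        # Bring back items that had no rows in the window, with a total of 0.
        .reindex(frame["item_id"].unique(), fill_value=0.0)
        .rename_axis("item_id")
    )
    return totals.reset_index(name="revenue").assign(year=year)


res = revenue_between(df, 1, 3, 2013)
print(res)
```

Passing fill_value to reindex fills the missing items in one step; chaining fillna(0) afterwards, as in the answer above, has the same effect on this data.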
You can use groupby first and then the sum method. Note that this totals every month within each year, so it does not restrict the result to January through March.
df.groupby(['year', 'item_id']).sum().reset_index().drop('month', axis=1).set_index('item_id')
year revenue
item_id
1 2013 18.0
2 2013 5.0
3 2013 5.0
1 2014 10.0
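If the January-to-March restriction matters, one possible variant of this groupby approach filters on month before grouping. This is only a sketch under the assumption that month is an integer column; the variable names in_range and q1 are made up for illustration. Selecting the revenue column before calling sum also avoids having to drop the month column afterwards.

```python
import pandas as pd

# Sample data from the question; months are assumed to be integers here.
df = pd.DataFrame({
    "item_id": [1, 1, 1, 1, 2, 2, 3],
    "revenue": [10.0, 5.0, 6.0, 7.0, 2.0, 3.0, 5.0],
    "month": [1, 2, 4, 3, 1, 3, 4],
    "year": [2014, 2013, 2013, 2013, 2013, 2013, 2013],
})

# Keep only rows whose month falls in January through March.
in_range = df["month"].between(1, 3)

# Sum revenue per (year, item_id); as_index=False keeps the keys as columns.
q1 = (
    df.loc[in_range]
    .groupby(["year", "item_id"], as_index=False)["revenue"]
    .sum()
)
print(q1)
```

Items with no rows in the window (item 3 here) simply do not appear in this result; getting a 0 row for them would still need a reindex step like the one in the first answer.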