Pivot Pandas Dataframe adding columns
I have the following dataframe:
date product ... cost quantity
2018-01-02 orange ... 7.5 2
2018-01-02 apples ... 10 5
2018-01-02 apples ... 12 4
2018-01-04 melon ... 6.5 10
2018-01-04 melon ... 5 4
2018-01-04 melon ... 3.2 3
...
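I want to create the following dataframe, in which each row represents one date/product combination and cost_x and quantity_x are added as additional columns.
In more detail, cost_n and quantity_n hold the cost and quantity of occurrence number n of that date/product combination (n is an integer), up to the last occurrence.
To illustrate: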
date product ... cost_0 quantity_0 cost_1 quantity_1 cost_2 quantity_2 ... cost_n quantity_n
2018-01-02 orange ... 7.5 2 0 0 0 0 0 0
2018-01-02 apples ... 10 5 12 4 0 0 0 0
2018-01-04 melon ... 6.5 10 5 4 3.2 3 0 0
How can I create it?
Just a small modification of user3483203's answer:
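import pandas as pd

# Number each row within its (date, product) group (0, 1, 2, ...),
# then pivot so that every occurrence gets its own cost_<n> column.
x = (df.assign(flag=df.groupby(['date', 'product']).cost.cumcount())
       .pivot_table(index=['date', 'product'], columns='flag', values='cost', aggfunc='first')
       .add_prefix('cost_'))
# Same steps for quantity.
y = (df.assign(flag=df.groupby(['date', 'product']).cost.cumcount())
       .pivot_table(index=['date', 'product'], columns='flag', values='quantity', aggfunc='first')
       .add_prefix('quantity_'))
# Join the two wide frames on the shared (date, product) index.
z = pd.merge(x, y, on=['date', 'product']).reset_index()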
z:
flag date product cost_0 cost_1 cost_2 quantity_0 quantity_1 quantity_2
0 2018-01-02 apples 10.0 12.0 NaN 5.0 4.0 NaN
1 2018-01-02 orange 7.5 NaN NaN 2.0 NaN NaN
2 2018-01-04 melon 6.5 5.0 3.2 10.0 4.0 3.0
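The output above groups all cost columns before all quantity columns and leaves missing occurrences as NaN, whereas the question asks for cost_n next to quantity_n with 0 as the filler. The following is a minimal sketch of one way to get that layout, assuming only the four columns visible in the question (the columns elided by "..." are left out). The names flagged, wide and result are made up for this illustration.

```python
import pandas as pd

# Sample rows from the question; the columns elided by "..." are omitted.
df = pd.DataFrame({
    "date": ["2018-01-02", "2018-01-02", "2018-01-02",
             "2018-01-04", "2018-01-04", "2018-01-04"],
    "product": ["orange", "apples", "apples", "melon", "melon", "melon"],
    "cost": [7.5, 10, 12, 6.5, 5, 3.2],
    "quantity": [2, 5, 4, 10, 4, 3],
})

# Number each row within its (date, product) group: 0, 1, 2, ...
flagged = df.assign(flag=df.groupby(["date", "product"]).cumcount())

# Pivot both value columns in one call; the result has two-level columns
# such as ("cost", 0), ("cost", 1), ..., ("quantity", 0), ...
wide = flagged.pivot_table(index=["date", "product"], columns="flag",
                           values=["cost", "quantity"], aggfunc="first")

# Bring the occurrence number to the outer level and sort, so that
# cost_n sits next to quantity_n, then flatten the column names.
wide = wide.swaplevel(axis=1).sort_index(axis=1)
wide.columns = [f"{name}_{n}" for n, name in wide.columns]

# Missing occurrences become 0, as in the desired output.
result = wide.fillna(0).reset_index()
print(result)
```

Pivoting both value columns in one call avoids the separate merge step. Note that fillna(0) makes an absent occurrence indistinguishable from a real cost or quantity of 0, so keeping NaN may be preferable if that distinction matters.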