How to merge two rows of a pandas dataframe depending on a condition in Python?
I have a dataframe:
order_creationdate orderid productid quantity prod_name price Amount
0 2021-01-18 22:27:03.341260 1 SnyTV 3.0 Sony LED TV 412.0 1236.0
1 2021-01-18 17:28:03.343089 1 AMDR5 1.0 AMD Ryzen 5 313.0 313.0
2 2021-01-18 13:19:03.343842 1 INTI0 8.0 Intel I10 146.0 1168.0
3 2021-01-18 10:24:03.344399 1 INTI0 5.0 Intel I10 146.0 730.0
4 2021-01-18 12:29:03.344880 1 CMCFN 4.0 coolermaster CPU FAN 675.0 2700.0
The rows at index 2 and 3 have the same product ID and belong to the same order, so I am trying to combine them into a single row, to get:
INTI0 13.0 146.0 1898.0
The final df is:
order_creationdate orderid productid quantity prod_name price Amount
0 2021-01-18 22:27:03.341260 1 SnyTV 3.0 Sony LED TV 412.0 1236.0
1 2021-01-18 17:28:03.343089 1 AMDR5 1.0 AMD Ryzen 5 313.0 313.0
2 2021-01-18 13:19:03.343842 1 INTI0 13.0 Intel I10 146.0 1898.0
3 2021-01-18 12:29:03.344880 1 CMCFN 4.0 coolermaster CPU FAN 675.0 2700.0
I tried using the df.groupby function:
df2['productid'] =df2['productid'].astype('str')
arr = np.sort(df2[['productid','quantity']], axis=1)
df2 = (df2.groupby([arr[:, 0],arr[:, 1]])
          .agg({'price':'sum', 'Amount':'sum'})
          .rename_axis(('X','Y'))
          .reset_index())
print(df2)
But it raises a data type error:
File "/home/anti/Documents/db/create_rec.py", line 65, in <module>
arr = np.sort(df2[['productid','quantity']], axis=1)
File "<__array_function__ internals>", line 5, in sort
File "/home/anti/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 991, in sort
a.sort(axis=axis, kind=kind, order=order)
TypeError: '<' not supported between instances of 'float' and 'str'
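The traceback points at np.sort rather than at groupby. A likely reading is that sorting with axis=1 compares the two values inside each row, that is, a productid string against a quantity float, which Python cannot order. The sketch below reproduces this on a two-row frame shaped like the question's data; the names df, values and error_message are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# Two rows shaped like the question's data: one string column, one float column.
df = pd.DataFrame({"productid": ["INTI0", "SnyTV"], "quantity": [8.0, 3.0]})

# With mixed column types the underlying array has dtype=object,
# holding Python str and float values side by side.
values = df[["productid", "quantity"]].to_numpy()

try:
    # axis=1 sorts within each row, so NumPy has to compare a str with a float.
    np.sort(values, axis=1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(values.dtype)
print(error_message)
```

Since the goal is only to group rows by productid, the row-wise sort is not needed at all; grouping directly on the column avoids the comparison.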
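Try grouping by productid and summing the numeric columns:
df2 = df2.groupby('productid').agg({'quantity':'sum','Amount':'sum'}).reset_index()
That keeps only productid, quantity and Amount. To keep the other columns as well, group the original dataframe (not the result of the line above) by productid and orderid, and give every remaining column an aggregation:
df2.groupby(['productid', 'orderid'], as_index=False).agg(
    {'quantity': 'sum', 'Amount': 'sum', 'order_creationdate': 'min', 'prod_name': 'min', 'price': 'min'}
)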
The output is:
productid orderid quantity Amount order_creationdate prod_name price
0 AMDR5 1 1.0 313.0 2021-01-18 17:28:03.343089 AMD Ryzen 5 313.0
1 CMCFN 1 4.0 2700.0 2021-01-18 12:29:03.344880 coolermaster CPU FAN 675.0
2 INTI0 1 13.0 1898.0 2021-01-18 10:24:03.344399 Intel I10 146.0
3 SnyTV 1 3.0 1236.0 2021-01-18 22:27:03.341260 Sony LED TV 412.0
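The output above is sorted by productid, and 'min' picks the earliest order_creationdate, so it differs from the desired frame in row order, column order and the timestamp on the merged row. A minimal sketch of one way to get closer to the desired result, assuming the first row of each group should supply the non-summed columns: pass sort=False to groupby and aggregate those columns with 'first'. The frame is rebuilt here from the question's data so the example runs on its own; the name merged is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df2 = pd.DataFrame({
    "order_creationdate": pd.to_datetime([
        "2021-01-18 22:27:03.341260",
        "2021-01-18 17:28:03.343089",
        "2021-01-18 13:19:03.343842",
        "2021-01-18 10:24:03.344399",
        "2021-01-18 12:29:03.344880",
    ]),
    "orderid": [1, 1, 1, 1, 1],
    "productid": ["SnyTV", "AMDR5", "INTI0", "INTI0", "CMCFN"],
    "quantity": [3.0, 1.0, 8.0, 5.0, 4.0],
    "prod_name": ["Sony LED TV", "AMD Ryzen 5", "Intel I10", "Intel I10",
                  "coolermaster CPU FAN"],
    "price": [412.0, 313.0, 146.0, 146.0, 675.0],
    "Amount": [1236.0, 313.0, 1168.0, 730.0, 2700.0],
})

merged = (
    # sort=False keeps groups in order of first appearance instead of sorting keys.
    df2.groupby(["orderid", "productid"], as_index=False, sort=False)
       .agg(
           order_creationdate=("order_creationdate", "first"),
           quantity=("quantity", "sum"),
           prod_name=("prod_name", "first"),
           price=("price", "first"),
           Amount=("Amount", "sum"),
       )
)

# Restore the original column order.
merged = merged[list(df2.columns)]
print(merged)
```

With this variant the INTI0 row keeps the 13:19:03.343842 timestamp of its first occurrence, with quantity 13.0 and Amount 1898.0. Whether 'first' or 'min' is the right choice for the timestamp depends on what the merged row is supposed to represent.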