Aggregate Pandas DataFrame with condition using NamedAgg
I have an orders table with a column order_state. I need to count the orders in each order state, grouped by hour, but without grouping by the order_state column. And I want to use NamedAgg. Is it possible? Something like this:
orders_agg = orders.groupby(
    by=[pandas.Grouper(key='created_at', freq='H'), 'source']
).agg(
    orders_count=pandas.NamedAgg('created_at', 'count'),
    finished_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'finished').count()),
    cancelled_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'offer_cancelled').count())
).reset_index().rename(columns={'created_at': 'datetime_msk'})
And the result should be:
But now I get the total order count in each column.
I think you need to change .count() to .sum() to count the True values. .count() counts every non-null element of the boolean Series, True and False alike, which is why each column shows the total number of orders:
import pandas

orders_agg = orders.groupby(
    by=[pandas.Grouper(key='created_at', freq='H'), 'source']
).agg(
    orders_count=pandas.NamedAgg('created_at', 'count'),
    # Summing a boolean Series counts its True values
    finished_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'finished').sum()),
    cancelled_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'offer_cancelled').sum())
).reset_index().rename(columns={'created_at': 'datetime_msk'})
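To see the difference between .count() and .sum() on a boolean mask, here is a small self-contained sketch. The sample rows are made up for illustration (the question does not show its data), and it uses the lowercase 'h' frequency alias, which recent pandas versions prefer over 'H'; on older versions you may need 'H' as in the code above.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question
orders = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2021-01-01 10:05", "2021-01-01 10:20", "2021-01-01 10:40",
        "2021-01-01 11:15", "2021-01-01 11:30",
    ]),
    "source": ["web", "web", "web", "app", "app"],
    "order_state": ["finished", "offer_cancelled", "finished",
                    "finished", "new"],
})

# A boolean mask: count() counts all non-null entries, sum() counts only True
mask = orders["order_state"] == "finished"
mask_count = mask.count()
mask_sum = mask.sum()

orders_agg = orders.groupby(
    by=[pd.Grouper(key="created_at", freq="h"), "source"]
).agg(
    orders_count=pd.NamedAgg("created_at", "count"),
    finished_orders_count=pd.NamedAgg(
        "order_state", lambda x: (x == "finished").sum()
    ),
    cancelled_orders_count=pd.NamedAgg(
        "order_state", lambda x: (x == "offer_cancelled").sum()
    ),
).reset_index().rename(columns={"created_at": "datetime_msk"})

print(orders_agg)
```

With this sample, mask_count is the total number of rows while mask_sum is only the number of finished orders, which is the same effect you were seeing per group.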