Conditional aggregation of a Pandas DataFrame with NamedAgg

Question

I have an orders table with an order_state column. I need to count the orders in each order state, grouped by hour, without adding order_state to the group-by keys, and I want to use NamedAgg. Is this possible? Something like this:

orders_agg = orders.groupby(
    by=[pandas.Grouper(key='created_at', freq='H'), 'source']
).agg(
    orders_count=pandas.NamedAgg('created_at', 'count'),
    finished_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'finished').count()),
    cancelled_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'offer_cancelled').count())
).reset_index().rename(columns={'created_at': 'datetime_msk'})

The result should contain a separate count for each state, but right now I get the total order count in every column.

Answer 1

I think you need to change .count() to .sum() to count the True values. count() returns the number of non-null entries, so the False ones are included, whereas sum() treats True as 1 and False as 0:

orders_agg = orders.groupby(
    by=[pandas.Grouper(key='created_at', freq='H'), 'source']
).agg(
    orders_count=pandas.NamedAgg('created_at', 'count'),
    # Summing the boolean mask counts only the rows where the comparison is True.
    finished_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'finished').sum()),
    cancelled_orders_count=pandas.NamedAgg('order_state', lambda x: (x == 'offer_cancelled').sum())
).reset_index().rename(columns={'created_at': 'datetime_msk'})
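
To see the difference between the two methods on concrete data, here is a minimal, self-contained sketch. The sample rows are made up for illustration. The column names follow the question, the frequency is written as "60min" (one-hour bins) so that it runs without a deprecation warning on recent pandas versions, and the total is counted on order_state rather than created_at; both of those choices are assumptions of this sketch, not part of the original code.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
orders = pd.DataFrame({
    "created_at": pd.to_datetime([
        "2020-12-22 10:05", "2020-12-22 10:20", "2020-12-22 10:40",
        "2020-12-22 11:15", "2020-12-22 11:30",
    ]),
    "source": ["web", "web", "web", "app", "app"],
    "order_state": ["finished", "offer_cancelled", "finished", "finished", "new"],
})

# count() vs sum() on a boolean mask:
mask = orders["order_state"] == "finished"
total_rows = int(mask.count())    # non-null entries, True and False alike
finished_rows = int(mask.sum())   # only the True entries

orders_agg = (
    orders
    .groupby([pd.Grouper(key="created_at", freq="60min"), "source"])
    .agg(
        # Total orders per group (assumes order_state is never null).
        orders_count=pd.NamedAgg("order_state", "count"),
        # Conditional counts: sum the boolean mask inside each group.
        finished_orders_count=pd.NamedAgg(
            "order_state", lambda x: (x == "finished").sum()
        ),
        cancelled_orders_count=pd.NamedAgg(
            "order_state", lambda x: (x == "offer_cancelled").sum()
        ),
    )
    .reset_index()
    .rename(columns={"created_at": "datetime_msk"})
)

print(orders_agg)
```

With .count() in place of .sum(), both conditional columns would repeat orders_count, which is the behaviour described in the question.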

Answer 1 was accepted on 2020-12-22 13:02:54.
