Complicated function with groupby and between? Python

Question

Here is a sample dataset.

import pandas as pd
import numpy as np
df = pd.DataFrame({ 
    'VipNo':np.repeat( range(3), 2 ),
    'Quantity': np.random.randint(200,size=6),
    'OrderDate': np.random.choice( pd.date_range('1/1/2020', periods=365, freq='D'), 6, replace=False)})
print(df)

So I have a couple of steps to do. I want to create a new column named qtywithin1mon/totalqty. First, I want to group by VipNo (each number represents an individual), because a person may have made multiple purchases. Then I want to check whether the OrderDate is within a certain range (let's say 2020/03/01 - 2020/03/31). If so, I want to divide the quantity ordered on that day by the total quantity this customer purchased. My dataset is big, so a customer may have ordered twice within the time range; in that case I want the sum of the two orders divided by the total quantity. How can I achieve this? I really have no idea where to start.

Thank you so much!

Answer 1

You can create a new column that keeps Quantity only for orders inside the given date range (and 0 otherwise), then groupby:

start, end = pd.to_datetime(['2020/03/01','2020/03/31'])

(df.assign(QuantitySub=df['OrderDate'].between(start,end)*df.Quantity)
   .groupby('VipNo')[['Quantity','QuantitySub']]
   .sum()
   .assign(output=lambda x: x['QuantitySub']/x['Quantity'])
   .drop('QuantitySub', axis=1)
)

With this data frame:

   VipNo  Quantity  OrderDate
0      0       105 2020-01-07
1      0        56 2020-03-04
2      1       167 2020-09-05
3      1        18 2020-05-08
4      2       151 2020-11-01
5      2        14 2020-03-17

The output is:

       Quantity    output
VipNo            
0           161  0.347826
1           185  0.000000
2           165  0.084848
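
The question asks for a new column on the original rows, while the result above has one row per VipNo. A minimal sketch of one way to write the ratio back onto every order, assuming `groupby(...).transform('sum')` is acceptable for your data size; the rows are copied from the sample frame above, and the helper names `in_range_qty` and `by_vip` are only illustrative:

```python
import pandas as pd

# Sample rows copied from the data frame shown above.
df = pd.DataFrame({
    'VipNo': [0, 0, 1, 1, 2, 2],
    'Quantity': [105, 56, 167, 18, 151, 14],
    'OrderDate': pd.to_datetime(['2020-01-07', '2020-03-04', '2020-09-05',
                                 '2020-05-08', '2020-11-01', '2020-03-17']),
})

start, end = pd.to_datetime(['2020/03/01', '2020/03/31'])

# Keep Quantity for orders inside the range, 0 for all other orders.
in_range_qty = df['Quantity'].where(df['OrderDate'].between(start, end), 0)

# transform('sum') returns one value per original row, so the per-customer
# sums line up with df and can be divided element-wise.
by_vip = df['VipNo']
df['qtywithin1mon/totalqty'] = (
    in_range_qty.groupby(by_vip).transform('sum')
    / df['Quantity'].groupby(by_vip).transform('sum')
)
print(df)
```

Every order of the same customer gets the same ratio. Note that `between` includes both endpoints by default, and `end` here is midnight on 2020-03-31, so timestamps later on that day would fall outside the range.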
