Conditional sum from rows into a new column in pandas
I am looking to create a new column in pandas based on the values in each row. My sample data:
import pandas as pd

df = pd.DataFrame({"A": ['a','a','a','a','a','a','b','b','b','b'],
                   "Sales": [2,3,7,1,4,3,5,6,9,10],
                   "Week": [1,2,3,4,5,11,1,2,3,4]})
I want a new column "Last3WeekSales" for each week, holding the sum of sales over the previous 3 weeks.
NOTE: shift() alone won't work here, as data for some weeks is missing.
Logic I had in mind: check the week number w in each row, then sum the sales from weeks w-1, w-2 and w-3.
Output required:
A Week Last3WeekSales
0 a 1 0
1 a 2 2
2 a 3 5
3 a 4 12
4 a 5 11
5 a 11 0
6 b 1 0
7 b 2 5
8 b 3 11
9 b 4 20
Use groupby, shift and rolling:
# transform keeps the original row index, so the result can be assigned directly
df['Last3WeekSales'] = (
    df.groupby('A')['Sales']
      .transform(lambda x: x.shift(1).rolling(3, min_periods=1).sum())
      .fillna(0)
)
Output:
A Sales Week Last3WeekSales
0 a 2 1 0.0
1 a 3 2 2.0
2 a 7 3 5.0
3 a 1 4 12.0
4 a 4 5 11.0
5 a 3 6 12.0
6 b 5 1 0.0
7 b 6 2 5.0
8 b 9 3 11.0
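Note that the rolling window counts rows, not weeks, so a gap such as week 5 to week 11 in the question's data would still pull in the sales of weeks 3 to 5. Below is a minimal sketch of one way to follow the week-number logic from the question instead. It assumes that Week holds integers, that each (A, Week) pair appears only once, and that a missing week means zero sales; the helper name last3_by_week is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["a"] * 6 + ["b"] * 4,
    "Sales": [2, 3, 7, 1, 4, 3, 5, 6, 9, 10],
    "Week": [1, 2, 3, 4, 5, 11, 1, 2, 3, 4],
})

def last3_by_week(group):
    """Hypothetical helper: sum of Sales over the 3 calendar weeks before each row."""
    # Index sales by week number and insert the missing weeks with 0 sales
    sales = group.set_index("Week")["Sales"]
    all_weeks = range(sales.index.min(), sales.index.max() + 1)
    full = sales.reindex(all_weeks, fill_value=0)
    # Shift by one week so the current week is excluded, then sum a 3-week window
    prev3 = full.shift(1).rolling(3, min_periods=1).sum().fillna(0)
    # Keep only the weeks that exist in the data and restore the original row index
    return pd.Series(prev3.loc[sales.index].to_numpy(), index=group.index)

# Process each group separately and stitch the pieces back together by row index
parts = [last3_by_week(g) for _, g in df.groupby("A")]
df["Last3WeekSales"] = pd.concat(parts).round().astype(int)
print(df)
```

With the sample data this gives 0 for week 11 of group a, as in the required output, because weeks 8 to 10 are filled with zero sales before the window is applied.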
You can use rolling(3).sum() (the successor to pandas.rolling_sum, which was removed from later pandas releases) to sum over the last 3 values, and shift(n) to shift your column by n rows (1 in your case).
Assuming you have a column 'sales' with the sales of each week, the code would be:
df["Last3WeekSales"] = df.groupby("A")["sales"].transform(lambda x: x.shift(1).rolling(3).sum())
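One thing to watch for: a 3-row window with the default min_periods yields NaN until three earlier values exist, whereas min_periods=1 sums whatever is available. A small sketch on a made-up Series (the values are only illustrative) shows the difference:

```python
import pandas as pd

# Illustrative weekly sales for a single group
s = pd.Series([2, 3, 7, 1, 4])

# Default min_periods equals the window size: NaN until 3 previous rows exist
strict = s.shift(1).rolling(3).sum()

# min_periods=1 sums the rows that are available; the first row has none, so fill with 0
lenient = s.shift(1).rolling(3, min_periods=1).sum().fillna(0)

print(pd.DataFrame({"sales": s, "strict": strict, "lenient": lenient}))
```

Which variant fits depends on whether a partial history should count as a valid "last 3 weeks" total.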