Iterating over a pandas DataFrame with groupby and selecting values based on a condition within each group

Question

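I have a large DataFrame that contains many groups. I want to go through each group and sum that group's values depending on whether a particular condition is met.

My DataFrame looks like this: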

 Item_Num   Price_Change   Unit_Sales
 10                 True           10
 10                 False          15
 10                 False          11
 10                 False          13
 12                 True           10 
 12                 False          11
 12                 False          14
 12                 True           11
 12                 False          11

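For each Item_Num group, I want to record the sum of unit sales starting at a price-change row and running until the next price change occurs. So I want a result like this: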

 0 Item_Num   Price_Change   Unit_Sales  Sum 
 1 10                 True           10   49
 2 10                 False          15  
 3 10                 False          11
 4 10                 False          13
 5 12                 True           10   35
 6 12                 False          11
 7 12                 False          14
 8 12                 True           11   22
 9 12                 False          11

(So I get 49 by summing rows 1 to 4, 35 by summing rows 5 to 7, and 22 by summing rows 8 and 9.)

Here is what I have so far (as a rough sketch):

 for name, group in new.groupby('UPC'):
     if ['Price_Change'] == True:
          sum(unit_sales until next price change)

What is the best way to iterate over each group (my approach can probably be improved), and how do I select the rows where Price_Change == True?

Answer 1

This is very close to your previous question :-)

df['New']=df.groupby([df['Item_Num'],df['Price_Change'].cumsum()])['Unit_Sales'].transform('sum')
df
Out[15]: 
   Item_Num  Price_Change  Unit_Sales  New
0        10          True          10   49
1        10         False          15   49
2        10         False          11   49
3        10         False          13   49
4        12          True          10   35
5        12         False          11   35
6        12         False          14   35
7        12          True          11   22
8        12         False          11   22
df.New=df.New.where(df['Price_Change'],'')
df
Out[17]: 
   Item_Num  Price_Change  Unit_Sales New
0        10          True          10  49
1        10         False          15    
2        10         False          11    
3        10         False          13    
4        12          True          10  35
5        12         False          11    
6        12         False          14    
7        12          True          11  22
8        12         False          11
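
For readers who want to see why this works, here is a minimal self-contained sketch of the same idea, rebuilt on the sample data from the question. The names block_id and block_sum are hypothetical helpers introduced only for illustration, and the result column is called Sum as in the question. Unlike the snippet above, this variant leaves NaN rather than an empty string on the non-price-change rows, which is an assumption about what is wanted; it keeps the column numeric.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Item_Num": [10, 10, 10, 10, 12, 12, 12, 12, 12],
    "Price_Change": [True, False, False, False, True, False, False, True, False],
    "Unit_Sales": [10, 15, 11, 13, 10, 11, 14, 11, 11],
})

# Running count of price changes: each True bumps the counter, so all rows
# up to the next True share the same id. (block_id is a hypothetical name.)
block_id = df["Price_Change"].cumsum()

# Sum Unit_Sales within each (Item_Num, block) pair and broadcast the sum
# back onto every row of that block.
block_sum = df.groupby([df["Item_Num"], block_id])["Unit_Sales"].transform("sum")

# Keep the sum only on the price-change rows; the other rows become NaN.
df["Sum"] = block_sum.where(df["Price_Change"])

print(df)
```

Including Item_Num in the grouping keys keeps a block from spilling over into the next item when an item's first row is not a price change.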


Answer 1 (2018-01-29 21:58:49)
