Column aggregates filtered by row values with pandas DataFrame
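Is there a better (faster) way to do this?
For each person on a given day, I want to find the total sales made at the place where that person was that day: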
day name sold place
0 mon Ben 2 1
1 mon Amy 6 0
2 mon Sue 7 1
3 mon John 9 0
4 tues Ben 9 1
5 tues Amy 4 0
6 tues Sue 10 1
7 tues John 5 0
8 wed Ben 8 0
9 wed Amy 3 0
10 wed Sue 10 1
11 wed John 3 0
The result should look like this:
day name sold place sold_at_same_place
0 mon Ben 2 1 9
1 mon Amy 6 0 15
2 mon Sue 7 1 9
3 mon John 9 0 15
4 tues Ben 9 1 19
5 tues Amy 4 0 9
6 tues Sue 10 1 19
7 tues John 5 0 9
8 wed Ben 8 0 14
9 wed Amy 3 0 14
10 wed Sue 10 1 10
11 wed John 3 0 14
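In case that is not clear: on Monday, the total sold at place 1 is 2 + 7 = 9. Because Ben was at place 1, his sold_at_same_place is 9. Amy's Monday sold_at_same_place is 15 because she was at place 0 (6 + 9 = 15).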
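As a quick check of that arithmetic, here is a minimal sketch that rebuilds the sample data and sums Monday's sales per place with a boolean mask. The variable names (monday, mon_place_1, mon_place_0) are only illustrative, and sold and place are assumed to be integer columns.

```python
import pandas as pd

# Sample data copied from the table above.
df = pd.DataFrame({
    "day": ["mon"] * 4 + ["tues"] * 4 + ["wed"] * 4,
    "name": ["Ben", "Amy", "Sue", "John"] * 3,
    "sold": [2, 6, 7, 9, 9, 4, 10, 5, 8, 3, 10, 3],
    "place": [1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0],
})

# Restrict to Monday, then sum sold for each place with a boolean mask.
monday = df[df["day"] == "mon"]
mon_place_1 = monday.loc[monday["place"] == 1, "sold"].sum()  # 2 + 7
mon_place_0 = monday.loc[monday["place"] == 0, "sold"].sum()  # 6 + 9

print(mon_place_1, mon_place_0)  # 9 15
```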
This is what I came up with:
Get the daily totals for each value of place:
import pandas as pd

def sold_by_day_filter(df, col_name, field_value):
    """ sums sold by day filtering the `col_name` on `field_value` """
    # Keep only the rows where col_name equals field_value
    subset = pd.DataFrame(df[df[col_name] == field_value])
    # One row per day; the totals column is named after field_value
    aggregated_subset = pd.DataFrame(
        {str(field_value): subset.groupby(['day'])['sold'].sum()}
    ).reset_index()
    return aggregated_subset
Join each of those onto the original dataset:
for val in df['place'].unique():
    df = pd.merge(df, sold_by_day_filter(df, 'place', val), on='day')
Now the dataset looks like this:
day name sold place 1 0
0 mon Ben 2 1 9 15
1 mon Amy 6 0 9 15
2 mon Sue 7 1 9 15
3 mon John 9 0 9 15
4 tues Ben 9 1 19 9
5 tues Amy 4 0 19 9
6 tues Sue 10 1 19 9
7 tues John 5 0 19 9
8 wed Ben 8 0 10 14
9 wed Amy 3 0 10 14
10 wed Sue 10 1 10 14
11 wed John 3 0 10 14
Set sold_at_same_place from whichever temporary column matches the row's value in place:
df['sold_at_same_place'] = df.apply(
    lambda row: row[str(row['place'])], axis=1)
Drop the temporary columns ('1' and '0'):
fields_to_drop = [str(field) for field in df['place'].unique()]
df.drop(fields_to_drop, axis=1, inplace=True)
So this works, but I feel there is probably a simpler way to do it with pandas. Any suggestions are appreciated!
I think this is a use case for transform:
>>> df["sold_at_same_place"] = df.groupby(["day", "place"])["sold"].transform("sum")
>>> df
day name sold place sold_at_same_place
0 mon Ben 2 1 9
1 mon Amy 6 0 15
2 mon Sue 7 1 9
3 mon John 9 0 15
4 tues Ben 9 1 19
5 tues Amy 4 0 9
6 tues Sue 10 1 19
7 tues John 5 0 9
8 wed Ben 8 0 14
9 wed Amy 3 0 14
10 wed Sue 10 1 10
11 wed John 3 0 14
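transform takes the result of the groupby aggregation and broadcasts it back onto the original index, so every row receives the total for its own (day, place) group.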
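To make the broadcasting concrete, here is a small sketch on the sample data from the question. It compares a plain aggregation (one row per group) with transform (one value per original row), and also shows the manual aggregate-then-merge route that transform stands in for. The names grouped, totals, broadcast and merged are illustrative only.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "day": ["mon"] * 4 + ["tues"] * 4 + ["wed"] * 4,
    "name": ["Ben", "Amy", "Sue", "John"] * 3,
    "sold": [2, 6, 7, 9, 9, 4, 10, 5, 8, 3, 10, 3],
    "place": [1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0],
})

grouped = df.groupby(["day", "place"])["sold"]

# Aggregation collapses the frame to one row per (day, place) group.
totals = grouped.sum()

# transform returns one value per original row, aligned to df.index.
broadcast = grouped.transform("sum")

# The same column built by hand: merge the group totals back on the keys.
merged = df.merge(
    totals.rename("sold_at_same_place").reset_index(),
    on=["day", "place"],
    how="left",
)

print(len(totals), len(broadcast))  # 6 12
print(broadcast.tolist() == merged["sold_at_same_place"].tolist())  # True
```

Because the transformed Series shares the index of df, it can be assigned directly as a new column without any merge or temporary columns.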