Adding missing rows from one dataframe to another based on condition
Adding rows based on condition - Dataframe
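Use Series.isin to filter on multiple values, aggregate with sum, and add the location column with assign.
Finally, append the result to the rows of the original DataFrame that do not match the mask: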
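import pandas as pd

# Boolean mask for the rows that should be merged into 'Stage Area'
mask = df['location'].isin(["Repley's Area - New Commercial Area", 'Cultural Hub'])
# Sum per day and location type, label the result, and restore df's column order
df1 = (df[mask].groupby(['day', 'locationTypes'], as_index=False)[['dwell', 'football']]
       .sum()
       .assign(location='Stage Area')
       .reindex(df.columns, axis=1))
# Keep the unmatched rows and append the aggregated ones
df = pd.concat([df[~mask], df1], ignore_index=True)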
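To make the steps above concrete, here is a minimal, self-contained sketch on made-up data. The sample rows, their values, and the 'Main Gate' location are assumptions for illustration only; they are not the data from the question.

```python
import pandas as pd

# Hypothetical sample data; all values are invented for illustration.
df = pd.DataFrame({
    "day": ["2020-11-11", "2020-11-11", "2020-11-11", "2020-11-12", "2020-11-12"],
    "location": [
        "Repley's Area - New Commercial Area", "Cultural Hub", "Main Gate",
        "Repley's Area - New Commercial Area", "Cultural Hub",
    ],
    "locationTypes": ["Zone"] * 5,
    "dwell": [10, 20, 5, 30, 40],
    "football": [100, 200, 50, 300, 400],
})

# Rows whose location should be merged into a single 'Stage Area' row per group.
to_merge = ["Repley's Area - New Commercial Area", "Cultural Hub"]
mask = df["location"].isin(to_merge)

# Aggregate the matching rows, label them, and restore the original column order.
df1 = (
    df[mask]
    .groupby(["day", "locationTypes"], as_index=False)[["dwell", "football"]]
    .sum()
    .assign(location="Stage Area")
    .reindex(df.columns, axis=1)
)

# Unmatched rows first, aggregated rows appended after them.
result = pd.concat([df[~mask], df1], ignore_index=True)
print(result)
```

The 'Main Gate' row passes through unchanged, and each day gets one 'Stage Area' row holding the summed values, so the column totals before and after are the same.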
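Jezrael's answer looks very close, but the aggregation of football may not be right... I am judging only from his code, so I could be wrong.
The correct version looks like this, and it matches the numbers you suggested in your example. I made a smaller version of the sample table for testing. Here "data" is your dataframe.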
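mask = data["location"].isin(["Repley's Area - New Commercial Area", "Cultural Hub"])
# Select the value columns with a list (double brackets); tuple-style selection fails in pandas 2.0+
data[mask].groupby(["day", "locationTypes"], as_index=False)[["dwell", "football"]].sum().assign(location="Stage Area")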
Output:
          day locationTypes  dwell  football    location
0  2020-11-11          Zone    145      2307  Stage Area
1  2020-11-12          Zone     95      2905  Stage Area
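Since both answers follow the same pattern, it can be wrapped in a small helper. The sketch below is only an illustration: `merge_locations` and its parameters are hypothetical names, not part of pandas, and the sample numbers are invented rather than taken from the question. The value columns are selected with a list, because selecting several groupby columns with a bare tuple was deprecated in pandas 1.0 and no longer works in pandas 2.0.

```python
import pandas as pd


def merge_locations(data, old_names, new_name, value_cols):
    """Hypothetical helper (not a pandas API).

    Sums value_cols per (day, locationTypes) over the rows whose location
    is in old_names, labels the sums new_name, and returns them appended
    to the rows that did not match.
    """
    mask = data["location"].isin(old_names)
    merged = (
        data[mask]
        .groupby(["day", "locationTypes"], as_index=False)[list(value_cols)]
        .sum()
        .assign(location=new_name)
        # Match the original column order; columns that were not
        # aggregated would come back as NaN here.
        .reindex(data.columns, axis=1)
    )
    return pd.concat([data[~mask], merged], ignore_index=True)


# Invented sample table for demonstration only.
data = pd.DataFrame({
    "day": ["2020-11-11"] * 3 + ["2020-11-12"] * 2,
    "location": [
        "Repley's Area - New Commercial Area", "Cultural Hub", "Food Court",
        "Repley's Area - New Commercial Area", "Cultural Hub",
    ],
    "locationTypes": ["Zone"] * 5,
    "dwell": [1, 2, 3, 4, 5],
    "football": [10, 20, 30, 40, 50],
})

out = merge_locations(
    data,
    old_names=["Repley's Area - New Commercial Area", "Cultural Hub"],
    new_name="Stage Area",
    value_cols=["dwell", "football"],
)
print(out)
```

Passing the column names as arguments keeps the helper usable with the asker's real column names ('footfall' and 'dwell (minutes)') without editing the function body.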
Thanks for your replies! The following works:
# boolean mask for the rows to merge into 'Stage Area'
mask = df['location'].isin(["Repley's Area - New Commercial Area", 'Cultural Hub'])
# select the value columns with a list; tuple-style selection fails in pandas 2.0+
df1 = df[mask].groupby(['day', 'locationTypes'], as_index=False)[['footfall', 'dwell (minutes)']].sum().assign(location='Stage Area')
# reordering the columns for pd.concat
df1 = df1[df.columns]
df_final = pd.concat([df[~mask], df1])
# checking the result
df_final[(df_final['day'] == '2020-11-11') & (df_final['location'] == 'Stage Area')]
# This gives: