How to split a DataFrame into multiple DataFrames after a specific column value is found
I am trying to split a DataFrame in the following format into multiple DataFrames based on a specific value.
Column0             Column1    Column2  Column3
Question            Answer     Reason   30
It is received?     XXX        YYY      27
Deducted            FDF        RES      64
Transferred?        WWW        RRR      64
Transport Services  Passgener  Carrier  30
Distance            KKK        WDF      27
Return              PPP        LMN      64
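In the DataFrame above, I want to split off the rows starting at Event2, that is, at a row where Code = 30 (a specific color code or header code), into a separate DataFrame, and put the remaining rows (the ones above it) into another DataFrame. There may also be more than two events.
I have tried a few snippets, but most of them are meant for filtering.
Expected output:
Dataframe1: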
Question         Answer  Reason  30
It is received?  XXX     YYY     27
Deducted         FDF     RES     64
Transferred?     WWW     RRR     64
Dataframe2:
Transport Services  Passgener  Carrier  30
Distance            KKK        WDF      27
Return              PPP        LMN      64
Please help, as I am new to Python.
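You can groupby on a temporary helper column derived from Code to separate out the different DataFrames and add them to a dictionary. I am assuming you mean the data actually looks like this, to match the original schema: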
             Question     Answer   Reason  Code
0     It is received?        XXX      YYY    30
1            Deducted        FDF      RES    64
2        Transferred?        WWW      RRR    64
3  Transport Services  Passgener  Carrier    30
4            Distance        KKK      WDF    27
5              Return        PPP      LMN    64
If so, you can do the following:
# Keep Question only on header rows (Code == 30), then forward-fill it down each block
df['tmp'] = df['Question'].where(df['Code'] == 30).ffill()
To get:
             Question     Answer   Reason  Code                 tmp
0     It is received?        XXX      YYY    30     It is received?
1            Deducted        FDF      RES    64     It is received?
2        Transferred?        WWW      RRR    64     It is received?
3  Transport Services  Passgener  Carrier    30  Transport Services
4            Distance        KKK      WDF    27  Transport Services
5              Return        PPP      LMN    64  Transport Services
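To try this step on its own, here is a minimal sketch that rebuilds the sample table and derives the helper column. It assumes the data has the four columns shown above and that header rows are exactly those with Code equal to 30. It uses `Series.where` followed by `ffill`, which should give the same tmp values as a row-wise `apply`.

```python
import pandas as pd

# Sample data copied from the table above
df = pd.DataFrame({
    "Question": ["It is received?", "Deducted", "Transferred?",
                 "Transport Services", "Distance", "Return"],
    "Answer": ["XXX", "FDF", "WWW", "Passgener", "KKK", "PPP"],
    "Reason": ["YYY", "RES", "RRR", "Carrier", "WDF", "LMN"],
    "Code": [30, 64, 64, 30, 27, 64],
})

# Keep the Question text only on header rows (Code == 30); other rows become NaN.
# Forward-filling then copies each header's text down to the rows of its block.
df["tmp"] = df["Question"].where(df["Code"] == 30).ffill()

print(df)
```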
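From here, you can enumerate the groups in the tmp column and add the results to a dictionary with integer keys:
questions = {}
# sort=False keeps the groups in order of first appearance rather than alphabetical order
for e, (event, data) in enumerate(df.groupby('tmp', sort=False)):
    questions[e] = data.drop('tmp', axis=1)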
questions[0]
          Question  Answer  Reason  Code
0  It is received?     XXX     YYY    30
1         Deducted     FDF     RES    64
2     Transferred?     WWW     RRR    64
questions[1]
             Question     Answer   Reason  Code
3  Transport Services  Passgener  Carrier    30
4            Distance        KKK      WDF    27
5              Return        PPP      LMN    64
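One caveat: the approach above keys each block on the header text, so two header rows with identical Question text would be merged into one group. A possible alternative, sketched here under the same assumption that a block starts wherever Code equals 30, numbers the blocks with a cumulative sum instead. The function name `split_on_code` and its parameters are hypothetical, chosen only for this illustration.

```python
import pandas as pd

def split_on_code(df, code=30, column="Code"):
    """Split df into consecutive blocks, each starting at a row where df[column] == code."""
    # True on header rows; the running total is 0 before the first header,
    # then 1, 2, ... so every row gets the number of its block.
    block_id = (df[column] == code).cumsum().rename("block")
    # Integer keys sort in order of appearance, so the blocks come back in row order.
    return [block for _, block in df.groupby(block_id)]

# Sample data copied from the answer's table
df = pd.DataFrame({
    "Question": ["It is received?", "Deducted", "Transferred?",
                 "Transport Services", "Distance", "Return"],
    "Answer": ["XXX", "FDF", "WWW", "Passgener", "KKK", "PPP"],
    "Reason": ["YYY", "RES", "RRR", "Carrier", "WDF", "LMN"],
    "Code": [30, 64, 64, 30, 27, 64],
})

frames = split_on_code(df)
for i, frame in enumerate(frames, start=1):
    print(f"Dataframe{i}:")
    print(frame)
```

Rows that come before the first header row, if there are any, end up in a block of their own. No helper column is added to the original DataFrame, so nothing needs to be dropped afterwards.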