How to split a DataFrame into multiple DataFrames after a specific column value is found
I am trying to split a DataFrame in the following format into multiple DataFrames based on a specific value.
Column0             Column1    Column2  Column3
Question            Answer     Reason   30
It is received?     XXX        YYY      27
Deducted            FDF        RES      64
Transferred?        WWW        RRR      64
Transport Services  Passgener  Carrier  30
Distance            KKK        WDF      27
Return              PPP        LMN      64
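In the DataFrame above, I want to split the rows starting from Event2, that is, from Code = 30 (a specific color code or header code), into a separate DataFrame, and put the remaining rows (the ones above it) into another DataFrame. There may also be more than two events.
I have tried a few pieces of code, but most of them are meant for filtering.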
The expected output is:
Dataframe1:
Question            Answer     Reason   30
It is received?     XXX        YYY      27
Deducted            FDF        RES      64
Transferred?        WWW        RRR      64
Dataframe2:
Transport Services  Passgener  Carrier  30
Distance            KKK        WDF      27
Return              PPP        LMN      64
Please help, as I am new to Python.
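You can groupby on a temporary helper column based on Code to separate out the different DataFrames and add them to a dictionary. I am assuming the data actually looks like this, to match the original schema: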
   Question            Answer     Reason   Code
0  It is received?     XXX        YYY      30
1  Deducted            FDF        RES      64
2  Transferred?        WWW        RRR      64
3  Transport Services  Passgener  Carrier  30
4  Distance            KKK        WDF      27
5  Return              PPP        LMN      64
If so, you can do the following:
import numpy as np
df['tmp'] = df.apply(lambda x: x.Question if x.Code == 30 else np.nan, axis=1).ffill()
To get:
   Question            Answer     Reason   Code  tmp
0  It is received?     XXX        YYY      30    It is received?
1  Deducted            FDF        RES      64    It is received?
2  Transferred?        WWW        RRR      64    It is received?
3  Transport Services  Passgener  Carrier  30    Transport Services
4  Distance            KKK        WDF      27    Transport Services
5  Return              PPP        LMN      64    Transport Services
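For anyone who wants to reproduce this step, here is a minimal sketch. It rebuilds the sample frame from the table above and fills the helper column with Series.where followed by ffill, which is a vectorized stand-in for the row-wise apply and not the exact call used in the answer. The column names are taken from the assumed schema.

```python
import pandas as pd

# Sample data copied from the table above (assumed schema).
df = pd.DataFrame({
    "Question": ["It is received?", "Deducted", "Transferred?",
                 "Transport Services", "Distance", "Return"],
    "Answer": ["XXX", "FDF", "WWW", "Passgener", "KKK", "PPP"],
    "Reason": ["YYY", "RES", "RRR", "Carrier", "WDF", "LMN"],
    "Code": [30, 64, 64, 30, 27, 64],
})

# Keep Question only on header rows (Code == 30); other rows become missing.
# Forward-filling then copies each header's label down to the rows below it.
df["tmp"] = df["Question"].where(df["Code"] == 30).ffill()

print(df)
```

Any rows that appear before the first Code == 30 row would keep a missing value in tmp, and groupby drops missing keys by default, so this sketch assumes the data starts with a header row.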
From here, you can enumerate the groups in the tmp column and add the results to a dictionary with integer keys:
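questions = {}
# sort=False keeps the groups in order of first appearance instead of sorting the labels
for e, (event, data) in enumerate(df.groupby('tmp', sort=False)):
    questions[e] = data.drop('tmp', axis=1)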
questions[0]
   Question            Answer     Reason   Code
0  It is received?     XXX        YYY      30
1  Deducted            FDF        RES      64
2  Transferred?        WWW        RRR      64
questions[1]
   Question            Answer     Reason   Code
3  Transport Services  Passgener  Carrier  30
4  Distance            KKK        WDF      27
5  Return              PPP        LMN      64
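One caveat with grouping on the Question text: two header rows that happen to share the same text would be merged into a single group. A possible alternative, sketched below under the same assumed schema, numbers the blocks with a cumulative sum over the Code == 30 marker. The names block_id and frames are illustrative and do not come from the original answer.

```python
import pandas as pd

# Sample data copied from the assumed schema above.
df = pd.DataFrame({
    "Question": ["It is received?", "Deducted", "Transferred?",
                 "Transport Services", "Distance", "Return"],
    "Answer": ["XXX", "FDF", "WWW", "Passgener", "KKK", "PPP"],
    "Reason": ["YYY", "RES", "RRR", "Carrier", "WDF", "LMN"],
    "Code": [30, 64, 64, 30, 27, 64],
})

# Every Code == 30 row starts a new block. The running count of those rows
# gives each row an integer block id, independent of the Question text.
block_id = (df["Code"] == 30).cumsum().rename("block")

# Integer ids sort in order of appearance, so enumerate yields blocks in
# the same order as they occur in the original frame.
frames = {i: part for i, (_, part) in enumerate(df.groupby(block_id))}

print(frames[0])
print(frames[1])
```

With this approach no helper column is added to df, so nothing has to be dropped afterwards. Rows that precede the first Code == 30 row would get block id 0 and end up in a block of their own.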