Python Pandas: checking whether a value from one DataFrame exists in another DataFrame
I have an Excel file in which I use each column as a DataFrame. Here are the 5 DataFrames that I am using (I am adding a row number column to make my question easier to follow; it is not included in the original file):
row_no svc_no i_status caller_id f_status remarks
1 11111 WO 22222 WO
2 22222 WO 11111 WO
3 33333 WO n/a FA
4 NULL FA 33333 WO
5 444444 WO 55555 WO
6 55555 WO new_num WO
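For experimenting with the conditions below, the sample table can be rebuilt in memory. This is only a sketch: it assumes every value, including NULL, n/a and new_num, is stored as a plain string, which may not match how the real Excel file is read.

```python
import pandas as pd

# Sample rows from the table above. All values are kept as strings
# (an assumption) so that 'NULL', 'n/a' and 'new_num' compare as text.
df = pd.DataFrame({
    "svc_no":    ["11111", "22222", "33333", "NULL", "444444", "55555"],
    "i_status":  ["WO", "WO", "WO", "FA", "WO", "WO"],
    "caller_id": ["22222", "11111", "n/a", "33333", "55555", "new_num"],
    "f_status":  ["WO", "WO", "FA", "WO", "WO", "WO"],
})

# Empty remarks column, to be filled in once a scenario matches.
df["remarks"] = ""

print(df)
```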
I need to put a value in the remarks column once the conditions are satisfied. In this case there are 3 scenarios, each with different conditions.
Note: The position of the rows may vary in the real data, but for my example I put them consecutively.
Scenario 1 conditions (rows 1 and 2):
- svc_no is not equal to caller_id
- svc_no is not NULL
- caller_id is not n/a
- svc_no is in caller_id and vice versa
- i_status and f_status are WO

Scenario 2 conditions (rows 3 and 4):
- svc_no is not equal to caller_id
- svc_no is in caller_id and vice versa
- a svc_no value is paired with n/a in caller_id, while the same value in caller_id is paired with NULL in svc_no
- i_status or f_status is FA where the corresponding value is NULL or n/a

Scenario 3 conditions (rows 5 and 6):
- svc_no is not equal to caller_id
- svc_no is 6 numerical characters
- caller_id is new_num
- i_status and f_status are WO
- svc_no is in caller_id and vice versa

Now let's say the conditions for a scenario are satisfied; I then have to put a designated value in remarks. So my desired output would be:
row_no svc_no i_status caller_id f_status remarks
1 11111 WO 22222 WO S1 Transpose
2 22222 WO 11111 WO S1 Transpose
3 33333 WO n/a FA S2 Transpose
4 NULL FA 33333 WO S2 Transpose
5 444444 WO 55555 WO S3 Transpose
6 55555 WO new_num WO S3 Transpose
My problem is that even though my code runs and follows the conditions, the output is not accurate. Here is my code:
# Scenario 1
df.loc[(df['svc_no'] != df['caller_id']) &
       (df['svc_no'].isin(df['caller_id'])) &
       (df['caller_id'].isin(df['svc_no'])) &
       (df['svc_no'] != 'NULL') &
       (df['caller_id'] != 'n/a') &
       (df['i_status'] == 'WO') &
       (df['f_status'] == 'WO'), ['remarks']] = 'S1 Transpose'

# Scenario 2
# NULL svc_no
df.loc[(df['svc_no'] == 'NULL') &
       (df['caller_id'] != 'n/a') &
       (df['svc_no'].isin(df['caller_id'])) &
       (df['caller_id'].isin(df['svc_no'])) &
       (df['i_status'] == 'FA') &
       (df['f_status'] == 'WO'), ['remarks']] = 'S2 Transpose'
# n/a in caller_id
df.loc[(df['svc_no'] != 'NULL') &
       (df['caller_id'] == 'n/a') &
       (df['svc_no'].isin(df['caller_id'])) &
       (df['caller_id'].isin(df['svc_no'])) &
       (df['i_status'] == 'WO') &
       (df['f_status'] == 'FA'), ['remarks']] = 'S2 Transpose'

# Scenario 3
df.loc[(c_merge['svc_no'] != 'NULL') &
       (df['svc_no'].isin(c_merge['caller_id'])) &
       (df['caller_id'].isin(c_merge['svc_no'])) &
       (df['i_status'] == 'WO') &
       (df['caller_id'] != c_merge['svc_no']) &
       (df['f_status'] == 'WO') &
       (df['caller_id'] == 'new_num'), ['remarks']] = s3_wo_wo
The output that I am getting is:
row_no svc_no i_status caller_id f_status remarks
1 11111 WO 22222 WO S1 Transpose
2 22222 WO 11111 WO S1 Transpose
3 33333 WO n/a FA S1 Transpose
4 NULL FA 33333 WO S1 Transpose
5 444444 WO 55555 WO
6 55555 WO new_num WO S3 Transpose
S1 Transpose is also written to the rows that should get S2 Transpose, and S3 Transpose is only written to one row.
Is there a way in which I can group the 2 rows that satisfy the conditions? Or is there a workaround for my code so that the value is applied only to the specific rows affected?
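One possible way to treat the two rows as a pair, sketched here under assumptions, is to look up each row's partner (the row whose svc_no equals this row's caller_id) and check that the partner points back. The names caller_by_svc, partner_caller and is_swapped are hypothetical, and the sketch assumes svc_no values are unique, because Series.map with a Series needs a unique index.

```python
import pandas as pd

df = pd.DataFrame({
    "svc_no":    ["11111", "22222", "33333", "NULL"],
    "caller_id": ["22222", "11111", "n/a", "33333"],
})

# Lookup table: svc_no -> caller_id on the same row (assumes unique svc_no).
caller_by_svc = df.set_index("svc_no")["caller_id"]

# For each row, the caller_id of the row whose svc_no equals this row's
# caller_id; NaN when no such row exists.
partner_caller = df["caller_id"].map(caller_by_svc)

# A row belongs to a swapped pair when its partner points back at it.
is_swapped = partner_caller.eq(df["svc_no"]) & df["svc_no"].ne(df["caller_id"])

print(df[is_swapped])
```

Unlike a plain isin check, this ties the match to one specific partner row, so only rows 1 and 2 of the sample are selected here.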
I answered this by using df.loc separately for each row, as follows.
I was able to achieve my output without grouping the two rows that contain the transposed value.
In the line (df['svc_no'].isin(df['caller_id'])), I check whether the value in svc_no exists in caller_id, and I created another df.loc for the other row.
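To see why one isin direction per assignment can matter, it helps to print both membership masks for the sample rows. This is a small illustration on the example data only; isin tests each value against the whole other column, not against a particular partner row.

```python
import pandas as pd

df = pd.DataFrame({
    "svc_no":    ["11111", "22222", "33333", "NULL", "444444", "55555"],
    "caller_id": ["22222", "11111", "n/a", "33333", "55555", "new_num"],
})

# True where the row's svc_no appears in ANY row of caller_id.
svc_in_caller = df["svc_no"].isin(df["caller_id"])
# True where the row's caller_id appears in ANY row of svc_no.
caller_in_svc = df["caller_id"].isin(df["svc_no"])

# Requiring both directions on the same row.
both = svc_in_caller & caller_in_svc

print(pd.DataFrame({
    "svc_in_caller": svc_in_caller,   # [True, True, True, False, False, True]
    "caller_in_svc": caller_in_svc,   # [True, True, False, True, True, False]
    "both": both,                     # [True, True, False, False, False, False]
}))
```

On these rows only the scenario 1 pair passes both checks at once. The scenario 2 and scenario 3 rows each pass just one direction, which is consistent with handling the two rows of a pair in separate df.loc statements.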
Scenario 1:
# Rows whose svc_no appears somewhere in caller_id
df.loc[(c_merge['svc_no'] != 'NULL') &
       (df['i_status'] == 'WO') &
       (df['caller_id'] != 'n/a') &
       (df['f_status'] == 'WO') &
       (df['svc_no'] != df['caller_id']) &
       (df['svc_no'].isin(df['caller_id'])), ['remarks']] = 'S1 Transpose'

# Rows whose caller_id appears somewhere in svc_no
df.loc[(c_merge['svc_no'] != 'NULL') &
       (df['i_status'] == 'WO') &
       (df['caller_id'] != 'n/a') &
       (df['f_status'] == 'WO') &
       (df['svc_no'] != df['caller_id']) &
       (df['caller_id'].isin(df['svc_no'])), ['remarks']] = 'S1 Transpose'
I will apply this to the other scenarios, as I think this is the way to do it.
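A rough sketch of how the same per-row idea might extend to all three scenarios on the sample rows is shown below. The masks are one reading of the stated conditions, not the author's final code; the helper names (differs, wo_wo, s1, s2_na, s2_null, s3_six, s3_new) are hypothetical, and all values are assumed to be strings.

```python
import pandas as pd

df = pd.DataFrame({
    "svc_no":    ["11111", "22222", "33333", "NULL", "444444", "55555"],
    "i_status":  ["WO", "WO", "WO", "FA", "WO", "WO"],
    "caller_id": ["22222", "11111", "n/a", "33333", "55555", "new_num"],
    "f_status":  ["WO", "WO", "FA", "WO", "WO", "WO"],
})
df["remarks"] = ""

svc_in_caller = df["svc_no"].isin(df["caller_id"])
caller_in_svc = df["caller_id"].isin(df["svc_no"])
differs = df["svc_no"] != df["caller_id"]
wo_wo = (df["i_status"] == "WO") & (df["f_status"] == "WO")

# Scenario 1: either lookup direction is enough for a row.
s1 = (differs & wo_wo
      & (df["svc_no"] != "NULL") & (df["caller_id"] != "n/a")
      & (svc_in_caller | caller_in_svc))

# Scenario 2: one row has n/a in caller_id, the other has NULL in svc_no.
s2_na = ((df["caller_id"] == "n/a") & svc_in_caller
         & (df["i_status"] == "WO") & (df["f_status"] == "FA"))
s2_null = ((df["svc_no"] == "NULL") & caller_in_svc
           & (df["i_status"] == "FA") & (df["f_status"] == "WO"))

# Scenario 3: one row has a 6-digit svc_no, the other has caller_id 'new_num'.
s3_six = differs & wo_wo & df["svc_no"].str.fullmatch(r"\d{6}") & caller_in_svc
s3_new = differs & wo_wo & (df["caller_id"] == "new_num") & svc_in_caller

df.loc[s1, "remarks"] = "S1 Transpose"
df.loc[s2_na | s2_null, "remarks"] = "S2 Transpose"
# Rows 5 and 6 also pass the scenario 1 mask on this data, so scenario 3
# is assigned last and overwrites their remarks.
df.loc[s3_six | s3_new, "remarks"] = "S3 Transpose"

print(df)
```

Because the scenario 1 mask is broad, the order of the assignments matters here. On real data it may be safer to tighten each mask so that the scenarios cannot overlap.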