How to change the criteria of a condition based on values in a column - Pandas
I have the following problem, and I'm unable to find the right solution.
I have a dataframe.
I need to pass the first_name to another dataframe, only if the following conditions are met:
Condition #1: if 'id' is 1, pass 'first_name' ONLY IF 'country' = US AND 'code' = 1 AND 'zip' = 3.
Condition #2: if 'id' is 2, pass 'first_name' IF 'country' = US. (No need to check code and zip. Pass first_name irrespective of code and zip.)
So, in this dataframe, as per the conditions stated above, it needs to pass only 'peter', 'mike' and 'jenny'.
My code looks like:
filter1 = df['id']=='1'
filter2 = df['country'] ==1
filter3 = df['code']=='1'
filter4 = df['zip'] =='3'
#filtering data
df.where(filter1 & filter2 & filter3 & filter4, inplace = True)
#then pass first_name
new_df['first_name'] = df['first_name']
But by doing this I'm only able to apply either condition (1) or (2), not both.
Please help. Thank you!
Use boolean indexing with | to chain the filters by bitwise OR, and select the first_name column in DataFrame.loc:
# The numbers are compared as integers here; if the columns hold strings, compare against '1', '3', etc.
filter1 = df['id'] == 1
filter2 = df['country'] == 'US'
filter3 = df['code'] == 1
filter4 = df['zip'] == 3
filter5 = df['id'] == 2
s = df.loc[(filter1 & filter2 & filter3 & filter4) | (filter5 & filter2), 'first_name']
print(s)
1 peter
3 mike
4 jenny
Name: first_name, dtype: object
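For reference, here is a self-contained sketch of the same filter. The sample rows are an assumption, since the original dataframe is not shown in the post; they are made up so that the three expected names pass.

```python
import pandas as pd

# Hypothetical sample data (not the asker's real dataframe).
df = pd.DataFrame({
    'id':         [1, 1, 2, 2, 1, 2],
    'first_name': ['john', 'peter', 'anna', 'mike', 'jenny', 'tom'],
    'country':    ['US', 'US', 'UK', 'US', 'US', 'CA'],
    'code':       [2, 1, 1, 5, 1, 1],
    'zip':        [3, 3, 3, 7, 3, 3],
})

# Condition #1: id 1 must also match country, code and zip.
cond1 = (df['id'] == 1) & (df['country'] == 'US') & (df['code'] == 1) & (df['zip'] == 3)
# Condition #2: id 2 only needs to match country.
cond2 = (df['id'] == 2) & (df['country'] == 'US')

# A row passes if either condition holds.
s = df.loc[cond1 | cond2, 'first_name']
print(s.tolist())  # ['peter', 'mike', 'jenny']
```

The parentheses around each comparison matter when the masks are written inline, because & and | bind more tightly than == in Python.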
Use .loc with a combination of boolean masks.
new_df = df.loc[
    (  # mask1
        df.id.eq(1) & df.country.eq('US') & df.code.eq(1) & df.zip.eq(3)
    )
    |  # or
    (  # mask2
        df.id.eq(2) & df.country.eq('US')
    ),
    'first_name'
]
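Note that selecting the single label 'first_name' returns a Series, not a DataFrame. If an actual DataFrame is wanted, as the question suggests, one option is to pass the column as a list. A minimal sketch on made-up sample rows (an assumption, since the original data is not shown):

```python
import pandas as pd

# Hypothetical sample data (not the asker's real dataframe).
df = pd.DataFrame({
    'id':         [1, 1, 2, 2, 1, 2],
    'first_name': ['john', 'peter', 'anna', 'mike', 'jenny', 'tom'],
    'country':    ['US', 'US', 'UK', 'US', 'US', 'CA'],
    'code':       [2, 1, 1, 5, 1, 1],
    'zip':        [3, 3, 3, 7, 3, 3],
})

mask = (
    (df['id'].eq(1) & df['country'].eq('US') & df['code'].eq(1) & df['zip'].eq(3))
    | (df['id'].eq(2) & df['country'].eq('US'))
)

# A list of column labels keeps the result two-dimensional.
new_df = df.loc[mask, ['first_name']].reset_index(drop=True)
print(type(new_df).__name__)  # DataFrame
```

The reset_index(drop=True) call is optional; it only renumbers the surviving rows from 0.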
If you use .where instead of .loc with the same boolean masks, you will get a dataframe of the same shape as df, but every row that is masked as False will be filled with NaN.
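The difference is easy to see on a small example. This sketch uses made-up rows (an assumption, as the original dataframe is not shown) and compares the two calls:

```python
import pandas as pd

# Hypothetical sample data (not the asker's real dataframe).
df = pd.DataFrame({
    'id':         [1, 1, 2, 2, 1, 2],
    'first_name': ['john', 'peter', 'anna', 'mike', 'jenny', 'tom'],
    'country':    ['US', 'US', 'UK', 'US', 'US', 'CA'],
    'code':       [2, 1, 1, 5, 1, 1],
    'zip':        [3, 3, 3, 7, 3, 3],
})

mask = (
    (df['id'].eq(1) & df['country'].eq('US') & df['code'].eq(1) & df['zip'].eq(3))
    | (df['id'].eq(2) & df['country'].eq('US'))
)

kept = df.loc[mask]      # only the rows where mask is True
masked = df.where(mask)  # same shape as df; rows where mask is False become NaN

print(kept.shape)    # (3, 5)
print(masked.shape)  # (6, 5)

# With .where, the NaN rows still have to be dropped afterwards.
names = masked['first_name'].dropna()
print(names.tolist())  # ['peter', 'mike', 'jenny']
```

So .where can be made to work, but it needs an extra dropna step, whereas .loc returns only the matching rows directly.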