I have the following problem, and I'm unable to find the right solution.
I have a dataframe.
I need to pass the first_name to another dataframe, only if the following conditions are met:
condition #1 --> If 'id' is 1, then pass 'first_name' ONLY IF 'country' = US AND 'code' = 1 AND 'zip' = 3
condition #2 --> If 'id' is 2, then pass 'first_name' IF 'country' = US. (No need to check for code and zip. Pass first_name irrespective of code and zip)
So, in this dataframe, as per the conditions stated above, it needs to pass only - 'peter', 'mike' and 'jenny'
My code looks like:
filter1 = df['id']=='1'
filter2 = df['country'] ==1
filter3 = df['code']=='1'
filter4 = df['zip'] =='3'
#filtering data
df.where(filter1 & filter2 & filter3 & filter4, inplace = True)
#then pass first_name
new_df['first_name'] = df['first_name']
But by doing this I'm only able to apply either condition (1) or (2).
Please help. Thank you!
Use boolean indexing
with |
for chain filters by bitwise OR
with filter by column first_name
in DataFrame.loc
:
#if numbers are not strings remove `''`
filter1 = df['id']==1
filter2 = df['country'] == 'US'
filter3 = df['code']==1
filter4 = df['zip'] ==3
filter5 = df['id']==2
s = df.loc[(filter1 & filter2 & filter3 & filter4) | (filter5 & filter2), 'first_name']
print (s)
1 peter
3 mike
4 jenny
Name: first_name, dtype: object
Use .loc
with a combination of boolean masks.
new_df = df.loc[
( # mask1
df.id.eq(1) & df.country.eq('US') & df.code.eq(1) & df.zip.eq(3)
)
| # or
( # mask2
df.id.eq(2) & df.country.eq('US')
),
'first_name'
]
If you use .where
instead of .loc
with the same boolean masks you will get a dataframe of the same shape as df
but every row that is masked as False
will be filled with NaN
.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.