I am trying to use any() to check whether a column contains any string from a list, and to store the result in a new column:
import pandas as pd
df_data = pd.DataFrame({'A': [2, 1, 3], 'animals': ['cat, frog', 'kitten, fish', 'frog2, fish']})
cats = ['kitten', 'cat']
df_data['cats'] = df_data.apply(lambda row: True if any(item in cats for item in row['animals']) else False, axis = 1)
I got the following result, and I don't understand why it is False for the first two rows:
   A       animals   cats
0  2     cat, frog  False
1  1  kitten, fish  False
2  3   frog2, fish  False
I expected to get False for the last row only.
With pandas, try to avoid explicit for loops and apply where you can. Here I use the DataFrame constructor together with isin and any:
df_data['cats'] = pd.DataFrame(df_data.animals.str.split(', ').tolist()).isin(cats).any(axis=1)
df_data
   A       animals   cats
0  2     cat, frog   True
1  1  kitten, fish   True
2  3   frog2, fish  False
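As an alternative sketch (not part of the answer above), the same whole-word check could be written with Series.explode, which avoids building an intermediate DataFrame. This assumes pandas 0.25 or later and the ', ' separator used in the question.

```python
import pandas as pd

df_data = pd.DataFrame({'A': [2, 1, 3],
                        'animals': ['cat, frog', 'kitten, fish', 'frog2, fish']})
cats = ['kitten', 'cat']

# Split each cell into a list, then give every animal its own row.
# explode() repeats the original index label for each list element.
exploded = df_data['animals'].str.split(', ').explode()

# Test each animal for exact membership in cats, then collapse back
# to one boolean per original row by grouping on the index.
df_data['cats'] = exploded.isin(cats).groupby(level=0).any()

print(df_data['cats'].tolist())  # [True, True, False]
```

Because the result is grouped on the original index, it lines up with df_data on assignment even when rows contain different numbers of animals.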
Flip your iterables
df_data['cats'] = df_data.apply(lambda row: any(item in row['animals'] for item in cats), axis=1)
print(df_data)
#    A       animals   cats
# 0  2     cat, frog   True
# 1  1  kitten, fish   True
# 2  3   frog2, fish  False
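If you look closely, the two expressions do different things.
item in row['animals'] for item in cats
iterates over cats and checks whether each item occurs as a substring of the string row['animals'].
item in cats for item in row['animals']
iterates over row['animals']. Because row['animals'] is a string, this yields single characters ('c', 'a', 't', ...), and no single character is an element of the cats list, so the result is always False.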
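The difference can be checked on a single cell without pandas. The snippet below is a minimal sketch using the values from the question; the 'caterpillar, frog' string is a made-up example added here to show that the flipped version matches substrings rather than whole words.

```python
cats = ['kitten', 'cat']
text = 'cat, frog'

# Iterating over a string yields single characters.
first_chars = list(text)[:3]  # ['c', 'a', 't']

# Original order: each character is looked up in cats, so nothing matches.
original_check = any(item in cats for item in text)  # False

# Flipped order: each entry of cats is tested as a substring of text.
flipped_check = any(item in text for item in cats)  # True

# Caveat: substring matching also finds 'cat' inside 'caterpillar'.
other = 'caterpillar, frog'  # hypothetical value, not in the question's data
substring_match = any(item in other for item in cats)  # True

# Splitting on ', ' first compares whole words instead.
exact_match = any(token in cats for token in other.split(', '))  # False
```

If whole-word matching is what you need, split the string before testing membership.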