
Check if a column contains any string from a list

I am trying to use any() to check whether a column contains any string from a list, and to create a new column holding the result for each row.

import pandas as pd
df_data = pd.DataFrame({'A':[2,1,3], 'animals': ['cat, frog', 'kitten, fish', 'frog2, fish']})
cats = ['kitten', 'cat']
df_data['cats'] = df_data.apply(lambda row: True if any(item in cats for item in row['animals']) else False, axis = 1)

I got these results, and I don't understand why the first two rows are False:

   A       animals   cats
0  2     cat, frog  False
1  1  kitten, fish  False
2  3   frog2, fish  False

I expected to get False for the last row only.

With pandas, you should avoid explicit for loops and apply wherever you can. Here I split each string into its parts, build a DataFrame from them with the DataFrame constructor, and then use isin and any:

df_data['cats'] = pd.DataFrame(df_data.animals.str.split(', ').tolist()).isin(cats).any(axis=1)
df_data
   A       animals   cats
0  2     cat, frog   True
1  1  kitten, fish   True
2  3   frog2, fish  False
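
As a rough sketch of the same idea, the intermediate list can be skipped by letting str.split expand the parts into columns directly. The variable name tokens is only illustrative. Note that this approach matches whole comma-separated entries, so 'frog2' would not match 'frog'.

```python
import pandas as pd

df_data = pd.DataFrame({'A': [2, 1, 3],
                        'animals': ['cat, frog', 'kitten, fish', 'frog2, fish']})
cats = ['kitten', 'cat']

# One column per comma-separated entry (shorter rows are padded with missing values).
tokens = df_data['animals'].str.split(', ', expand=True)

# True where any entry in the row is exactly one of the strings in cats.
df_data['cats'] = tokens.isin(cats).any(axis=1)

print(df_data)
```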

Flip your iterables

df_data['cats'] = df_data.apply(lambda row: any(item in row['animals'] for item in cats), axis=1)

print(df_data)
#    A       animals   cats
# 0  2     cat, frog   True
# 1  1  kitten, fish   True
# 2  3   frog2, fish  False
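
If a vectorized version of this substring test is preferred, one possible sketch uses Series.str.contains with an alternation pattern built from the list. The name pattern is an assumption made for illustration, and re.escape is only there in case an entry contains regex metacharacters.

```python
import re
import pandas as pd

df_data = pd.DataFrame({'A': [2, 1, 3],
                        'animals': ['cat, frog', 'kitten, fish', 'frog2, fish']})
cats = ['kitten', 'cat']

# Build a single pattern such as 'kitten|cat', escaping each entry.
pattern = '|'.join(re.escape(item) for item in cats)

# Substring match per row, the same semantics as `item in row['animals']`.
df_data['cats'] = df_data['animals'].str.contains(pattern)

print(df_data)
```

Like the apply version, this is a substring match, so a value such as 'caterpillar' would also count as containing 'cat'.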

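If you look closely,

item in row['animals'] for item in cats

iterates over cats and checks whether each item occurs as a substring of row['animals'], whereas

item in cats for item in row['animals']

iterates over row['animals'] and checks whether each element is in the cats list. Because row['animals'] is a string, its elements are single characters such as 'c' or 'a', and none of them equals 'kitten' or 'cat', so the result is always False.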
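
The difference can be checked without pandas on a single cell value. This is a minimal sketch in which animals stands in for one value of row['animals'], and the other variable names are only illustrative.

```python
cats = ['kitten', 'cat']
animals = 'cat, frog'  # stands in for one cell of the 'animals' column

# Iterating over a string yields single characters.
chars_checked = [item for item in animals]

# Original order: each character is looked up in the list, so nothing matches.
original = any(item in cats for item in animals)

# Flipped order: each list entry is tested as a substring of the cell.
flipped = any(item in animals for item in cats)

print(chars_checked[:3])  # ['c', 'a', 't']
print(original)           # False
print(flipped)            # True
```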

