
Check if a Python dataframe contains string in list

I have a list and a dataframe with one column named Description that looks like this:

my_list = ['dog','cat','bird'...]

df
    |         Description           |
    |three_legged_dog0named1_Charlie|
    |          catis_mean           |
    |   1hippo_stepped-on_an_ant    |

I want to write a for loop that goes through each row in df and checks whether it contains an element of my_list. If it does, it should print that element.

Normally I'd use search(), but I don't know how it works with a list. I could write a for loop that handles every case separately, but I don't want to do that. Is there another way?

for i in df['Description']:
    if i is in my_list:
         print('the element that is in i')
    else:
         print('not in list')

The output should be:

dog 
cat
not in list

If you want a pandas method without a loop, use Series.str.findall to collect the matches, Series.str.join to join all matched values with a comma, and finally Series.replace to turn the empty strings into 'not in list':

my_list = ['dog','cat','bird']

df['new'] = (df['Description'].str.findall('|'.join(my_list))
                              .str.join(',')
                              .replace('','not in list'))
print(df)
                       Description          new
0  three_legged_dog0named1_Charlie          dog
1                       catis_mean          cat
2         1hippo_stepped-on_an_ant  not in list
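
A self-contained version of the findall approach above might look like the sketch below. The DataFrame construction is an assumption based on the rows shown in the question, and the re.escape step is an addition that the answer does not include: it only matters if an entry in my_list contains regex metacharacters such as "." or "+".

```python
import re
import pandas as pd

my_list = ['dog', 'cat', 'bird']

# Assumed reconstruction of the question's DataFrame.
df = pd.DataFrame({'Description': ['three_legged_dog0named1_Charlie',
                                   'catis_mean',
                                   '1hippo_stepped-on_an_ant']})

# Escape each term so it is matched literally, then build one alternation.
pattern = '|'.join(re.escape(term) for term in my_list)

# findall returns a list of matches per row; join turns it into one string.
matches = df['Description'].str.findall(pattern).str.join(',')

# Rows with no match end up as '', which is swapped for the fallback text.
df['new'] = matches.replace('', 'not in list')
print(df)
```

If a description contains several list entries, all of them appear in the new column, separated by commas.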

Alternatively, use pd.Series.str.replace:

pattern = f'^.*({"|".join(my_list)}).*$'

# Create a mask to rid ourselves of the pesky no matches later
mask = df.Description.str.match(pattern)

# where the magic happens, use `r'\1'` to swap in the thing that matched
df.Description.str.replace(pattern, r'\1', regex=True).where(mask, 'not in list')

0            dog
1            cat
2    not in list
Name: Description, dtype: object
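
For comparison, the explicit loop the question describes can also be written without regular expressions. This is only a sketch: the helper name first_match is hypothetical, the DataFrame is reconstructed from the rows in the question, and the function returns the first entry of my_list that occurs in the text, which is not necessarily the one that appears earliest in the string.

```python
import pandas as pd

my_list = ['dog', 'cat', 'bird']

# Assumed reconstruction of the question's DataFrame.
df = pd.DataFrame({'Description': ['three_legged_dog0named1_Charlie',
                                   'catis_mean',
                                   '1hippo_stepped-on_an_ant']})


def first_match(text, terms):
    """Return the first term that is a substring of text, or None."""
    return next((term for term in terms if term in text), None)


results = []
for description in df['Description']:
    found = first_match(description, my_list)
    # Fall back to the message from the question when nothing matches.
    results.append(found if found is not None else 'not in list')
    print(results[-1])
```

On large frames the vectorized string methods shown above are usually the more idiomatic choice; the loop is mainly useful when the per-row logic needs to grow beyond a simple substring test.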
