Pandas DataFrame - check if string in column A contains full word string in column B

Question

I have a dataframe with two columns: foo, which contains a string of text, and bar, which contains a search term string. For each row in my dataframe, I want to check whether the search term appears in the text string as a whole word (that is, with word boundaries on both sides).

For example:

import pandas as pd
import numpy as np
import re

df = pd.DataFrame({'foo':["the dog is blue", "the cat isn't orange"], 'bar':['dog', 'cat is']})

df
      bar                   foo
0     dog       the dog is blue
1  cat is  the cat isn't orange

Essentially, I want to vectorize the following operations:

re.search(r"\bdog\b", "the dog is blue") is not None  # True
re.search(r"\bcat is\b", "the cat isn't orange") is not None  # False

What's a fast way to do this, considering I'm working with a few hundred thousand rows? I tried using the str.contains method but couldn't quite get it.

Answer 1

You can apply your function to each row:

df.apply(lambda x: re.search(r'\b' + x.bar + r'\b', x.foo) is not None, axis=1)

Result:

0     True
1    False
dtype: bool
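
Since the question mentions a few hundred thousand rows, here is a minimal sketch of the same check written as a list comprehension over the two columns. It avoids building a Series object for every row, which is usually where apply with axis=1 spends its time; I have not benchmarked it here, so treat the speed benefit as an assumption. The helper name contains_whole_word is hypothetical, not part of pandas.

```python
import re
import pandas as pd

df = pd.DataFrame({'foo': ["the dog is blue", "the cat isn't orange"],
                   'bar': ['dog', 'cat is']})

def contains_whole_word(text, term):
    # Hypothetical helper: True if `term` occurs in `text` between word boundaries.
    return re.search(r'\b' + term + r'\b', text) is not None

# Walk both columns in parallel instead of calling apply row by row.
mask = pd.Series(
    [contains_whole_word(text, term) for text, term in zip(df['foo'], df['bar'])],
    index=df.index,
)
print(mask)
```

The resulting boolean Series shares the dataframe's index, so it can be used directly as df[mask].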

Answer 2

df.apply(lambda x: re.search(r'\b{0}\b'.format(x.bar), x.foo) is not None, axis='columns')

df.apply applies a generic function along the rows or columns of a pandas DataFrame; see: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.apply.html
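
One caveat with both answers: the search term is pasted into the pattern as-is, so any regex metacharacters in bar are interpreted as regex syntax. If the terms are meant to be literal text (an assumption; the question does not say), wrapping them in re.escape is one way to handle that. The sample data below is made up for illustration.

```python
import re
import pandas as pd

# Made-up rows: the term "3.5" contains a regex metacharacter (the dot).
df = pd.DataFrame({'foo': ["version 3.5 released", "version 3x5 released"],
                   'bar': ['3.5', '3.5']})

# Unescaped: "." matches any character, so "3x5" is a false positive.
unescaped = df.apply(
    lambda x: re.search(r'\b' + x.bar + r'\b', x.foo) is not None, axis=1)

# Escaped: re.escape turns "3.5" into a pattern that matches only the literal text.
escaped = df.apply(
    lambda x: re.search(r'\b' + re.escape(x.bar) + r'\b', x.foo) is not None, axis=1)

print(unescaped.tolist())
print(escaped.tolist())
```

Note that \b only matches between a word character and a non-word character, so terms that begin or end with punctuation may still not behave the way a "whole word" check suggests.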

2 answers

solution1
1 ACCEPTED 2016-03-12 21:05:25

solution2
1 2016-03-12 21:06:34
