Find words in a column per row after list of specific words in Python

Question

I have a pandas DataFrame with a column called warranty. It records ways to fix different issues. It looks something like the attached picture.


The goal is to find the words that come after the words listed below.

word_list=['replace', 'clean', 'remove']

How can I get this expected output: a column added to the above df with the values "replace battery wire", "clean fuel tank", "remove nail"?

Answer 1

pandas can search strings with a regex, and you could use this pattern:

(?:replace|clean|remove) (\w+)

You can use Python to generate this pattern from word_list:

words = "|".join(word_list)
pattern = rf'(?:{words}) (\w+)'  # raw f-string so \w is not treated as a string escape

print('pattern:', pattern)

Then apply it to the column:

df['word'] = df['warranty'].str.lower().str.findall(pattern).str[0]

I convert the text with lower() first because the pattern uses lowercase words, so it would otherwise miss text such as "Replace" or "CLEAN".
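
As a possible alternative to lowercasing the text, the match itself can be made case-insensitive. The sketch below is only an illustration under some assumptions: the sample rows are made up, each keyword is passed through re.escape in case it contains regex metacharacters, and \b plus \s+ are added so the keyword must be a whole word followed by any whitespace. It uses str.extract, which returns NaN for rows where nothing matches.

```python
import re
import pandas as pd

word_list = ['replace', 'clean', 'remove']

# Sample rows are illustrative only, not the asker's real data.
df = pd.DataFrame({
    'warranty': ['Replace battery wire', 'CLEAN fuel tank', 'inspect brakes'],
})

# Escape each keyword so any regex metacharacters are matched literally.
words = "|".join(re.escape(w) for w in word_list)
pattern = rf'\b(?:{words})\s+(\w+)'

# expand=False returns a Series when the pattern has a single capture group.
# re.IGNORECASE matches "Replace" and "CLEAN" without changing the text.
df['word'] = df['warranty'].str.extract(pattern, flags=re.IGNORECASE, expand=False)

print(df)
```

Because the text is not lowercased, the extracted word keeps its original casing, and the row without a keyword ends up as NaN.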

If replace, clean, or remove is always the first word, then you could simply split(" ") the text and take the second element:

df['word'] = df['warranty'].str.split(' ').str[1]

If you need more complex logic, you could put it in a function and use .apply():

def function(text):
    # ... complex code ...
    return text.split(' ')[1]

df['word'] = df['warranty'].apply(function)
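
The .apply() route also makes it easier to guard against rows that would break a plain split, such as a missing value or a row with only one word (where text.split(' ')[1] raises an error). The following is a minimal sketch under those assumptions; word_after_keyword and the sample rows are hypothetical names and data chosen for illustration, not part of the original question.

```python
import re
import pandas as pd

word_list = ['replace', 'clean', 'remove']

# Compile once: a keyword as a whole word, whitespace, then the next word.
keyword_pattern = re.compile(
    r'\b(?:' + '|'.join(map(re.escape, word_list)) + r')\s+(\w+)',
    re.IGNORECASE,
)

def word_after_keyword(text):
    """Return the word after the first keyword, or None if there is none."""
    # Missing values (None / NaN) are not strings, so skip them.
    if not isinstance(text, str):
        return None
    match = keyword_pattern.search(text)
    return match.group(1) if match else None

# Sample rows are illustrative only: a normal row, a lone keyword, a missing value.
df = pd.DataFrame({'warranty': ['replace battery wire from car', 'clean', None]})
df['word'] = df['warranty'].apply(word_after_keyword)

print(df)
```

Only the first row yields a word here; the other two fall through to a missing value instead of raising.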

Minimal working code

import pandas as pd

data = {
    'warranty': [
        'replace battery wire from car',
        'clean fuel tank',
        'remove nail from tire',
    ], 
}

word_list=['replace', 'clean', 'remove']

df = pd.DataFrame(data)

words = "|".join(word_list)
pattern = rf'(?:{words}) (\w+)'  # raw f-string so \w is not treated as a string escape
print('pattern:', pattern)

def function(text):
    # ... complex code ...
    return text.split(' ')[1]

df['method1'] = df['warranty'].str.lower().str.findall(pattern).str[0]
df['method2'] = df['warranty'].str.split(' ').str[1]
df['method3'] = df['warranty'].apply(function)

print(df)

Result:

pattern: (?:replace|clean|remove) (\w+)

                        warranty  method1  method2  method3
0  replace battery wire from car  battery  battery  battery
1                clean fuel tank     fuel     fuel     fuel
2          remove nail from tire     nail     nail     nail
