Keep match in pandas DataFrame column and remove the rest

Question

I have a list called names

names = ['kramer hickok', 'carlos ortiz ', 'talor gooch', 'mikumu horikawa', 'yoshinori fujimoto']

In addition, I have a pandas.DataFrame called page. The dataframe looks as follows:

     name
--   ---------------------------
0    kramer hickok united states   
1    carlos ortiz mexico  
2    talor gooch united states    
3    mikumu horikawa japan
4    yoshinori fujimoto japan

I want to remove all the countries from the column, keeping only the names. How can I do this as fast as possible?

The desired output:

     name
--   ---------------------------
0    kramer hickok  
1    carlos ortiz 
2    talor gooch 
3    mikumu horikawa 
4    yoshinori fujimoto

I tried the following with no result:

for name in names:
   page['name'] = page['name'].str.extract(name)

Thank you

Answer 1

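You can try .str.extract with the names joined into a single alternation inside a capture group. The \b word boundaries reject matches that start or end inside a longer word, which is why the extra row 5 in the output below yields NaN: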

page['out'] = page['name'].str.extract(r'\b(' + '|'.join(names) + r')\b')

print(page)

                          name                 out
0  kramer hickok united states       kramer hickok
1          carlos ortiz mexico        carlos ortiz
2    talor gooch united states         talor gooch
3        mikumu horikawa japan     mikumu horikawa
4     yoshinori fujimoto japan  yoshinori fujimoto
5  mikumumikumu horikawa japan                 NaN
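
A slightly more defensive variant of the same idea, as a sketch only: it assumes the entries in names may carry stray whitespace (like 'carlos ortiz ' in the question) or regex metacharacters, so it strips and escapes them before building the pattern. The variable name cleaned and the longest-first sort are my additions, not part of the answer above.

```python
import re
import pandas as pd

names = ['kramer hickok', 'carlos ortiz ', 'talor gooch',
         'mikumu horikawa', 'yoshinori fujimoto']
page = pd.DataFrame({'name': [
    'kramer hickok united states',
    'carlos ortiz mexico',
    'talor gooch united states',
    'mikumu horikawa japan',
    'yoshinori fujimoto japan',
]})

# Strip stray whitespace, then try longer names first in case one name
# is a prefix of another (assumption: this could happen in real data).
stripped = sorted((n.strip() for n in names), key=len, reverse=True)
# Escape each name so characters such as '.' are matched literally.
cleaned = [re.escape(n) for n in stripped]
pattern = r'\b(' + '|'.join(cleaned) + r')\b'

# expand=False returns a Series for a single capture group.
page['out'] = page['name'].str.extract(pattern, expand=False)
print(page)
```

Rows whose text contains none of the names end up as NaN in out, just as in the output above.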

Answer 2

How about just replacing the column altogether?

page['name'] = names

I think it would take less time and be much easier to handle.

( ※ Note that this assumes names has exactly one entry per row of page, in the same order as the rows, with no duplicates.)
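
A minimal sketch of that direct assignment with the assumption made explicit: it checks that names and page have the same length before overwriting the column, and strips stray whitespace from the list entries. The length check and the strip are my additions; the row order still has to match for the result to be meaningful.

```python
import pandas as pd

names = ['kramer hickok', 'carlos ortiz ', 'talor gooch',
         'mikumu horikawa', 'yoshinori fujimoto']
page = pd.DataFrame({'name': [
    'kramer hickok united states',
    'carlos ortiz mexico',
    'talor gooch united states',
    'mikumu horikawa japan',
    'yoshinori fujimoto japan',
]})

# Direct assignment only makes sense with exactly one name per row.
if len(names) != len(page):
    raise ValueError('names must have one entry per row of page')

# Overwrite the column, dropping the trailing space in 'carlos ortiz '.
page['name'] = [n.strip() for n in names]
print(page)
```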

Answer 3

If every name is just two words, you don't even need your list (df here is the dataframe called page in the question):

df.name = df.name.str.extract(r'(\w+ \w+)', expand=False)
print(df)
# Output:
                 name
0       kramer hickok
1        carlos ortiz
2         talor gooch
3     mikumu horikawa
4  yoshinori fujimoto
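
To illustrate the two-word assumption, here is a small sketch. The third row is a hypothetical three-word name that does not appear in the question; it is there only to show that the pattern keeps the first two words and nothing more.

```python
import pandas as pd

df = pd.DataFrame({'name': [
    'kramer hickok united states',
    'carlos ortiz mexico',
    'jose maria olazabal spain',  # hypothetical row, not from the question
]})

# Capture the first two whitespace-separated words of each entry.
two_words = df['name'].str.extract(r'(\w+ \w+)', expand=False)
print(two_words.tolist())
```

For the hypothetical row the result is 'jose maria', so the list-based pattern is the safer choice whenever names can be longer than two words.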

3 answers

solution1
1 2022-08-09 02:55:49

solution2
0 2022-08-09 02:54:10

solution3
0 2022-08-09 03:09:27
