How can I create a new column in a pandas data frame by extracting words from sentences in another column?

I have a pandas dataframe like this.

import pandas as pd

student_id = ['001', '002', '003', '004']
names = ['Jane', 'Mary', 'Andrew', 'Paul']
address = ['7 karumu st Ikeja Lagos', '8 logo street Umuahia Abia',
           '10 jege close PH Rivers', '9 Lekki gate Lagos']

test_1 = {'Student_ID': student_id,
          'Name': names,
          'Address': address}
df = pd.DataFrame(test_1)
df

and a list like this:

List = ['Imo', 'Lagos', 'Abia', 'Ebonyi', 'Rivers']

I am trying to iterate through the Address column and extract any state in the address that is also in the list. If a state from the list is found, I would like to extract it and put it in a new column called state.

I tried to use the iterrows() method, but I am a bit lost.

You can filter the rows whose address contains one of the states like this:

df = df[df['Address'].str.contains('|'.join(List))]

To get the state into a new column:
  • Get the 'Address' column.
  • Convert 'List' to a DataFrame.
  • After that, I think you should use a merge.
  • Store the result in the final DataFrame and add it as another column.

I think this will solve your problem.
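Note that the filter only keeps the matching rows; it does not create the new column. As a minimal sketch of one way to get the column directly, the same '|'-joined pattern can be passed to Series.str.extract instead of the merge suggested above. The name states (used in place of List), the word-boundary anchors, and the choice to keep only the first match per address are assumptions made for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({
    "Student_ID": ["001", "002", "003", "004"],
    "Name": ["Jane", "Mary", "Andrew", "Paul"],
    "Address": ["7 karumu st Ikeja Lagos", "8 logo street Umuahia Abia",
                "10 jege close PH Rivers", "9 Lekki gate Lagos"],
})
# Assumed name; the question calls this list `List`.
states = ["Imo", "Lagos", "Abia", "Ebonyi", "Rivers"]

# One capture group listing every state. re.escape guards against regex
# metacharacters in the names, and \b avoids matching inside longer words.
pattern = r"\b(" + "|".join(re.escape(s) for s in states) + r")\b"

# With expand=False and a single group, str.extract returns a Series holding
# the first match in each address, or NaN when no state is found.
df["State"] = df["Address"].str.extract(pattern, expand=False)
print(df)
```

Unlike the filter, this keeps every row, so addresses without a listed state simply get NaN in the new column.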

This assumes that the state is always the last word in the address:

import numpy as np

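states = ["Imo", "Lagos", "Abia", "Ebonyi", "Rivers"]
# Take the last word of each address and keep it only if it is a known state,
# otherwise use NaN. The := (walrus) operator requires Python 3.8+.
df["State"] = df["Address"].map(lambda x: state if (state:=x.split()[-1]) in states else np.nan)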
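The same last-word idea can probably also be written without a lambda, using pandas string methods. This is only a sketch under the same assumption (the state is the last word); the fifth row is a hypothetical address added here to show what happens when the last word is not in the list.

```python
import pandas as pd

df = pd.DataFrame({
    "Student_ID": ["001", "002", "003", "004", "005"],
    "Name": ["Jane", "Mary", "Andrew", "Paul", "Kofi"],
    "Address": ["7 karumu st Ikeja Lagos", "8 logo street Umuahia Abia",
                "10 jege close PH Rivers", "9 Lekki gate Lagos",
                "5 unknown road Accra"],  # hypothetical row, not in the question
})
states = ["Imo", "Lagos", "Abia", "Ebonyi", "Rivers"]

# Split each address on whitespace and take the last token.
last_word = df["Address"].str.split().str[-1]

# Keep the token where it is a known state; everything else becomes NaN.
df["State"] = last_word.where(last_word.isin(states))
print(df)
```

This avoids the walrus operator, so it does not depend on Python 3.8+.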
