I have a lot of addresses in an Excel file. I have imported them and stored them in a dataframe. Now I want to detect the state in each address and show it in a new column. How do I loop over every row in my dataframe and add the state found in that row?
List of all states:
allstates=['SELANGOR','JOHOR','KELANTAN','MALACCA','NEGERI SEMBILAN','PAHANG','PENANG','PERAK','PERLIS',
'SABAH','SARAWAK','TERENGGANU','KUALA LUMPUR','K. LUMPUR','LABUAN','PUTRAJAYA']
Below is how I want my dataframe to look:
Address | States
-------------------------------------------------------
311 Jalan Springhill SELANGOR | *SELANGOR*
31 Jalan Segamat JOHOR | *JOHOR*
I want the state (for example, SELANGOR) to be inserted in the States column.
Try this:
df['States'] = df.Address.str.extract('({})'.format('|'.join(allstates)))
If you are certain (or want to require) that the state names appear only at the end of the addresses, anchor the pattern with $:
df['States'] = df.Address.str.extract('({})$'.format('|'.join(allstates)))
Output:
                         Address    States
0  311 Jalan Springhill SELANGOR  SELANGOR
1         31 Jalan Segamat JOHOR     JOHOR
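One caveat with building the pattern by joining the names directly: the '.' in 'K. LUMPUR' is a regex metacharacter and matches any character. A minimal sketch of a safer variant follows, assuming the same allstates list; the sample addresses in the third and fourth rows are made up for illustration. It escapes each name with re.escape and passes expand=False so that str.extract returns a Series.

```python
import re
import pandas as pd

allstates = ['SELANGOR', 'JOHOR', 'KELANTAN', 'MALACCA', 'NEGERI SEMBILAN',
             'PAHANG', 'PENANG', 'PERAK', 'PERLIS', 'SABAH', 'SARAWAK',
             'TERENGGANU', 'KUALA LUMPUR', 'K. LUMPUR', 'LABUAN', 'PUTRAJAYA']

# Escape each name so the '.' in 'K. LUMPUR' only matches a literal dot.
pattern = '({})'.format('|'.join(re.escape(state) for state in allstates))

# Hypothetical sample data; the last two rows are illustrative only.
df = pd.DataFrame({'Address': ['311 Jalan Springhill SELANGOR',
                               '31 Jalan Segamat JOHOR',
                               '5 Jalan Ampang K. LUMPUR',
                               '12 Main Street']})

# expand=False returns a Series when the pattern has one capture group.
# Rows with no matching state get NaN.
df['States'] = df['Address'].str.extract(pattern, expand=False)
print(df)
```

Addresses that contain none of the listed names end up with NaN in the States column, which makes them easy to find afterwards with df['States'].isna().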
import pandas as pd

data = pd.read_csv('states.csv')
print(data)

Output:
                         Address
0  311 Jalan Springhill SELANGOR
1         31 Jalan Segamat JOHOR

# Loop over every row and take the last word of the address as the state.
# Note: this only works for single-word states at the end of the address.
for index, row in data.iterrows():
    value = row.Address
    State = value.split()[-1]
    data.loc[index, 'State'] = State
print(data)

Output:
                         Address     State
0  311 Jalan Springhill SELANGOR  SELANGOR
1         31 Jalan Segamat JOHOR     JOHOR
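The loop above takes the last whitespace-separated word of each address, so it would return SEMBILAN for NEGERI SEMBILAN and LUMPUR for KUALA LUMPUR. A rough sketch of one way around that, assuming the allstates list from the question: check each address against the list with str.endswith. The helper name find_state and the third sample address are hypothetical, not from the original post.

```python
import pandas as pd

allstates = ['SELANGOR', 'JOHOR', 'KELANTAN', 'MALACCA', 'NEGERI SEMBILAN',
             'PAHANG', 'PENANG', 'PERAK', 'PERLIS', 'SABAH', 'SARAWAK',
             'TERENGGANU', 'KUALA LUMPUR', 'K. LUMPUR', 'LABUAN', 'PUTRAJAYA']

def find_state(address):
    """Return the listed state that the address ends with, or None."""
    text = address.strip().upper()
    for state in allstates:
        if text.endswith(state):
            return state
    return None

# Hypothetical sample data; the third row is illustrative only.
data = pd.DataFrame({'Address': ['311 Jalan Springhill SELANGOR',
                                 '31 Jalan Segamat JOHOR',
                                 '8 Jalan Rasah NEGERI SEMBILAN']})

# apply calls find_state once per address, so no explicit row loop is needed.
data['State'] = data['Address'].apply(find_state)
print(data)
```

Using apply on the column avoids writing to the dataframe one row at a time with data.loc, which tends to be slow on large files.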