During data entry, Some of the States were added to the same cell as the Address line. The city and state vary and are generally unknown. There are also some cases of a ,
that would need to be removed.
AddrLines AddrCity AddrState
0 123 street Titusville FL NaN
1 456 road Viera FL NaN
2 789 place Melbourne, Fl NaN
3 001 ave Wright VA
My goal is to clean up the City column while moving the state over to the State column. Something like this, but removing the state and the ,
at the same time.
df.loc[(df['AddrCity'].str.contains(' Fl')),'AddrState'] = 'FL'
df.loc[(df['AddrCity'].str.contains(' FL')),'AddrState'] = 'FL'
df.loc[(df['AddrCity'].str.contains(', Fl')),'AddrState'] = 'FL'
I was able to get this by performing the following
df1[['AddrCity', 'State_Holder']] = df1['AddrCity'].str.replace(', ', ' ').str.replace(' ', ', ').str.split(', ', 1, expand = True)
df1['AddrState'] = np.where(df1['AddrState'].isna(), df1['State_Holder'], df1['AddrState'])
df1.drop(columns = ['State_Holder'])
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.