I am using Python 3.7 and have a large data frame that includes a column of addresses, like so:
import pandas as pd
c = ['420 rodeo drive', '22 alaska avenue', '919 franklin boulevard', '39 emblem alley', '67 fair way']
df1 = pd.DataFrame(c, columns = ["address"])
There is also a dataframe of address abbreviations and their USPS versions. An example:
x = ['avenue', 'drive', 'boulevard', 'alley', 'way']
y = ['ave', 'dr', 'blvd', 'aly', 'way']
df2 = pd.DataFrame(list(zip(x,y)), columns = ['common_abbrev', 'usps_abbrev'], dtype = "str")
What I would like to do is search df1['address'] for occurrences of words in df2['common_abbrev'], replace them with the corresponding df2['usps_abbrev'], and get back the updated df1. To this end I have tried, in a few ways, what appears to be the canonical strategy of df.str.replace(), like so:
df1["address"] = df1["address"].str.replace(df2["common_abbrev"], df2["usps_abbrev"])
However, I get the following error:
`TypeError: repl must be a string or callable`.
My questions are:
1. Since I've already given the dtype for df2, why am I receiving the error?
2. How can I produce my desired result of:
address
0 420 rodeo dr
1 22 alaska ave
2 919 franklin blvd
3 39 emblem aly
4 67 fair way
Thanks for your help.
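Series.str.replace accepts a single pattern and a single replacement (repl must be a string or a callable), so passing the Series df2['usps_abbrev'] as repl raises the TypeError no matter what dtype it holds. Instead, first create a dict using zip, then use Series.replace with regex=True: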
In [703]: x = ['avenue', 'drive', 'boulevard', 'alley', 'way']
...: y = ['ave', 'dr', 'blvd', 'aly', 'way']
In [686]: d = dict(zip(x, y))
In [691]: df1.address = df1.address.replace(d, regex=True)
In [692]: df1
Out[692]:
address
0 420 rodeo dr
1 22 alaska ave
2 919 franklin blvd
3 39 emblem aly
4 67 fair way
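One caveat with Series.replace(..., regex=True): each key is matched as a substring, so a key such as drive would also be rewritten inside a longer word such as driveway. A possible variation, offered as a sketch rather than the definitive approach, is to build the lookup straight from df2 and match whole words only. The names mapping and pattern are illustrative, and the sample data is paired so that avenue maps to ave and alley to aly, matching the desired output in the question.

```python
import re

import pandas as pd

# Sample data from the question (pairing assumed: avenue -> ave, alley -> aly).
df1 = pd.DataFrame(
    {"address": ["420 rodeo drive", "22 alaska avenue", "919 franklin boulevard",
                 "39 emblem alley", "67 fair way"]}
)
df2 = pd.DataFrame(
    {"common_abbrev": ["avenue", "drive", "boulevard", "alley", "way"],
     "usps_abbrev": ["ave", "dr", "blvd", "aly", "way"]}
)

# Build the lookup table directly from the two columns of df2.
mapping = dict(zip(df2["common_abbrev"], df2["usps_abbrev"]))

# A single alternation wrapped in word boundaries, so only whole words match.
pattern = r"\b(?:" + "|".join(re.escape(key) for key in mapping) + r")\b"

# str.replace accepts a callable as repl; it receives each regex match object.
df1["address"] = df1["address"].str.replace(
    pattern, lambda m: mapping[m.group(0)], regex=True
)
print(df1)
```

Because the pattern is anchored with \b, an address such as "12 driveway lane" would be left unchanged, whereas the substring-based replacement would turn it into "12 drway lane".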