
Reverse geocoding from coordinate columns on Pandas dataframe

I have a Pandas dataframe, df, containing several columns of attributes describing various locations. One column, location_tuple, holds a tuple of (latitude, longitude) coordinate values for every row. I would like to create a new column in df containing the city name for each location, and wonder if anyone could suggest a convenient way to do this.

I am able to get the city name for a given row in df using the geopy package. After importing Nominatim from geopy.geocoders and creating a geolocator object with geolocator = Nominatim(user_agent="myapp"), I can get the city name for the coordinates in row zero by typing

geolocator.reverse(df.location_tuple[0]).raw['address']['city']

but I cannot find a way to apply this to every row and get a new column with the city names for the whole dataframe. I would appreciate some help with this.

Many thanks in advance!

A lambda expression can describe the entire process of getting the city from a location_tuple value:

lambda el: geolocator.reverse(el).raw["address"]["city"]

Passing this lambda to either list(map(...)) or the column's apply() method will work:

df["city"] = list(map(lambda el: geolocator.reverse(el).raw["address"]["city"], df["location_tuple"]))
df["city"] = df["location_tuple"].apply(lambda el: geolocator.reverse(el).raw["address"]["city"])
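One caveat with indexing ["address"]["city"] directly: Nominatim does not always return a "city" entry (smaller places may be reported under keys such as "town" or "village"), and reverse() can return None when nothing is found. A minimal sketch of a more forgiving extractor follows. The helper names extract_city and city_from_location and the fallback key list are assumptions made for illustration; they are not part of geopy.

```python
from typing import Any, Mapping, Optional

# Keys tried in order. This fallback list is an assumption for illustration,
# not an official geopy or Nominatim feature.
CITY_KEYS = ("city", "town", "village", "municipality")


def extract_city(raw: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Return the first city-like entry from a Nominatim-style raw dict, or None."""
    if not raw:
        return None
    address = raw.get("address", {})
    for key in CITY_KEYS:
        if key in address:
            return address[key]
    return None


def city_from_location(location: Any) -> Optional[str]:
    """Accept the object returned by geolocator.reverse(), which may be None."""
    return extract_city(getattr(location, "raw", None))
```

With this in place, the column could be filled with df["location_tuple"].apply(lambda el: city_from_location(geolocator.reverse(el))), leaving None in rows where no city-like key is present instead of raising a KeyError.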

Code (please provide sample data next time to make it easier for people to help):

from geopy.geocoders import Nominatim
import pandas as pd

df = pd.DataFrame(data={
    "location_tuple": [
        (25.0330, 121.5654),  # Taipei
        (52.3676, 4.9041)  # Amsterdam
    ]
})

# user_agent should identify your application; timeout is in seconds
geolocator = Nominatim(user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.116 Safari/537.36", timeout=3)

# One reverse-geocoding request is made per row
df["city"] = list(map(lambda el: geolocator.reverse(el).raw["address"]["city"], df["location_tuple"]))
# Alternative using apply():
# df["city"] = df["location_tuple"].apply(lambda el: geolocator.reverse(el).raw["address"]["city"])

Output:

df
Out[16]: 
       location_tuple       city
0  (25.033, 121.5654)        臺北市
1   (52.3676, 4.9041)  Amsterdam
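Because both approaches issue one request per row, a dataframe with many repeated coordinates would look up the same place again and again. A small sketch of one way to avoid that, using functools.lru_cache from the standard library, is shown below. The wrapper name cached_city_lookup is hypothetical, and the approach assumes the coordinates are stored as hashable tuples, as they are in location_tuple.

```python
from functools import lru_cache
from typing import Callable, Optional, Tuple

Coords = Tuple[float, float]


def cached_city_lookup(
    lookup: Callable[[Coords], Optional[str]],
    maxsize: Optional[int] = None,
) -> Callable[[Coords], Optional[str]]:
    """Wrap a per-coordinate lookup so each distinct (lat, lon) tuple is resolved once."""

    @lru_cache(maxsize=maxsize)
    def cached(coords: Coords) -> Optional[str]:
        # Only reached the first time a given tuple is seen.
        return lookup(coords)

    return cached
```

For example, get_city = cached_city_lookup(lambda el: geolocator.reverse(el).raw["address"]["city"]) followed by df["city"] = df["location_tuple"].apply(get_city) would send one request per distinct coordinate pair rather than one per row.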

Also consider passing language="en" to geolocator.reverse() so that the city names are returned in English.
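As a rough sketch of how the language argument could be fixed once and reused for every row, the helper below binds language="en" with functools.partial. The name make_city_lookup is hypothetical, and the function accepts any reverse callable that takes a language keyword (geolocator.reverse is the intended one), which also makes it possible to try out with a stub and no network access.

```python
from functools import partial
from typing import Any, Callable, Optional, Tuple

Coords = Tuple[float, float]


def make_city_lookup(
    reverse: Callable[..., Any],
    language: str = "en",
) -> Callable[[Coords], Optional[str]]:
    """Build a function that reverse-geocodes one coordinate pair in the given language."""
    localized_reverse = partial(reverse, language=language)

    def lookup(coords: Coords) -> Optional[str]:
        location = localized_reverse(coords)
        if location is None:  # no result for these coordinates
            return None
        # Returns None if the address has no "city" entry.
        return location.raw.get("address", {}).get("city")

    return lookup
```

It would be used as get_city = make_city_lookup(geolocator.reverse) and then df["city"] = df["location_tuple"].apply(get_city).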


 