Convert physical addresses to geographic locations (latitude and longitude)

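I have read a CSV file (containing customer addresses) and loaded the data into a DataFrame.

Description of the CSV file (or the DataFrame)

The DataFrame contains many rows and 5 columns

Sample of the data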

 Address1             Address3 Post_Code   City_Name                           Full_Address
 10000009    37 RUE DE LA GARE    L-7535      MERSCH       37 RUE DE LA GARE,L-7535, MERSCH
 10000009    37 RUE DE LA GARE    L-7535      MERSCH       37 RUE DE LA GARE,L-7535, MERSCH
 10000009    37 RUE DE LA GARE    L-7535      MERSCH       37 RUE DE LA GARE,L-7535, MERSCH
 10001998  RUE EDWARD STEICHEN    L-1855  LUXEMBOURG  RUE EDWARD STEICHEN,L-1855,LUXEMBOURG
 11000051       9 RUE DU BRILL    L-3898       FOETZ           9 RUE DU BRILL,L-3898 ,FOETZ

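I wrote some code (Geocode with Python) to convert the physical addresses to geographic locations (latitude and longitude), but it keeps raising several errors.

This is the code I have written so far: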

import folium
import pandas as pd
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

# Read the CSV, by the way the csv file contains 43 columns
ERP_Data = pd.read_csv("test.csv")  

# Extracting the address information into a new DataFrame
Address_info= ERP_Data[['Address1','Address3','Post_Code','City_Name']].copy()

# Adding a new column called (Full_Address) that concatenate address columns into one 
# for example   Karlaplan 13,115 20,STOCKHOLM,Stockholms län, Sweden
Address_info['Full_Address'] = Address_info[Address_info.columns[1:]].apply(
lambda x: ','.join(x.dropna().astype(str)), axis=1)

locator = Nominatim(user_agent="myGeocoder")  # holds the Geocoding service, Nominatim

# 1 - convenient function to delay between geocoding calls
geocode = RateLimiter(locator.geocode, min_delay_seconds=1) 

# 2- create location column
Address_info['location'] = Address_info['Full_Address'].apply(geocode)

# 3 - create longitude, latitude and altitude from location column (returns tuple)
Address_info['point'] = Address_info['location'].apply(lambda loc: tuple(loc.point) if loc else None)
# 4 - split point column into latitude, longitude and altitude columns
Address_info[['latitude', 'longitude', 'altitude']] =   pd.DataFrame(Address_info['point'].tolist(), index=Address_info.index)

# using Folium to map out the points we created

folium_map = folium.Map(location=[49.61167,6.13], zoom_start=12,)

An example of the full error output:

RateLimiter caught an error, retrying (0/2 tries). Called with (*('44 AVENUE JOHN FITZGERALD KENNEDY,L-1855,LUXEMBOURG',), **{}).
Traceback (most recent call last):
  File "e:\Anaconda3\lib\urllib\request.py", line 1317, in do_open
    encode_chunked=req.has_header('Transfer-encoding'))
  File "e:\Anaconda3\lib\http\client.py", line 1244, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "e:\Anaconda3\lib\http\client.py", line 1290, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "e:\Anaconda3\lib\http\client.py", line 1239, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "e:\Anaconda3\lib\http\client.py", line 1026, in _send_output
    self.send(msg)
  File "e:\Anaconda3\lib\http\client.py", line 966, in send
    self.connect()
  File "e:\Anaconda3\lib\http\client.py", line 1414, in connect
    server_hostname=server_hostname)
  File "e:\Anaconda3\lib\ssl.py", line 423, in wrap_socket
    session=session
  File "e:\Anaconda3\lib\ssl.py", line 870, in _create
    self.do_handshake()
  File "e:\Anaconda3\lib\ssl.py", line 1139, in do_handshake
    self._sslobj.do_handshake()
socket.timeout: _ssl.c:1059: The handshake operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "e:\Anaconda3\lib\site-packages\geopy\geocoders\base.py", line 355, in _call_geocoder
    page = requester(req, timeout=timeout, **kwargs)
  File "e:\Anaconda3\lib\urllib\request.py", line 525, in open
    response = self._open(req, data)
  File "e:\Anaconda3\lib\urllib\request.py", line 543, in _open
    '_open', req)
  File "e:\Anaconda3\lib\urllib\request.py", line 503, in _call_chain
    result = func(*args)
  File "e:\Anaconda3\lib\urllib\request.py", line 1360, in https_open
    context=self._context, check_hostname=self._check_hostname)
  File "e:\Anaconda3\lib\urllib\request.py", line 1319, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error _ssl.c:1059: The handshake operation timed out>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "e:\Anaconda3\lib\site-packages\geopy\extra\rate_limiter.py", line 126, in __call__
    return self.func(*args, **kwargs)
  File "e:\Anaconda3\lib\site-packages\geopy\geocoders\osm.py", line 387, in geocode
    self._call_geocoder(url, timeout=timeout), exactly_one
  File "e:\Anaconda3\lib\site-packages\geopy\geocoders\base.py", line 378, in _call_geocoder
    raise GeocoderTimedOut('Service timed out')
geopy.exc.GeocoderTimedOut: Service timed out

The expected output is

 Address1             Address3 Post_Code   City_Name                           Full_Address    Latitude  Longitude
 10000009    37 RUE DE LA GARE    L-7535      MERSCH       37 RUE DE LA GARE,L-7535, MERSCH  49.7508296  6.1085476
 10000009    37 RUE DE LA GARE    L-7535      MERSCH       37 RUE DE LA GARE,L-7535, MERSCH  49.7508296  6.1085476
 10000009    37 RUE DE LA GARE    L-7535      MERSCH       37 RUE DE LA GARE,L-7535, MERSCH  49.7508296  6.1085476
 10001998  RUE EDWARD STEICHEN    L-1855  LUXEMBOURG  RUE EDWARD STEICHEN,L-1855,LUXEMBOURG  49.6302147  6.1713374
 11000051       9 RUE DU BRILL    L-3898       FOETZ           9 RUE DU BRILL,L-3898 ,FOETZ  49.5217917  6.0101385

I have updated your code:

  • Added: Address_info = Address_info.apply(lambda x: x.str.strip(), axis=1)
    • Removes whitespace from the start and end of each string
  • Added a function with try-except to handle the lookup
import time

import pandas as pd
from geopy.exc import GeocoderTimedOut, GeocoderQuotaExceeded
from geopy.geocoders import Nominatim

ERP_Data = pd.read_csv("test.csv") 

# Extracting the address information into a new DataFrame
Address_info= ERP_Data[['Address1','Address3','Post_Code','City_Name']].copy()

# Clean existing whitespace from the ends of the strings
Address_info = Address_info.apply(lambda x: x.str.strip(), axis=1)  # ← added

# Adding a new column called (Full_Address) that concatenate address columns into one 
# for example   Karlaplan 13,115 20,STOCKHOLM,Stockholms län, Sweden
Address_info['Full_Address'] = Address_info[Address_info.columns[1:]].apply(lambda x: ','.join(x.dropna().astype(str)), axis=1)

locator = Nominatim(user_agent="myGeocoder")  # holds the Geocoding service, Nominatim

# 1 - convenient function to delay between geocoding calls
# geocode = RateLimiter(locator.geocode, min_delay_seconds=1)

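def geocode_me(location):
    # Pause before every request to stay under one request per second
    time.sleep(1.1)
    try:
        return locator.geocode(location)
    except (GeocoderTimedOut, GeocoderQuotaExceeded) as e:
        # Check the caught exception itself; the bare class name is always truthy
        if isinstance(e, GeocoderQuotaExceeded):
            print(e)
        else:
            print(f'Location not found: {e}')
        # Return None on either error so the later steps can skip this row
        return None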

# 2- create location column
Address_info['location'] = Address_info['Full_Address'].apply(lambda x: geocode_me(x))  # ← note the change here

# 3 - create longitude, latitude and altitude from location column (returns tuple)
Address_info['point'] = Address_info['location'].apply(lambda loc: tuple(loc.point) if loc else None)

# 4 - split point column into latitude, longitude and altitude columns
Address_info[['latitude', 'longitude', 'altitude']] =   pd.DataFrame(Address_info['point'].tolist(), index=Address_info.index)

Output:

 Address1                Address3 Post_Code   City_Name                             Full_Address                                                                                                                                    location                         point   latitude  longitude  altitude
 10000009       37 RUE DE LA GARE    L-7535      MERSCH          37 RUE DE LA GARE,L-7535,MERSCH                                                          (Rue de la Gare, Mersch, Canton Mersch, 7535, Lëtzebuerg, (49.7508296, 6.1085476))  (49.7508296, 6.1085476, 0.0)  49.750830   6.108548       0.0
 10000009       37 RUE DE LA GARE    L-7535      MERSCH          37 RUE DE LA GARE,L-7535,MERSCH                                                          (Rue de la Gare, Mersch, Canton Mersch, 7535, Lëtzebuerg, (49.7508296, 6.1085476))  (49.7508296, 6.1085476, 0.0)  49.750830   6.108548       0.0
 10000009       37 RUE DE LA GARE    L-7535      MERSCH          37 RUE DE LA GARE,L-7535,MERSCH                                                          (Rue de la Gare, Mersch, Canton Mersch, 7535, Lëtzebuerg, (49.7508296, 6.1085476))  (49.7508296, 6.1085476, 0.0)  49.750830   6.108548       0.0
 10001998     RUE EDWARD STEICHEN    L-1855  LUXEMBOURG    RUE EDWARD STEICHEN,L-1855,LUXEMBOURG  (Rue Edward Steichen, Grünewald, Weimershof, Neudorf-Weimershof, Luxembourg, Canton Luxembourg, 2540, Lëtzebuerg, (49.6302147, 6.1713374))  (49.6302147, 6.1713374, 0.0)  49.630215   6.171337       0.0
 11000051          9 RUE DU BRILL    L-3898       FOETZ              9 RUE DU BRILL,L-3898,FOETZ                                             (Rue du Brill, Mondercange, Canton Esch-sur-Alzette, 3898, Luxembourg, (49.5217917, 6.0101385))  (49.5217917, 6.0101385, 0.0)  49.521792   6.010139       0.0
 10000052  3 RUE DU PUITS  ROMAIN    L-8070   BERTRANGE  3 RUE DU PUITS  ROMAIN,L-8070,BERTRANGE                              (Rue du Puits Romain, Z.A. Bourmicht, Bertrange, Canton Luxembourg, 8070, Lëtzebuerg, (49.6084531, 6.0771901))  (49.6084531, 6.0771901, 0.0)  49.608453   6.077190       0.0
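The last row above still shows a double space inside "3 RUE DU PUITS  ROMAIN", because str.strip() only trims the ends of a string. A minimal sketch of a stricter clean-up follows; build_full_address is a hypothetical helper name, not part of the code above, and it assumes missing cells arrive as None or float NaN.

```python
import re


def build_full_address(parts):
    """Join address parts with commas after normalizing whitespace."""
    cleaned = []
    for part in parts:
        # Skip missing values: None, or float NaN (NaN is never equal to itself)
        if part is None or part != part:
            continue
        # Collapse runs of whitespace into one space and trim both ends
        text = re.sub(r"\s+", " ", str(part)).strip()
        if text:
            cleaned.append(text)
    return ",".join(cleaned)


print(build_full_address(["3 RUE DU PUITS  ROMAIN", "L-8070 ", " BERTRANGE "]))
# prints: 3 RUE DU PUITS ROMAIN,L-8070,BERTRANGE
```

Under the same assumptions it could stand in for the lambda used to build Full_Address, for example Address_info[Address_info.columns[1:]].apply(build_full_address, axis=1).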

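Notes and additional resources:

  • The output in the traceback includes the address that caused the error
    • RateLimiter caught an error, retrying (0/2 tries). Called with (*('3 RUE DU PUITS ROMAIN,L-8070,BERTRANGE ',)
    • Note all the extra whitespace in the address. I added a line of code to remove whitespace from the start and end of the strings
  • GeocoderTimedOut, a real pain?
  • Geopy: catch timeout error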
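One common way to deal with occasional timeouts is to retry with a growing delay instead of giving up after the first failure. The sketch below is only an illustration of that idea: call_with_retries and its parameters are hypothetical names, and it is written against a generic callable so that it runs without network access.

```python
import time


def call_with_retries(func, arg, retry_on=(Exception,), max_tries=3,
                      base_delay=2.0, sleep=time.sleep):
    """Call func(arg); on a retryable error, wait and try again.

    Returns None when every attempt fails.
    """
    for attempt in range(max_tries):
        try:
            return func(arg)
        except retry_on as exc:
            if attempt == max_tries - 1:
                print(f"Giving up on {arg!r}: {exc}")
                return None
            # Wait longer after each failure: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** attempt)
    return None
```

With geopy this might be called as call_with_retries(locator.geocode, address, retry_on=(GeocoderTimedOut,)). geopy's own RateLimiter also takes max_retries and error_wait_seconds arguments, and Nominatim accepts a timeout argument, so raising those may be enough on a slow connection.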

Finally:

  • The end result is that the service times out because of HTTP Error 429: Too Many Requests for the day.
  • See the Nominatim Usage Policy
  • Suggestion: use a different geocoder
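Whichever geocoder is used, the sample data repeats the same address several times, so sending each distinct address only once should reduce the number of requests. A minimal sketch, assuming the geocoder is passed in as a plain callable; geocode_unique and fake_geocode are hypothetical names, and the stand-in returns placeholder values rather than real coordinates.

```python
def geocode_unique(addresses, geocode):
    """Geocode each distinct address once and return {address: result}."""
    cache = {}
    for address in addresses:
        if address not in cache:
            cache[address] = geocode(address)
    return cache


requested = []


def fake_geocode(address):
    # Stand-in for a real geocoder call; records the request, no network access
    requested.append(address)
    return ("placeholder result for", address)


rows = ["37 RUE DE LA GARE,L-7535,MERSCH"] * 3 + [
    "RUE EDWARD STEICHEN,L-1855,LUXEMBOURG",
    "9 RUE DU BRILL,L-3898,FOETZ",
]
cache = geocode_unique(rows, fake_geocode)
print(f"{len(rows)} rows, {len(requested)} requests")
# prints: 5 rows, 3 requests
```

With the DataFrame from the code above, the results could then be mapped back onto every row with Address_info['Full_Address'].map(cache), where cache comes from geocode_unique(Address_info['Full_Address'], geocode_me).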
