Assigning City Name by Latitude/Longitude values in Pandas Dataframe
I have this DataFrame:
userId latitude longitude dateTime
0 121165 30.314368 76.384381 2018-02-01 00:01:57
1 95592 13.186810 77.643769 2018-02-01 00:02:17
2 111435 28.512889 77.088154 2018-02-01 00:04:02
3 129532 9.828420 76.310357 2018-02-01 00:06:03
4 95592 13.121986 77.610539 2018-02-01 00:08:54
I want to create a new DataFrame column, for example:
userId latitude longitude dateTime city
0 121165 30.314368 76.384381 2018-02-01 00:01:57 Bengaluru
1 95592 13.186810 77.643769 2018-02-01 00:02:17 Delhi
2 111435 28.512889 77.088154 2018-02-01 00:04:02 Mumbai
3 129532 9.828420 76.310357 2018-02-01 00:06:03 Chennai
4 95592 13.121986 77.610539 2018-02-01 00:08:54 Delhi
This is the code I am using, taken from an existing answer:
from urllib2 import urlopen
import json

def getplace(lat, lon):
    url = "http://maps.googleapis.com/maps/api/geocode/json?"
    url += "latlng=%s,%s&sensor=false" % (lat, lon)
    v = urlopen(url).read()
    j = json.loads(v)
    components = j['results'][0]['address_components']
    country = town = None
    for c in components:
        if "country" in c['types']:
            country = c['long_name']
        if "postal_town" in c['types']:
            town = c['long_name']
    return town, country

for i,j in df['latitude'], df['longitude']:
    getplace(i, j)
I get an error at this line:
components = j['results'][0]['address_components']
list index out of range
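I tried some other latitude/longitude values from the UK and got results, but it does not work for locations in Indian states.
So now I want to try something like this: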
if i,j in zip(range(79,80),range(83,84)):
    df['City']='Bengaluru'
elif i,j in zip(range(13,14),range(70,71)):
    df['City']='Delhi'
and so on. So how can I assign the city from the latitude and longitude values in a more feasible way?
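The snippet you are using is from 2013; the Google API has changed since then, and 'postal_town' is no longer available.
You can use the following code instead, which uses the requests library and guards against the case where no results are returned.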
In [47]: import requests

In [48]: def location(lat, long):
    ...:     url = 'http://maps.googleapis.com/maps/api/geocode/json?latlng={0},{1}&sensor=false'.format(lat, long)
    ...:     r = requests.get(url)
    ...:     r_json = r.json()
    ...:     # guard: an empty result list would otherwise raise "list index out of range"
    ...:     if len(r_json['results']) < 1: return None, None
    ...:     res = r_json['results'][0]['address_components']
    ...:     country = next((c['long_name'] for c in res if 'country' in c['types']), None)
    ...:     locality = next((c['long_name'] for c in res if 'locality' in c['types']), None)
    ...:     return locality, country
    ...:
In [49]: location(28.512889, 77.088154)
Out[49]: ('Gurugram', 'India')
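This function looks for 'locality', which in fact returns nothing for the second row of the DataFrame. You can choose the field you want by inspecting the result (this one is for the lat, long values 28.512889, 77.088154):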
[{'long_name': 'Udyog Vihar',
'short_name': 'Udyog Vihar',
'types': ['political', 'sublocality', 'sublocality_level_2']},
{'long_name': 'Kapas Hera Estate',
'short_name': 'Kapas Hera Estate',
'types': ['political', 'sublocality', 'sublocality_level_1']},
{'long_name': 'Gurugram',
'short_name': 'Gurugram',
'types': ['locality', 'political']},
{'long_name': 'Gurgaon',
'short_name': 'Gurgaon',
'types': ['administrative_area_level_2', 'political']},
{'long_name': 'Haryana',
'short_name': 'HR',
'types': ['administrative_area_level_1', 'political']},
{'long_name': 'India', 'short_name': 'IN', 'types': ['country', 'political']},
{'long_name': '122016', 'short_name': '122016', 'types': ['postal_code']}]
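Picking a field from that list can be sketched as follows. This is a minimal sketch and not part of the original answer: the helper name `pick_component` and the fallback order are assumptions, and it works on an already-parsed `address_components` list, so no request is made.

```python
def pick_component(components, wanted_types):
    """Return the long_name of the first component matching any wanted type.

    `wanted_types` is tried in order, so earlier entries take priority.
    Returns None when nothing matches.
    """
    for wanted in wanted_types:
        for component in components:
            if wanted in component['types']:
                return component['long_name']
    return None


# Trimmed copy of the address_components listing shown above.
components = [
    {'long_name': 'Udyog Vihar', 'short_name': 'Udyog Vihar',
     'types': ['political', 'sublocality', 'sublocality_level_2']},
    {'long_name': 'Gurugram', 'short_name': 'Gurugram',
     'types': ['locality', 'political']},
    {'long_name': 'Gurgaon', 'short_name': 'Gurgaon',
     'types': ['administrative_area_level_2', 'political']},
    {'long_name': 'India', 'short_name': 'IN',
     'types': ['country', 'political']},
]

# Prefer the locality; fall back to the district-level area if it is missing.
city = pick_component(components, ['locality', 'administrative_area_level_2'])
country = pick_component(components, ['country'])
print(city, country)  # Gurugram India
```

With a fallback list like this, a response that has no 'locality' entry would yield its 'administrative_area_level_2' name instead of None, provided the response contains one.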
To apply this to the DataFrame, you can use numpy's vectorize like this (keep in mind that the second row will return nothing):
In [71]: import numpy as np
In [72]: df['locality'] = np.vectorize(location)(df['latitude'], df['longitude'])[0]  # location returns (locality, country); [0] keeps the locality array
In [73]: df
Out[73]:
userId latitude longitude dateTime locality
0 121165 30.314368 76.384381 2018-02-01 00:01:57 Patiala
1 95592 13.186810 77.643769 2018-02-01 00:02:17 None
2 111435 28.512889 77.088154 2018-02-01 00:04:02 Gurugram
3 129532 9.828420 76.310357 2018-02-01 00:06:03 Ezhupunna
4 95592 13.121986 77.610539 2018-02-01 00:08:54 Bengaluru
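PS: I noticed that the cities in your desired output do not match their coordinates.
PPS: Keep in mind that this may take some time, because the function has to query the API once per row.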
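One way to reduce the number of API calls is to cache results, so that repeated coordinates are looked up only once. A minimal sketch, assuming that rounding to three decimal places is precise enough for a city-level lookup; `cached_lookup` and `fake_location` are hypothetical names, and `fake_location` stands in for the API-backed `location` function so that the example runs without network access.

```python
from functools import lru_cache

calls = []  # records every uncached lookup, to show the cache working


def fake_location(lat, long):
    """Stand-in for the API-backed location(); swap in the real function."""
    calls.append((lat, long))
    return 'city@{0},{1}'.format(lat, long)


@lru_cache(maxsize=None)
def _lookup_rounded(lat, long):
    # Only reached when this (lat, long) pair has not been seen before.
    return fake_location(lat, long)


def cached_lookup(lat, long, ndigits=3):
    """Round the coordinates, then look them up through the cache."""
    return _lookup_rounded(round(lat, ndigits), round(long, ndigits))


cached_lookup(13.186810, 77.643769)
cached_lookup(13.186811, 77.643770)  # rounds to the same key, served from cache
cached_lookup(28.512889, 77.088154)
print(len(calls))  # 2
```

The rounding precision is a trade-off: fewer digits means more cache hits but a coarser position, so it should be chosen to suit the data.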
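You can also write a coarser, range-based lookup function, but it will be very rough and will probably cover too wide an area. You can then apply it the same way as shown before: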
In [21]: def location(lat, long):
    ...:     if 9 <= lat < 10 and 76 <= long < 77:
    ...:         return 'Chennai'
    ...:     elif 13 <= lat < 14 and 77 <= long < 78:
    ...:         return 'Delhi'
    ...:     elif 28 <= lat < 29 and 77 <= long < 78:
    ...:         return 'Mumbai'
    ...:     elif 30 <= lat < 31 and 76 <= long < 77:
    ...:         return 'Bengaluru'
    ...:
In [22]: df['city'] = np.vectorize(location)(df['latitude'], df['longitude'])
In [23]: df
Out[23]:
userId latitude longitude dateTime city
0 121165 30.314368 76.384381 2018-02-01 00:01:57 Bengaluru
1 95592 13.186810 77.643769 2018-02-01 00:02:17 Delhi
2 111435 28.512889 77.088154 2018-02-01 00:04:02 Mumbai
3 129532 9.828420 76.310357 2018-02-01 00:06:03 Chennai
4 95592 13.121986 77.610539 2018-02-01 00:08:54 Delhi
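If the range boxes turn out to be too coarse, a different offline technique (not the one used above) is to assign each point to the nearest entry in a small table of city centres, using the haversine great-circle distance. A minimal sketch: the `CITY_CENTRES` table, its approximate coordinates, and the 50 km cutoff are all assumptions chosen for illustration.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate city-centre coordinates (assumed values, for illustration only).
CITY_CENTRES = {
    'Delhi': (28.6139, 77.2090),
    'Mumbai': (19.0760, 72.8777),
    'Bengaluru': (12.9716, 77.5946),
    'Chennai': (13.0827, 80.2707),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))  # 6371 km: mean Earth radius


def nearest_city(lat, long, max_km=50.0):
    """Return the closest city in CITY_CENTRES, or None if none is within max_km."""
    name, dist = min(
        ((city, haversine_km(lat, long, c_lat, c_long))
         for city, (c_lat, c_long) in CITY_CENTRES.items()),
        key=lambda pair: pair[1],
    )
    return name if dist <= max_km else None


print(nearest_city(13.121986, 77.610539))  # Bengaluru
print(nearest_city(28.512889, 77.088154))  # Delhi
print(nearest_city(30.314368, 76.384381))  # None (no listed city within 50 km)
```

Unlike the one-degree boxes, this returns None for points far from every listed city, so the table has to include every city you expect to see in the data.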