
Assigning City Name by Latitude/Longitude values in Pandas Dataframe

I have this data frame:

    userId      latitude    longitude        dateTime
0   121165      30.314368   76.384381   2018-02-01 00:01:57
1   95592       13.186810   77.643769   2018-02-01 00:02:17
2   111435      28.512889   77.088154   2018-02-01 00:04:02
3   129532      9.828420    76.310357   2018-02-01 00:06:03
4   95592       13.121986   77.610539   2018-02-01 00:08:54

I want to create a new dataframe column like:

    userId      latitude    longitude        dateTime             city
0   121165      30.314368   76.384381   2018-02-01 00:01:57   Bengaluru
1   95592       13.186810   77.643769   2018-02-01 00:02:17   Delhi
2   111435      28.512889   77.088154   2018-02-01 00:04:02   Mumbai
3   129532      9.828420    76.310357   2018-02-01 00:06:03   Chennai
4   95592       13.121986   77.610539   2018-02-01 00:08:54   Delhi

I saw this code here, but it's not working out.

This is the code given there:

from urllib2 import urlopen
import json
def getplace(lat, lon):
    url = "http://maps.googleapis.com/maps/api/geocode/json?"
    url += "latlng=%s,%s&sensor=false" % (lat, lon)
    v = urlopen(url).read()
    j = json.loads(v)
    components = j['results'][0]['address_components']
    country = town = None
    for c in components:
        if "country" in c['types']:
            country = c['long_name']
        if "postal_town" in c['types']:
            town = c['long_name']
    return town, country
for i,j in df['latitude'], df['longitude']:
    getplace(i, j)

I get an error at this line:

components = j['results'][0]['address_components']

list index out of range

I tried some other latitude/longitude values from the UK and it worked, but it does not work for locations in Indian states.

So now I want to try out something like this:

if i,j in zip(range(79,80),range(83,84)):
    df['City']='Bengaluru'
elif i,j in zip(range(13,14),range(70,71)):
    df['City']='Delhi'

and so on. So how can I assign a city in a more feasible manner using the latitude and longitude values?

The code snippet that you are using is from 2013; the Google API has changed since then and 'postal_town' is no longer available.

You can use the following code, which uses the requests library and guards against the case where no results are returned.

In [47]: import requests

In [48]: def location(lat, long):
    ...:     url = 'http://maps.googleapis.com/maps/api/geocode/json?latlng={0},{1}&sensor=false'.format(lat, long)
    ...:     r = requests.get(url)
    ...:     r_json = r.json()
    ...:     if len(r_json['results']) < 1: return None, None
    ...:     res = r_json['results'][0]['address_components']
    ...:     country  = next((c['long_name'] for c in res if 'country' in c['types']), None)
    ...:     locality = next((c['long_name'] for c in res if 'locality' in c['types']), None)
    ...:     return locality, country
    ...:

In [49]: location(28.512889, 77.088154)
Out[49]: ('Gurugram', 'India')
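
The Geocoding API response also carries a top-level `status` field (for example `OK`, `ZERO_RESULTS`, `OVER_QUERY_LIMIT` or `REQUEST_DENIED`), and an empty `results` list is what produces the "list index out of range" error in the question. Below is a minimal sketch of a guard that can be tested without calling the API. The helper name `first_components` and the sample payloads are made up for illustration.

```python
def first_components(payload):
    """Return the address components of the first result, or [] if none."""
    status = payload.get("status")
    results = payload.get("results") or []
    if status != "OK" or not results:
        # Nothing usable came back; report why instead of raising IndexError.
        print("geocoding failed:", status, payload.get("error_message", ""))
        return []
    return results[0].get("address_components", [])


# Hand-written payloads that mimic the shape of the API response.
ok = {"status": "OK",
      "results": [{"address_components": [
          {"long_name": "India", "short_name": "IN",
           "types": ["country", "political"]}]}]}
empty = {"status": "ZERO_RESULTS", "results": []}

print(first_components(ok)[0]["long_name"])
print(first_components(empty))
```

In the function above, the parsed response (`r.json()`) could be passed to such a helper before looking up the individual fields.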

This function looks for the 'locality' component and actually doesn't return a locality for the 2nd row of the DataFrame. You can choose which fields you want by inspecting the results (this is for the lat, long value 28.512889, 77.088154):

[{'long_name': 'Udyog Vihar',
  'short_name': 'Udyog Vihar',
  'types': ['political', 'sublocality', 'sublocality_level_2']},
 {'long_name': 'Kapas Hera Estate',
  'short_name': 'Kapas Hera Estate',
  'types': ['political', 'sublocality', 'sublocality_level_1']},
 {'long_name': 'Gurugram',
  'short_name': 'Gurugram',
  'types': ['locality', 'political']},
 {'long_name': 'Gurgaon',
  'short_name': 'Gurgaon',
  'types': ['administrative_area_level_2', 'political']},
 {'long_name': 'Haryana',
  'short_name': 'HR',
  'types': ['administrative_area_level_1', 'political']},
 {'long_name': 'India', 'short_name': 'IN', 'types': ['country', 'political']},
 {'long_name': '122016', 'short_name': '122016', 'types': ['postal_code']}]
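
Picking a field from a list like the one above can be wrapped in a small helper that tries several component types in order, which helps for rows where no 'locality' is returned. This is only a sketch: the helper name `pick_component` and the fallback order are assumptions, and the sample data is a shortened copy of the listing above.

```python
def pick_component(components, wanted_types):
    """Return long_name of the first component matching any wanted type.

    Types are tried in the order given, so earlier entries take priority.
    """
    for wanted in wanted_types:
        for comp in components:
            if wanted in comp["types"]:
                return comp["long_name"]
    return None


components = [
    {"long_name": "Udyog Vihar", "short_name": "Udyog Vihar",
     "types": ["political", "sublocality", "sublocality_level_2"]},
    {"long_name": "Gurugram", "short_name": "Gurugram",
     "types": ["locality", "political"]},
    {"long_name": "Gurgaon", "short_name": "Gurgaon",
     "types": ["administrative_area_level_2", "political"]},
    {"long_name": "Haryana", "short_name": "HR",
     "types": ["administrative_area_level_1", "political"]},
    {"long_name": "India", "short_name": "IN",
     "types": ["country", "political"]},
]

print(pick_component(components, ["locality", "administrative_area_level_2"]))  # Gurugram
# Without the locality entry, the lookup falls back to the district.
no_locality = [c for c in components if "locality" not in c["types"]]
print(pick_component(no_locality, ["locality", "administrative_area_level_2"]))  # Gurgaon
```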

To apply this to your DataFrame, you can use numpy's vectorize like so (remember that the second row won't return anything):

In [71]: import numpy as np

In [72]: df['locality'] = np.vectorize(location)(df['latitude'], df['longitude'])[0]  # [0] is the locality array, [1] the country

In [73]: df
Out[73]:
   userId   latitude  longitude             dateTime   locality
0  121165  30.314368  76.384381  2018-02-01 00:01:57    Patiala
1   95592  13.186810  77.643769  2018-02-01 00:02:17       None
2  111435  28.512889  77.088154  2018-02-01 00:04:02   Gurugram
3  129532   9.828420  76.310357  2018-02-01 00:06:03  Ezhupunna
4   95592  13.121986  77.610539  2018-02-01 00:08:54  Bengaluru

P.S. I noted that the city locations of the desired output aren't correct.

P.P.S. You should also note that this may take some time, as the function needs to query the API once for every row.
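
Since every call costs a network round trip, repeated coordinates are worth caching. Here is a minimal sketch using `functools.lru_cache`. `slow_lookup` is a hypothetical stand-in for the API-backed function so that the example runs offline, and rounding to 3 decimals (roughly 100 m of latitude) is an arbitrary assumption.

```python
from functools import lru_cache

calls = []  # records every time the expensive lookup really runs


def slow_lookup(lat, lon):
    """Stand-in for the API-backed location() function (hypothetical)."""
    calls.append((lat, lon))
    return "place@{:.3f},{:.3f}".format(lat, lon)


@lru_cache(maxsize=None)
def _cached(lat, lon):
    return slow_lookup(lat, lon)


def cached_lookup(lat, lon, ndigits=3):
    # Round first so nearly identical points share one cache entry.
    return _cached(round(lat, ndigits), round(lon, ndigits))


points = [(13.186810, 77.643769), (13.186812, 77.643771), (28.512889, 77.088154)]
names = [cached_lookup(lat, lon) for lat, lon in points]
print(len(calls))  # 2: the first two points round to the same key
```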

You can also create the location function with broader ranges, but it will be very crude and you might cover too wide an area. You can then use the function in the same way as previously shown:

In [21]: def location(lat, long):
    ...:     if 9 <= lat < 10 and 76 <= long < 77:
    ...:         return 'Chennai'
    ...:     elif 13 <= lat < 14 and 77 <= long < 78:
    ...:         return 'Delhi'
    ...:     elif 28 <= lat < 29 and 77 <= long < 78:
    ...:         return 'Mumbai'
    ...:     elif 30 <= lat < 31 and 76 <= long < 77:
    ...:         return 'Bengaluru'
    ...:     

In [22]: df['city'] = np.vectorize(location)(df['latitude'], df['longitude'])

In [23]: df
Out[23]: 
   userId   latitude  longitude             dateTime       city
0  121165  30.314368  76.384381  2018-02-01 00:01:57  Bengaluru
1   95592  13.186810  77.643769  2018-02-01 00:02:17      Delhi
2  111435  28.512889  77.088154  2018-02-01 00:04:02     Mumbai
3  129532   9.828420  76.310357  2018-02-01 00:06:03    Chennai
4   95592  13.121986  77.610539  2018-02-01 00:08:54      Delhi
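
A less crude alternative to hand-written ranges is to pick the nearest city from a small reference table using the haversine (great-circle) distance. Below is a minimal sketch in plain Python. The `CITIES` table, the function names and the `max_km` cutoff are assumptions for illustration; the coordinates are approximate and the table would need to be replaced with the cities that matter for your data.

```python
from math import asin, cos, radians, sin, sqrt

# Approximate city-centre coordinates (illustrative, not authoritative).
CITIES = {
    "Bengaluru": (12.9716, 77.5946),
    "Delhi": (28.6139, 77.2090),
    "Mumbai": (19.0760, 72.8777),
    "Chennai": (13.0827, 80.2707),
    "Kochi": (9.9312, 76.2673),
    "Chandigarh": (30.7333, 76.7794),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def nearest_city(lat, lon, max_km=150):
    """Return the closest city in CITIES, or None if it is beyond max_km."""
    name, dist = min(
        ((n, haversine_km(lat, lon, clat, clon)) for n, (clat, clon) in CITIES.items()),
        key=lambda pair: pair[1],
    )
    return name if dist <= max_km else None


print(nearest_city(13.186810, 77.643769))  # Bengaluru
print(nearest_city(28.512889, 77.088154))  # Delhi
```

With the DataFrame from the question, this could be applied as `df['city'] = [nearest_city(lat, lon) for lat, lon in zip(df['latitude'], df['longitude'])]`, which needs no network access.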
