Calculating distance of latitude/longitude on pandas dataframe using GeoPy

I'm trying to calculate the distance between consecutive latitude/longitude points in a pandas DataFrame using geopy.

Here is my DataFrame:

    latitude    longitude   altitude
    -15.836310  -48.020298  1137.199951
    -15.836360  -48.020512  1136.400024
    -15.836415  -48.020582  1136.400024
    -15.836439  -48.020610  1136.400024
    -15.836488  -48.020628  1136.599976

I tried two different ways:

from geopy import distance

for i in range(1, len(df)):
   before = (df.loc[i-1, 'latitude'], df.loc[i-1, 'longitude'])
   actual = (df.loc[i, 'latitude'], df.loc[i, 'longitude'])
   df.loc[i, 'geodesic'] = distance.distance(before, actual).miles

Error:

 KeyError: 0

Apparently, df.loc[i, 'column_name'] does not work.
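One possible cause of KeyError: 0 is sketched below, assuming the DataFrame is indexed by something other than 0, 1, 2, ... (the "time" column and its values here are hypothetical). df.loc looks rows up by label, so it fails when no row is labelled 0, whereas df.iloc looks rows up by position and does not depend on the labels.

```python
import pandas as pd

# Hypothetical data: the question's coordinates, indexed by a made-up "time" column
df = pd.DataFrame(
    {
        "time": ["10:00:00", "10:00:01", "10:00:02"],
        "latitude": [-15.836310, -15.836360, -15.836415],
        "longitude": [-48.020298, -48.020512, -48.020582],
    }
).set_index("time")

# .loc is label-based: there is no row labelled 0, so this raises KeyError
try:
    df.loc[0, "latitude"]
    loc_failed = False
except KeyError:
    loc_failed = True

# .iloc is position-based and works regardless of the index labels
lat_col = df.columns.get_loc("latitude")
first_latitude = df.iloc[0, lat_col]

print(loc_failed, first_latitude)
```

If this is the situation, the loop can either switch to df.iloc or the index can be reset to plain integers first.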

And:

from geopy import distance

df['geodesic'] = distance.distance((df.latitude.shift(1), df.longitude.shift(1)), (df.latitude, df.longitude)).miles

Error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
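This ValueError appears because distance.distance is handed whole Series where it expects a single point per argument. One vectorized alternative, shown here only as a sketch, is the haversine formula written with NumPy. This is a swapped-in technique, not geopy: it computes a great-circle distance on a sphere, so the results can differ slightly from geopy's geodesic distance. The function name haversine_miles and the Earth radius constant are assumptions made for this example.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of the spherical model


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles; accepts scalars or aligned Series in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(a))


df = pd.DataFrame(
    {
        "latitude": [-15.836310, -15.836360, -15.836415, -15.836439, -15.836488],
        "longitude": [-48.020298, -48.020512, -48.020582, -48.020610, -48.020628],
    }
)

# Distance from the previous row to the current row.
# The first row has no predecessor, so its value is NaN.
df["haversine_miles"] = haversine_miles(
    df.latitude.shift(1), df.longitude.shift(1), df.latitude, df.longitude
)

print(df)
```

Because every step is an element-wise NumPy operation, the whole column is computed without a Python loop.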

Example from the official GeoPy documentation:

from geopy import distance
newport_ri = (41.49008, -71.312796)
cleveland_oh = (41.499498, -81.695391)
print(distance.distance(newport_ri, cleveland_oh).miles)

I found the cause of the error.

1 - I had to check whether latitude or longitude is NaN.

2 - I couldn't set time as the index. (I don't know why; that took a long time to discover.)

Once I had checked these two things, the error was gone.
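The two checks above might look like the following minimal sketch. The sample rows, the NaN value, and the "time" column are made up for illustration. The idea is to drop rows with missing coordinates and restore a 0-based integer index, so that a loop using df.loc[i, ...] finds the labels it asks for.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing coordinate and a time index
df = pd.DataFrame(
    {
        "time": ["10:00:00", "10:00:01", "10:00:02", "10:00:03"],
        "latitude": [-15.836310, np.nan, -15.836415, -15.836439],
        "longitude": [-48.020298, -48.020512, -48.020582, -48.020610],
    }
).set_index("time")

# 1 - drop rows where latitude or longitude is NaN
clean = df.dropna(subset=["latitude", "longitude"])

# 2 - move time back to an ordinary column so the index is 0, 1, 2, ...
clean = clean.reset_index()

print(clean)
```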

raw = """latitude;longitude;altitude
-15.836310;-48.020298;1137.199951
-15.836360;-48.020512;1136.400024
-15.836415;-48.020582;1136.400024
-15.836439;-48.020610;1136.400024
-15.836488;-48.020628;1136.599976"""

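import pandas as pd
from io import StringIO
from geopy import distance

# Load the inline sample data
data = StringIO(raw)
df = pd.read_csv(data, sep=";")
# Drop altitude, which is not needed for a 2D distance
df1 = df.drop(['altitude'], axis=1)
# Build one (latitude, longitude) tuple per row
locations = df1.apply(tuple, axis=1)

# Print the distance in miles between each pair of consecutive points
for counter in range(len(locations) - 1):
    print(distance.distance(locations[counter], locations[counter + 1]).miles)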

From df = pd.read_csv(data, sep=";") onward, the code is the same as yours; I only added the inline data to make it testable.

After that, df1 = df.drop(['altitude'], axis=1) drops the altitude column (the z axis), which is not needed in this application.

Convert each row of df1 to a (latitude, longitude) tuple, loop through locations, and you get the distance between each pair of consecutive points.
