Calculate km with latitude and longitude from different DataFrames in Python Pandas
I have 4 DataFrames (ticket_data.csv, providers.csv, stations.csv and cities.csv).
In stations.csv I have 2 columns called o_city (origin city) and d_city (destination city). These two columns give me the id of the city I need to look up in cities.csv.
In cities.csv I have the latitude and longitude of each city.
How can I calculate the distance between o_city and d_city for each ticket? I tried to use pyproj, but I couldn't find a way to make it work for each ticket.
Welcome to StackOverflow! In your cities dataframe, assumed here to be called city_df, you can use the haversine formula for each row. It comes from spherical trigonometry and gives the great-circle distance between two coordinate pairs on Earth's surface.
Here is some dummy Python 3 code showing roughly how you might go about this (using just two pairs of coordinates to keep it simple):
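from haversine import haversine
# Column names are placeholders; haversine() takes two (lat, lon) tuples and returns km by default
distance = haversine((city_df["origin_lat"][0], city_df["origin_lon"][0]), (city_df["destination_lat"][0], city_df["destination_lon"][0]))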
The coordinates must be in decimal degree notation, as in 43.9202, rather than degrees-minutes-seconds notation such as 43° 55' 12.7". Given this, the output value of distance will be in km.
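If the coordinates happen to be stored as degrees, minutes and seconds, they can be converted first. This is only a minimal sketch; dms_to_decimal and its hemisphere argument are hypothetical names made up for this illustration, not part of haversine or pandas:

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere="N"):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    decimal = abs(degrees) + minutes / 60 + seconds / 3600
    # By convention, southern latitudes and western longitudes are negative
    return -decimal if hemisphere.upper() in ("S", "W") else decimal


lat = dms_to_decimal(43, 55, 12.72, "N")
print(round(lat, 4))  # 43.9202
```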
Hope this helps you get closer to solving your problem!
PS: you may need to install haversine (for example with pip install haversine), as it is not in the standard library.
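To apply the same idea to every ticket rather than a single pair, one option is to look up both cities' coordinates by id and compute the haversine formula with NumPy, which also avoids installing the haversine package. The sketch below rests on assumptions: the column names id, latitude and longitude, the small made-up tables, and the helper haversine_km are all hypothetical stand-ins, since the real CSV layout is not shown.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km for scalars or arrays of decimal degrees."""
    lat1, lon1, lat2, lon2 = (
        np.radians(np.asarray(v, dtype=float)) for v in (lat1, lon1, lat2, lon2)
    )
    a = (
        np.sin((lat2 - lat1) / 2) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# Hypothetical stand-in for cities.csv (column names are assumptions)
cities = pd.DataFrame({
    "id": [1, 2, 3],
    "latitude": [48.8566, 45.7640, 43.2965],
    "longitude": [2.3522, 4.8357, 5.3698],
})
# Hypothetical stand-in for the table that holds o_city and d_city
tickets = pd.DataFrame({"o_city": [1, 1, 2], "d_city": [2, 3, 3]})

# Index the coordinates by city id, then look up each ticket's two cities
coords = cities.set_index("id")[["latitude", "longitude"]]
origin = coords.loc[tickets["o_city"]].to_numpy()
destination = coords.loc[tickets["d_city"]].to_numpy()

tickets["distance_km"] = haversine_km(
    origin[:, 0], origin[:, 1], destination[:, 0], destination[:, 1]
)
print(tickets)
```

Because the formula is applied to whole arrays at once, this should scale to a large ticket table better than looping over rows and calling a distance function per ticket.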