Calculate the number of points in a given radius by latitude and longitude
I have a dataframe of points, each with an id and a latitude/longitude:
df = pd.DataFrame({'id':list('abcde'),'latitude': [38.470628, 37.994155, 38.66937, 34.119578, 36.292307],'longitude': [-121.404586, -121.802341, -121.295325, -117.413791, -119.804074]}) #sample
For each id I need to count the number of points (from the same dataset) that are located within a radius of 2 miles of it.
Question: what is the simplest way to do this in Python?
The question is somewhat ambiguous. The first component you need is a function to calculate the distance between two coordinates. This requires some trigonometry, and several implementations can be found in the following questions.
Once you have the function, simply loop over all points and calculate the distances. There may be more efficient ways than two nested loops, but this is the simplest.
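The distance function and nested loop described above could look roughly like the sketch below. The names haversine_miles and count_within are made up for this illustration, the haversine formula is one common choice for the distance function (it is not necessarily what the linked questions use), and the Earth radius of 3958.8 miles is an assumed mean value. The count excludes the point itself.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def count_within(points, radius_miles):
    """For each (lat, lon) point, count the OTHER points within radius_miles."""
    counts = []
    for i, (lat1, lon1) in enumerate(points):
        n = 0
        for j, (lat2, lon2) in enumerate(points):
            # skip the point itself, then compare the pairwise distance
            if i != j and haversine_miles(lat1, lon1, lat2, lon2) <= radius_miles:
                n += 1
        counts.append(n)
    return counts


# Same coordinates as the sample dataframe, in id order a..e
points = [
    (38.470628, -121.404586),
    (37.994155, -121.802341),
    (38.66937, -121.295325),
    (34.119578, -117.413791),
    (36.292307, -119.804074),
]
print(count_within(points, 2))
```

This is O(n^2) in the number of points, which is fine for small datasets but becomes slow for large ones.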
import numpy as np
import pandas as pd
from sklearn.neighbors import BallTree
Sample Data
df = pd.DataFrame({'id':list('abcde'),'latitude': [38.470628, 37.994155, 38.66937, 34.119578, 36.292307],'longitude': [-121.404586, -121.802341, -121.295325, -117.413791, -119.804074]}) #sample
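Extract the latitude and longitude and convert them to radians. The haversine metric works on a unit sphere, so divide the search distance by the Earth's radius to get the query radius. Note that the code below uses 50 miles rather than the 2 miles from the question; change distance_in_miles as needed.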
coords = df[["latitude","longitude"]]
distance_in_miles = 50
earth_radius_in_miles = 3958.8
# express the search distance as an angle (radians) on the unit sphere
radius = distance_in_miles / earth_radius_in_miles
# metric='haversine' expects [latitude, longitude] pairs in radians
tree = BallTree(np.radians(coords), leaf_size=10, metric='haversine')
tree.query_radius(np.radians(coords), r=radius, count_only=True)
Which gives
array([3, 2, 2, 1, 1])
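As a sanity check on the radius conversion, the same counts can be computed by brute force with a vectorized haversine in NumPy. This is only a sketch for small inputs (it builds a full n x n distance matrix), and the variable names are chosen for this illustration. Like the BallTree result, each count includes the point itself.

```python
import numpy as np

# Sample coordinates (ids a..e), converted to radians
lat = np.radians([38.470628, 37.994155, 38.66937, 34.119578, 36.292307])
lon = np.radians([-121.404586, -121.802341, -121.295325, -117.413791, -119.804074])

# Broadcast to n x n grids of pairwise differences
dlat = lat[:, None] - lat[None, :]
dlon = lon[:, None] - lon[None, :]

# Haversine formula, evaluated for every pair at once
a = (np.sin(dlat / 2) ** 2
     + np.cos(lat[:, None]) * np.cos(lat[None, :]) * np.sin(dlon / 2) ** 2)
dist_miles = 2 * 3958.8 * np.arcsin(np.sqrt(a))

# Count points within 50 miles of each point (the point itself is included)
counts = (dist_miles <= 50).sum(axis=1)
print(counts)
```

Subtract 1 from each count if the point itself should not be counted.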
If you want to return the indices and use them for aggregates, one way is the following:
df = pd.DataFrame({'id':list('abcde'),'latitude': [38.470628, 37.994155, 38.66937, 34.119578, 36.292307],'longitude': [-121.404586, -121.802341, -121.295325, -117.413791, -119.804074], 'saleprice_usd_per_sqf': [200, 300, 700, 350, 50]})
coords = df[["latitude","longitude"]]
distance_in_miles = 50
earth_radius_in_miles = 3958.8
radius = distance_in_miles / earth_radius_in_miles
Note that we request the indices here (indici), not only the count:
tree = BallTree( np.radians(coords), leaf_size=10, metric='haversine')
indici = tree.query_radius( np.radians(coords), r=radius, count_only=False)
Then use a list comprehension to get, for instance, the median value within each radius. Be aware that the point itself is always included in its own radius.
[np.median(df.saleprice_usd_per_sqf.values[idx]) for idx in indici]
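If the point itself should not contribute to the aggregate, one option is to filter its own index out of each neighbour list before taking the median. The sketch below assumes that the dataframe has a default 0..n-1 index, so that row positions match the indices returned by query_radius. The column name neighbour_median and the choice of NaN for points without neighbours are made up for this illustration.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import BallTree

df = pd.DataFrame({
    'id': list('abcde'),
    'latitude': [38.470628, 37.994155, 38.66937, 34.119578, 36.292307],
    'longitude': [-121.404586, -121.802341, -121.295325, -117.413791, -119.804074],
    'saleprice_usd_per_sqf': [200, 300, 700, 350, 50],
})

coords = np.radians(df[["latitude", "longitude"]].to_numpy())
radius = 50 / 3958.8  # 50 miles expressed in radians on the unit sphere

tree = BallTree(coords, metric='haversine')
indici = tree.query_radius(coords, r=radius)  # one index array per point

prices = df["saleprice_usd_per_sqf"].to_numpy()
medians_excl_self = []
for i, idx in enumerate(indici):
    others = idx[idx != i]  # drop the point's own index
    # no neighbours left: use NaN instead of taking the median of nothing
    medians_excl_self.append(float(np.median(prices[others])) if others.size else np.nan)

df["neighbour_median"] = medians_excl_self
print(df[["id", "neighbour_median"]])
```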