How to perform clustering of lat/lon data points

My preferred algorithm is DBSCAN in scikit-learn. However, I am not sure whether (and how) to incorporate the radius in addition to the latitude and longitude that I already use. My second question is how to compute the centers of the new clusters. Any ideas?

DBSCAN uses an epsilon radius query. This is where you use latitude and longitude.
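As a rough illustration of what such a radius query looks like on latitude/longitude data, here is a minimal sketch in plain Python. The function names `haversine_km` and `region_query`, the sample coordinates, the 25 km radius, and the Earth radius constant are all assumptions made for this example, not something from the question.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above 1 for near-antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def region_query(points, index, eps_km):
    """Indices of all points within eps_km of points[index], the point itself included."""
    center = points[index]
    return [i for i, p in enumerate(points) if haversine_km(center, p) <= eps_km]


# Hypothetical sample data: (lat, lon) in degrees.
points = [
    (37.7749, -122.4194),  # San Francisco
    (37.8044, -122.2712),  # Oakland
    (34.0522, -118.2437),  # Los Angeles
]
neighbors = region_query(points, 0, eps_km=25.0)
print(neighbors)
```

This linear scan is only meant to show the idea; a real implementation would use a spatial index for the neighbor search.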

I don't know whether scikit-learn allows you to use arbitrary distances, though. I've seen a blog post on using OPTICS (the successor of DBSCAN) to cluster 23 million tweets by latitude and longitude, but it used ELKI, not scikit-learn.
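For what it's worth, scikit-learn's `DBSCAN` does take a `metric` argument, and `"haversine"` is one of the metrics its ball tree supports. The sketch below assumes that setup; the sample coordinates, the 25 km radius, and `min_samples=3` are made-up values for illustration. The haversine metric expects `[lat, lon]` in radians, so `eps` has to be given in radians as well (kilometres divided by the Earth radius).

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius

# Hypothetical sample data: rows of [lat, lon] in degrees.
coords_deg = np.array([
    [37.7749, -122.4194],  # San Francisco area
    [37.8044, -122.2712],
    [37.7790, -122.4100],
    [34.0522, -118.2437],  # Los Angeles area
    [34.0600, -118.2500],
    [34.0450, -118.2300],
    [47.6062, -122.3321],  # an isolated point
])

eps_km = 25.0  # assumed neighborhood radius
db = DBSCAN(
    eps=eps_km / EARTH_RADIUS_KM,  # radius converted from km to radians
    min_samples=3,
    algorithm="ball_tree",
    metric="haversine",
)
labels = db.fit_predict(np.radians(coords_deg))
print(labels)  # noise points get the label -1
```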

DBSCAN doesn't use centroids, so you don't need to compute them on the sphere at all. In fact, centroids do not make sense for DBSCAN. They may lie outside the cluster if it is not convex, and DBSCAN can find non-convex clusters. Consider a city with a lake in the center: the centroid may be right in the lake. Or a city on a bay: the centroid will then be inside the bay. The centroid of the Bay Area (San Francisco, Oakland, ...) is probably near Treasure Island, etc.
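The lake example can be sketched with a toy cluster. The points below are hypothetical planar coordinates (not latitude/longitude) placed on a ring, standing in for a city built around a lake:

```python
import math

# Hypothetical cluster: 12 points evenly spaced on a unit ring around a "lake".
ring = [
    (math.cos(2 * math.pi * k / 12), math.sin(2 * math.pi * k / 12))
    for k in range(12)
]

# Centroid as the plain coordinate mean.
cx = sum(x for x, _ in ring) / len(ring)
cy = sum(y for _, y in ring) / len(ring)

# Distance from the centroid to the closest cluster member.
nearest = min(math.hypot(x - cx, y - cy) for x, y in ring)
print((cx, cy), nearest)
```

The centroid lands in the middle of the ring, a full radius away from every member of the cluster, which is the "in the lake" situation.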

One way to calculate a centroid is to sum the longitudes of all points in a cluster and take the mean; this gives you a rough longitude for the centroid. Do the same for latitude. Here is a good example that seemed fair from my point of view.
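A minimal sketch of that averaging, assuming the points are `(lat, lon)` tuples in degrees and the cluster labels come from DBSCAN with `-1` marking noise. The function names and the sample data are made up for this example. The second function is not part of the answer above: it is an alternative that averages unit vectors instead, added because a plain mean of longitudes gives misleading results for clusters that straddle the ±180° meridian.

```python
import math
from collections import defaultdict


def mean_latlon_centroids(points, labels):
    """Average lat and lon per cluster label; noise (label -1) is skipped."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        if label != -1:
            groups[label].append(point)
    return {
        label: (
            sum(lat for lat, _ in members) / len(members),
            sum(lon for _, lon in members) / len(members),
        )
        for label, members in groups.items()
    }


def spherical_centroid(members):
    """Direction of the summed unit vectors, converted back to (lat, lon) in degrees."""
    x = y = z = 0.0
    for lat, lon in members:
        la, lo = math.radians(lat), math.radians(lon)
        x += math.cos(la) * math.cos(lo)
        y += math.cos(la) * math.sin(lo)
        z += math.sin(la)
    return (
        math.degrees(math.atan2(z, math.hypot(x, y))),
        math.degrees(math.atan2(y, x)),
    )


# Hypothetical sample data.
points = [(10.0, 20.0), (12.0, 22.0), (50.0, -100.0)]
labels = [0, 0, -1]
centroids = mean_latlon_centroids(points, labels)
print(centroids)

# Two points on either side of the antimeridian: the plain mean longitude
# would be 0, on the opposite side of the globe.
wrapped = spherical_centroid([(0.0, 179.0), (0.0, -179.0)])
print(wrapped)
```

For small clusters away from the poles and the antimeridian, the two approaches give nearly the same result.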
