Cluster points in PostGIS

I'm building an application that pulls lat/long values from a database and plots them on a Google Map. There could be thousands of data points, so I "cluster" points that are close to each other so the user is not overwhelmed with icons. At the moment I perform this clustering in the application, with a simple algorithm like this:

  1. Get an array of all points
  2. Pop the first point off the array
  3. Compare the first point to all other points in the array, looking for ones that fall within x distance
  4. Create a cluster with the original point and the close points
  5. Remove the close points from the array
  6. Repeat

Now I realize this is inefficient, which is why I have been looking into GIS systems. I have set up PostGIS and have my lats and longs stored in a POINT geometry object.

Can someone get me started or point me to some resources on a simple implementation of this clustering algorithm in PostGIS?

I ended up using a combination of snaptogrid and avg. I realize there are algorithms out there (e.g. kmeans, as Denis suggested) that will give me better clusters, but for what I'm doing this is fast and accurate enough.
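For anyone wondering what that combination might look like, here is a minimal sketch, not the exact query I ran. It assumes a hypothetical table points(id, geom) where geom is a POINT geometry with longitude as X and latitude as Y, and it uses an arbitrary cell size of 0.01 degrees. ST_SnapToGrid moves every point onto the nearest grid node, so grouping by the snapped geometry puts all points from one grid cell in the same bucket, and AVG then gives one averaged marker position per bucket.

```sql
-- Hypothetical schema: points(id, geom), where geom is a POINT geometry
-- holding longitude as X and latitude as Y. Names are assumptions.
SELECT
  COUNT(*)        AS point_count,  -- how many points fell into this grid cell
  AVG(ST_X(geom)) AS lon,          -- mean longitude of the cell's points
  AVG(ST_Y(geom)) AS lat           -- mean latitude of the cell's points
FROM points
GROUP BY ST_SnapToGrid(geom, 0.01);  -- 0.01 degrees is an arbitrary cell size
```

A larger cell size gives fewer, coarser clusters, so it could be tied to the map's zoom level. The trade-off is that two points sitting close together on opposite sides of a cell boundary end up in different clusters.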

An example of clustering lonlat points (of st_point type) with PostGIS. The result set will contain (id, cid) pairs, where cid is the cluster ID. The number of clusters is the argument passed to ST_ClusterKMeans.

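WITH sparse_places AS (
  SELECT
    lonlat,
    id,
    COUNT(*) OVER () AS count  -- total number of rows, repeated on every row
  FROM places
)
SELECT
  sparse_places.id,
  -- Ask for at most 10 clusters, but never more than there are rows.
  ST_ClusterKMeans(lonlat::geometry, LEAST(count::integer, 10)) OVER () AS cid
FROM sparse_places;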

We need the Common Table Expression with a COUNT window function to make sure the number of clusters passed to ST_ClusterKMeans never exceeds the number of input rows: LEAST(count, 10) caps the request at 10 clusters, or at the row count when there are fewer than 10 rows.

If it's enough to have stuff clustered in your browser, you could easily make use of OpenLayers' clustering capabilities. There are 3 examples that show clustering.

I've used it with a PostGIS database before, and as long as you don't have ridiculous amounts of data, it works pretty smoothly.
