
Clustering geo data in MongoDB

We have a MongoDB database with more than 200K rows, each containing a point location (lat, lng). We would like to create a query that takes a geo point and a radius and returns a list of clusters. Each cluster is basically an aggregation of locations that are near each other.

First question: is it possible for MongoDB to automatically create and maintain these clusters for us? If yes, how can we query MongoDB to return clusters (not the actual data points) for a specific geo-location? Each returned cluster would have a position and the number of actual data points (geo-tagged rows) it contains. Basically, we want it to return the equivalent of the output of a k-means clustering algorithm.

We've created a MongoDB geoHaystack index that seems to cluster rows, but we are not sure how to use it to achieve the above query:

db.locations.createIndex( { 'position' : "geoHaystack", type : 1 } , { bucketSize : 1 })

Alternatively, we could run a clustering algorithm such as https://github.com/spember/geo-cluster on the fly to generate these clusters, but I'm assuming this would be a very slow process.

Any recommendations on how best to implement such a query?

In MongoDB, a geoHaystack index serves a different purpose: it is a special index optimized to return results over small areas. I don't think it can be used here.

So I think you can retrieve all the points and cluster them using k-means. That should be fast. After that, you can save the clusters as separate entities (e.g. Polygon) and use them anywhere you need.
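
As a rough illustration of the k-means step, here is a minimal sketch in plain JavaScript. It is not taken from the question or from any library: the function name kmeans, the [lng, lat] point layout and the seeding with the first k points are all assumptions made for the example, and it uses plain Euclidean distance on degrees, which is only a reasonable approximation over a small area.

```javascript
// Minimal k-means (Lloyd's algorithm) sketch over [lng, lat] pairs.
// Assumptions (not from the original post): points were already fetched,
// e.g. with db.locations.find(), and mapped to [lng, lat] arrays;
// Euclidean distance on degrees is good enough for the area in question.
function kmeans(points, k, maxIterations = 100) {
  if (k < 1 || k > points.length) {
    throw new Error("k must be between 1 and points.length");
  }

  // Deterministic seeding for the sketch: the first k points.
  // Real code would usually pick random or k-means++ seeds.
  let centroids = points.slice(0, k).map((p) => [p[0], p[1]]);
  const assignments = new Array(points.length).fill(-1);

  for (let iter = 0; iter < maxIterations; iter++) {
    let changed = false;

    // Assignment step: attach each point to its nearest centroid.
    points.forEach((p, i) => {
      let best = 0;
      let bestDist = Infinity;
      centroids.forEach((c, j) => {
        const d = (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2;
        if (d < bestDist) {
          bestDist = d;
          best = j;
        }
      });
      if (assignments[i] !== best) {
        assignments[i] = best;
        changed = true;
      }
    });

    // Converged: no point switched cluster.
    if (!changed) break;

    // Update step: move each centroid to the mean of its points.
    const sums = centroids.map(() => [0, 0, 0]);
    points.forEach((p, i) => {
      const s = sums[assignments[i]];
      s[0] += p[0];
      s[1] += p[1];
      s[2] += 1;
    });
    centroids = sums.map((s, j) =>
      s[2] > 0 ? [s[0] / s[2], s[1] / s[2]] : centroids[j]
    );
  }

  // Return what the question asks for: a position and a point count.
  const counts = new Array(k).fill(0);
  assignments.forEach((a) => {
    counts[a] += 1;
  });
  return centroids.map((c, j) => ({ position: c, count: counts[j] }));
}

// Usage with a handful of made-up points.
const sampleClusters = kmeans(
  [[0, 0], [10, 10], [0, 1], [10, 11], [1, 0]],
  2
);
console.log(sampleClusters);
```

Each returned object carries a cluster position and the number of points in it, so the result could be stored in a separate collection and served instead of the raw points. Note that k has to be chosen up front, and the outcome depends on the seeding.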
