简体   繁体   中英

How to group objects based on longitude/latitude proximity using laravel/php

I have a group of users. The user count could be 50 or could be 2000. Each should have a long/lat that I have retrieved from Google Geo api.

I need to query them all, and group them by proximity and a certain count. Say the count is 12 and I have 120 users in the group. I want to group people by how close they are (long/lat) to other people. So that I wind up with 10 groups of people who are close in proximity.

I currently have the google geo coding api setup and would prefer to use that.

TIA.

-- Update I have been googling about this for awhile and it appears that I am looking for a spatial query that returns groups by proximity.

... it appears that I am looking for a spatial query that returns groups by proximity. ...

You could use hdbscan. Your groups are actually clusters in hdbscan wording. You would need to work with min_cluster_size and min_samples to get your groups right.

https://hdbscan.readthedocs.io/en/latest/parameter_selection.html

https://hdbscan.readthedocs.io/en/latest/

It appears that hdbscan runs under Python.

Here are two links on how to call Python from PHP: Calling Python in PHP , Running a Python script from PHP

Here is some more information on which clustering algorithm to choose: http://nbviewer.jupyter.org/github/scikit-learn-contrib/hdbscan/blob/master/notebooks/Comparing%20Clustering%20Algorithms.ipynb

http://scikit-learn.org/stable/modules/clustering.html#clustering

Keep in mind that this problem grows exponentially with every user you add, as the amount of distance calculations is linked to the square of the number of users (it's actually N*(N-1) distances... so a 2000 user base would mean almost 4 million distance calculations on every pass. Just keep that in mind when sizing the resources you need

Are you looking to group them based on straight-line (actually great circle) distance or based on walking/driving distance?

If the former, the great circle distance can be approximated with simple math if you're able to tolerate a small margin of error and wish to assume the earth is a sphere. From GCMAP.com:

Earth's hypothetical shape is called the geoid and is approximated by an ellipsoid or an oblate sphereoid. A simpler model is to use a sphere, which is pretty close and makes the math MUCH easier. Assuming a sphere of radius 6371.2 km, convert longitude and latitude to radians (multiply by pi/180) and then use the following formula:

theta = lon2 - lon1
dist = acos(sin(lat1) × sin(lat2) + cos(lat1) × cos(lat2) × cos(theta))
if (dist < 0) dist = dist + pi
dist = dist × 6371.2

The resulting distance is in kilometers.

Now, if you need precise calculations and are willing to spend the CPU cycles needed for much complex math, you can use Vincenty's Formulae, which uses the WGS-84 reference ellipsoid model of the earth which is used for navigation, mapping and whatnot. More info HERE

As to the algorithm itself, you need to build a to-from matrix with the result of each calculation. Each row and column would represent each node. Two simplifications you may consider:

  1. Distance does not depend on direction of travel, so $dist[n][m] == $dist[m][n] (no need to calculate the whole matrix, just half of it)
  2. Distance from a node to itself is always 0, so no need to calculate it, but since you're intending to group by proximity, to avoid a user being grouped with itself, you may want to always force $dist[m][m] to an arbitrarily defined and abnormally large constant ( $dist[m][m] = 22000 (miles) for instance. Will work as long as all your users are on the planet)

After making all the calculations, use an array sorting method to find the X closest nodes to each node and there you have it (you may or may not want to prevent a user being grouped on more than one group, but that's just business logic)

Actual code would be a little too much to provide at this time without seeing some of your progress first, but this is basically what you need to do algoritmically.

Use GeoHash algorithm[1]. There is a PHP implementation[2]. You may pre-calculate geohashes with different precision, store them in SQL database alongside lat-lon values and query using native GROUP BY.

  1. https://en.wikipedia.org/wiki/Geohash
  2. https://github.com/lvht/geohash

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM