简体   繁体   中英

Difference between distm function or the distVincentyEllipsoid in R

Could you fully explain the big difference in using the distm function or the distVincentyEllipsoid function to calculate the distance of geodesic coordinates in R?

I noticed that using distm for this calculation, it takes much longer. Could you please explain to me beyond the difference, why does this happen?

Thank you!

Following on from your previous question here: Distance calculation optimization in R

The speed relates to the level of computation required to produce the returned object, not necessarily the difference between the computation of distances (I am not sure what great circle computation the distm() function uses as it's default). Indeed the geosphere:: documentation here:https://cran.r-project.org/web/packages/geosphere/geosphere.pdf suggests that distVincentyEllipsoid() calculation is "very accurate" but "computationally more intensive" than other great circle methods while this would make you suspect a slower computation, it is because of the way I have structured the code in my answer to return a vector of distances between each row (not a matrix of distances between each and every point).

Conversely, your distm() calculation in your original code returns a matrix of multiple vectors between each and every point. For your problem, this is not necessary so long as the data is ordered, that is why I have done so. Additionally, the use of hierarchical clustering to cluster the points based on these distances into 3 (your defined number) clusters is also not necessary as we can use the percentile of distances between each point values to do the same. Again the speed benefit relates to computing the clusters on a single vector rather than a matrix.

Please note, I am a data analyst with a background in accounting/finance and not a GIS specialist by any means. That being said my use of the distVincentyEllipsoid() function comes from my general understanding that this returns a pretty accurate estimation of great circle distances as a vector (as a opposed to a matrix). Moreover, having used this in the past to optimise logistics operations for pricing purposes, I can attest to the fact these computations have been tested in the market and found to be sound.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM