
Calculating the distance between points in R

I looked through the existing questions that deal with coordinates but couldn't find anything that helps with my problem.

I have a dataset that contains ID, Speed, Time, and lists of Latitude and Longitude values (the dataset can be found at this link): https://drive.google.com/file/d/1MJUvM5WEhua7Rt0lufCyugBdGSKaHMGZ/view?usp=sharing

I want to measure the distance between every pair of Latitude/Longitude points. For example, Latitude has: x1, x2, x3, ..., x1000

Longitude has: y1, y2, y3, ..., y100

I want to measure the distance from (x1, y1) to all the other points, from (x2, y2) to all the other points, and so on.

The reason I'm doing this is to find out which points are close to each other and to assign an index to each location based on distance. For example, if (x1, y1) is close to (x4, y4), then (x1, y1) gets the index A and (x4, y4) gets labeled B. In other words, I want to sort the points in order of distance.

I tried the gDistance function, but it showed this error message: "package 'gDistance' is not available (for R version 3.4.3)"

And if I change the R version to 3.3, library(rgeos) won't work. Any suggestions?

Here's what I tried:

#requiring necessary packages:
library(sp)  # vector data
library(rgeos)  # geometry ops

#Read the data and transform them to spatial objects
d <- read.csv("ReadyData.csv")
sp.ReadyData <- d
coordinates(sp.ReadyData) <- ~Longitude + Latitude
d <- gDistance(sp.ReadyData, byid= TRUE)

Here's an update on my attempt. I created a spatial object and a spatial data frame as follows:

#Create spatial object:
lonlat <- cbind(spatial$Longitude, spatial$Latitude)
#Create a SpatialPoints object:
library(sp) 
pts <- SpatialPoints(lonlat)
crdref <- CRS('+proj=longlat +datum=WGS84')
pts <- SpatialPoints(lonlat, proj4string=crdref)
# make spatial data frame
ptsdf <- SpatialPointsDataFrame(pts, data=spatial)

Now I'm trying to measure the distance for longitude/latitude coordinates. I tried the dist method, but it doesn't seem to work for me, so I tried the pointDistance method:

gdis <- pointDistance(pts, lonlat=TRUE)

It's still not clear to me how this function measures the distance. I need the distances so I can locate the point in the middle and number each point based on its position relative to that middle point.

Assuming you have p1 as a SpatialPoints object of x and p2 as a SpatialPoints object of y, you can get the index of the nearest point like this:

ReadyData$cloDist <- apply(gDistance(p1, p2, byid=TRUE), 1, which.min)

If the same coordinates appear in both lists, you will get the index of the point itself, since the closest place to a point is the point itself. An easy trick to avoid that is to use the second smallest distance as the reference, with a quick function:

f_which.min <- function(vec, idx) sort(vec, index.return = TRUE)$ix[idx]
ReadyData$cloDist2 <- apply(gDistance(p1, p2, byid=TRUE), 1, f_which.min,
                            idx = 2)
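
To see why the second smallest distance is the one to pick, here is a minimal base R sketch on a made-up 3 x 3 distance matrix. The values and the name dm are illustrative, not taken from the data, and order(v)[2] is used as an equivalent of sort(v, index.return = TRUE)$ix[2].

```r
# Toy symmetric distance matrix for three points (made-up values)
dm <- matrix(c(0, 5, 2,
               5, 0, 4,
               2, 4, 0), nrow = 3, byrow = TRUE)

# which.min picks the zero on the diagonal, so each point finds itself
self_idx <- apply(dm, 1, which.min)

# the second smallest entry in each row is the nearest *other* point
nearest_idx <- apply(dm, 1, function(v) order(v)[2])
```

Note that when several points share identical coordinates, a row holds more than one zero, so the second entry can still be the point itself. Setting the diagonal to NA before calling which.min avoids that case.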

You can use raster::pointDistance or geosphere::distm, among other functions.

Part of your example data (please avoid files in your questions):

d <- read.table(sep=",", text='
"OBU ID","Time Received","Speed","Latitude","Longitude"
"1",20,1479171686325,0,38.929596,-77.2478813
"2",20,1479171686341,0,38.929596,-77.2478813
"3",20,1479171698485,1.5,38.9295887,-77.2478945
"4",20,1479171704373,1,38.9295048,-77.247922
"5",20,1479171710373,0,38.9294865,-77.2479055
"6",20,1479171710373,0,38.9294865,-77.2479055
"7",20,1479171710373,0,38.9294865,-77.2479055
"8",20,1479171716373,2,38.9294773,-77.2478712
"9",20,1479171716374,2,38.9294773,-77.2478712
"10",20,1479171722373,1.32,38.9294773,-77.2477417')

Solution:

library(raster)
m <- pointDistance(d[, c("Longitude", "Latitude")], lonlat=TRUE)

To get the nearest point to each point, you can do

mm <- as.matrix(as.dist(m))
diag(mm) <- NA
i <- apply(mm, 1, which.min)

The point pairs

p <- cbind(1:nrow(mm), i)    

To get the distances, you can do:

mm[p] 

Or do this:

apply(mm, 1, min, na.rm=TRUE)
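
As a small illustration of why both forms give the same result, here is a base R sketch on a made-up matrix with the diagonal already masked. The values are illustrative only. Indexing a matrix with a two-column matrix of (row, column) pairs returns one element per pair.

```r
# Toy distance matrix with the diagonal set to NA (made-up values)
mm <- matrix(c(NA, 5, 2,
               5, NA, 4,
               2, 4, NA), nrow = 3, byrow = TRUE)

i <- apply(mm, 1, which.min)       # nearest neighbour of each row
p <- cbind(seq_len(nrow(mm)), i)   # (row, column) pairs

d_pairs <- mm[p]                            # distance for each pair
d_min   <- apply(mm, 1, min, na.rm = TRUE)  # row-wise minimum
```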

Note that rgeos::gDistance is for planar data, not for longitude/latitude data.
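
To illustrate the difference, here is a rough base R sketch of a great-circle (haversine) distance next to a naive planar distance on the raw degrees. The function name haversine and the spherical radius of 6371000 m are assumptions made for this sketch; packages such as geosphere and raster may use an ellipsoid instead, so their results can differ slightly.

```r
# Great-circle (haversine) distance in metres between lon/lat points.
# r is an assumed mean Earth radius for a spherical model.
haversine <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Two points from the example data that differ only in longitude
d_m <- haversine(-77.2478712, 38.9294773, -77.2477417, 38.9294773)

# Treating the coordinates as planar gives a value in degrees, not metres
d_deg <- sqrt((-77.2478712 - -77.2477417)^2 + (38.9294773 - 38.9294773)^2)
```

Here d_m is a length in metres, while d_deg is just a difference in degrees, which is why a planar function is not appropriate for longitude/latitude data.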

Here is a similar question/answer with some illustration.

Your data set is too large to make a single distance matrix. You can process your data in chunks to deal with that. Here I am showing that with a rather small chunk size of 4 rows. Make this number much bigger to speed up processing.

library(geosphere)
chunk <- 4  # rows
start <- seq(1, nrow(d), chunk)
# last row of each chunk; subtract 1 so that chunks do not overlap
end <- c(start[-1] - 1, nrow(d))
x <- d[, c("Longitude", "Latitude")]

r <- list()
for (i in seq_along(start)) {
    y <- x[start[i]:end[i], , drop=FALSE]
    # distances from the points in this chunk to all points
    m <- distm(y, x)
    # mask the distance of each point to itself
    m[cbind(1:nrow(m), start[i]:end[i])] <- NA
    r[[i]] <- apply(m, 1, which.min)
}
r <- unlist(r)
r
# [1] 2 1 1 5 6 5 5 9 8 8
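
The chunk boundaries are the part that is easiest to get wrong, so here is a small base R sketch of one way to compute them so that every row is covered exactly once. The variable names n and covered are made up for this sketch, and pmin is just one possible way to clip the last chunk.

```r
n <- 10      # number of rows, as in the example data
chunk <- 4   # rows per chunk

start <- seq(1, n, by = chunk)        # first row of each chunk
end   <- pmin(start + chunk - 1, n)   # last row of each chunk, clipped to n

# all row indices visited across the chunks
covered <- unlist(Map(seq, start, end))
```

With these boundaries the chunks are rows 1 to 4, 5 to 8, and 9 to 10, so the number of results equals the number of rows.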

So for your data:

d <- read.csv("ReadyData.csv")
chunk <- 100  # rows
# etc

This will take a long time.

An alternative approach:

library(spdep)
x <- as.matrix(d[, c("Longitude", "Latitude")])
k <- as.vector(knearneigh(x, k=1, longlat=TRUE)$nn)
