
Euclidean Distances between rows of two data frames in R

Calculating Euclidean distances in R is easy. A good example can be found HERE. The vectorised form, which pairs row i of one data frame with row i of the other, is:

sqrt((known_data[, 1] - unknown_data[, 1])^2 + (known_data[, 2] - unknown_data[, 2])^2)
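For illustration, here is the formula above applied to two tiny data frames. The values are made up for this sketch and are not from the original question; both data frames are assumed to have the same number of rows and two numeric columns.

```r
# Hypothetical example data: two rows, two numeric columns each
known_data   <- data.frame(x = c(0, 1), y = c(0, 1))
unknown_data <- data.frame(x = c(3, 1), y = c(4, 2))

# Row-by-row distance: row 1 with row 1, row 2 with row 2
paired <- sqrt((known_data[, 1] - unknown_data[, 1])^2 +
               (known_data[, 2] - unknown_data[, 2])^2)
paired
# 5 1
```

This gives one distance per row pair, not the full all-against-all matrix the question asks for.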

What would be the fastest, most efficient way to get the Euclidean distance from each row of one data frame to every row of another data frame? Is there a particular function from the apply() family that suits this? Thanks!

You could try combining outer() with dist(), like below:

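# Compute the distance for every (known row, unknown row) index pair.
# Vectorize() is needed because outer() passes whole index vectors to FUN.
outer(
  seq_len(nrow(known_data)),
  seq_len(nrow(unknown_data)),
  FUN = Vectorize(function(x, y) dist(rbind(known_data[x, ], unknown_data[y, ])))
)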
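To make the call above concrete, here is a minimal sketch with made-up data. The two data frames and the helper name pair_dist are assumptions for illustration only. Keep in mind that this approach calls dist() once per pair of rows, so it may be slow for large inputs.

```r
# Hypothetical example data: 2 known rows, 3 unknown rows
known_data   <- data.frame(x = c(0, 1), y = c(0, 1))
unknown_data <- data.frame(x = c(3, 1, 0), y = c(4, 2, 0))

# Distance between known row x and unknown row y (helper name is illustrative)
pair_dist <- function(x, y) dist(rbind(known_data[x, ], unknown_data[y, ]))

d <- outer(
  seq_len(nrow(known_data)),
  seq_len(nrow(unknown_data)),
  FUN = Vectorize(pair_dist)
)

dim(d)   # rows correspond to known_data, columns to unknown_data
d[1, 1]  # distance from (0, 0) to (3, 4)
```

The result is a matrix with one row per row of known_data and one column per row of unknown_data.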

I would use the dist() function (which is implemented in compiled code and is very efficient) on the two data frames bound together, and then discard the distances you do not need, namely those between rows of the same data frame. Example:

df1 <- iris[1:5, -5]
df2 <- iris[6:10, -5]

all_distances <- dist(rbind(df1, df2))
all_distances <- as.matrix(all_distances)

# blank out the within-group distances (df1 vs df1 and df2 vs df2)
all_distances[1:5, 1:5] <- NA
all_distances[6:10, 6:10] <- NA
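If only the df1-versus-df2 distances are wanted, one option is to index that block out of the full matrix instead of blanking the rest. The sketch below does that and, as a cross-check, recomputes the same block without dist(), using the identity |a - b|^2 = |a|^2 + |b|^2 - 2 a.b. The second method is a swapped-in alternative, not part of the answer above, and the names cross and cross_direct are illustrative.

```r
df1 <- iris[1:5, -5]
df2 <- iris[6:10, -5]

# Full 10 x 10 distance matrix, then keep only rows of df1 vs rows of df2
all_distances <- as.matrix(dist(rbind(df1, df2)))
cross <- all_distances[1:5, 6:10]

# Alternative sketch: compute only the cross block with matrix algebra
a <- as.matrix(df1)
b <- as.matrix(df2)
sq <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
cross_direct <- sqrt(pmax(sq, 0))  # clamp tiny negatives from rounding

all.equal(cross, cross_direct, check.attributes = FALSE)
```

The matrix-algebra version avoids computing the within-group distances at all, which may matter when both data frames are large, at the cost of some rounding error for very close points.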
