I recently asked, and then answered, my own question after finding out it was a duplicate here:
There, I used the eurodist dataset to find the closest neighbouring city Neigh of a city City based on the mean distance. I did this using split() along with lapply().
library(data.table) # load package for transpose()
data(eurodist) # load eurodist data
labs <- labels(eurodist) # get city names
splt <- split(eurodist, labs) # split by city name
splt_mean <- lapply(splt, mean) # calculate mean for each city
x <- as.data.frame(splt_mean) # convert to data frame
x <- transpose(x) # transpose dataframe
colnames(x) <- "Mean" # name columns
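rownames(x) <- labs # name rows
d <- data.frame(`diag<-`(as.matrix(dist(x$Mean)), Inf)) # pairwise differences in mean; Inf on the diagonal rules out self-matches
ids <- unlist(Map(which.min, d)) # for each city, index of the city with the closest mean
Neigh <- x$Mean[ids] # placeholder column, overwritten with city names below
x <- data.frame(labs, x$Mean, Neigh) # assemble city, mean and neighbour columns
names(x)[1] <- "City"
names(x)[2] <- "Mean"
x[, 3] <- x$City[ids] # replace the placeholder with the neighbour's name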
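As a sanity check of the neighbour-finding step, here is a minimal sketch on made-up values. The vector means and the city names A to D are hypothetical, not eurodist results. It uses apply() over rows rather than Map() over data frame columns; because the distance matrix is symmetric, both should give the same indices.

```r
# Made-up mean distances for four hypothetical cities
means <- c(A = 10, B = 12, C = 30, D = 10.5)

d <- as.matrix(dist(means))    # pairwise absolute differences between the means
diag(d) <- Inf                 # a city must not be picked as its own neighbour
ids <- apply(d, 1, which.min)  # per row, position of the smallest difference
nearest <- names(means)[ids]   # translate positions back into city names
nearest
# [1] "D" "D" "B" "A"
```

Setting the diagonal to Inf is what keeps which.min() from returning the city itself, whose distance to itself is always 0.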
I've successfully applied the solution to my own data and now have one more step which I'm unable to figure out.
I'd like to order() splt so that corresponding row elements in City and Neigh occur together, City first followed by Neigh. For instance, calling the new list splt_sort, I need:
splt_sort
$Athens
[1] 3313 1326 966 330 1209 1418 328 2198 2250 618
$Rome
[1] 3927 204 747 789 1497 158 550 1178 2097 2707
...
Any thoughts?
I'll provide an answer to my own question, but @akrun deserves credit.
Their solution was a single line of R code:
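splt2 <- splt[c(t(x[, c("City", "Neigh")]))]
where x is first subset to its City and Neigh columns, then transposed with t() so that each column of the resulting matrix holds one City/Neigh pair, and finally flattened with c() into a single character vector. Because c() reads a matrix column by column, the vector alternates City, Neigh, City, Neigh, and so on, and indexing splt by those names returns the list elements in that order.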
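To see why the transpose matters, here is a minimal sketch on toy data. The objects splt_toy and x_toy and their values are made up for illustration and stand in for splt and x; only the indexing pattern is taken from the one-liner.

```r
# Toy stand-ins for splt and x (made-up values, not the eurodist results)
splt_toy <- list(A = 1:3, B = 4:6, C = 7:9)
x_toy <- data.frame(City  = c("A", "B", "C"),
                    Neigh = c("C", "A", "B"),
                    stringsAsFactors = FALSE)

pairs <- t(x_toy[, c("City", "Neigh")])  # 2 x 3 character matrix, one pair per column
ord <- c(pairs)                          # flattened column by column: City, Neigh, City, ...
ord
# [1] "A" "C" "B" "A" "C" "B"

splt_toy_sort <- splt_toy[ord]           # index the list by name; repeated names are allowed
names(splt_toy_sort)
# [1] "A" "C" "B" "A" "C" "B"
```

Without t(), c() would return all City values followed by all Neigh values, so the pairs would not sit next to each other. Note that a city appears more than once in the result whenever it is also some other city's neighbour.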