简体   繁体   中英

Most efficient way to sort two vectors in lockstep in R?

What's the most efficient way to sort two vectors in lockstep in R? The first vector should be sorted in ascending order and the second should be reordered in lockstep such that elements with corresponding indices before the sort still have corresponding indices after the sort. For example:

foo <- c(1,3,2, 5,4)
bar <- c(2,6,4,10,8)
sort2(foo, bar)

# foo == c(1,2,3,4, 5)
# bar == c(2,4,6,8,10)

Note: Efficiency is an absolute must here as I am trying to use this as the basis for creating an O(N log N) implementation of Kendall's Tau to submit as a patch. I'd like to avoid writing my own special function in C to do this, but would be willing to if it can't be done efficiently within R.

Not sure I understand but is this use of order() what you want:

R> foo <- c(1,3,2, 5,4)
R> bar <- c(2,6,4,10,8)
R> fooind <- order(foo)   # index of ordered 
R> foo[fooind]
[1] 1 2 3 4 5
R> bar[fooind]
[1]  2  4  6  8 10
R> 

I'm not sure that the accepted answer is correct in cases where X is sorted first, then Y is sorted by the index of (sorted) X, in that if there are duplicate values in X, Y does not always get sorted in classic 'order by x, y' style. For example:

> x <- c(3,2,2,2,1)
> y <- c(5,4,3,2,1)
> xind <- order(x)
> x[xind]
[1] 1 2 2 2 3
> y[xind]
[1] 1 4 3 2 5

Y is ordered by the new order of X, but not in lockstep, as not all of the X indexes changed. A simple function to do as the OP required:

> sort.xy <- function(x,y)
+ {
+ df.xy <- data.frame(x,y)
+ df.xy[ order(df.xy[,1], df.xy[,2]), ]
+ }

In use:

> c(sort.xy(x,y))
$x
[1] 1 2 2 2 3

$y
[1] 1 2 3 4 5

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM