简体   繁体   中英

R: Sort multiple columns by another data.frame?

I'm trying to make sense of how to sort one data.frame based on multiple columns in another. This question does this with vectors . Can someone suggest a way to do the equivalent with data.frames?

Here's some sample data.

x1 <- data.frame(a=1:5, b=letters[1:5], c=rnorm(5))
x2 <- data.frame(a=c(4,4,2), b=c("d", "d", "b"), d=rnorm(3))

So I want to sort x2 by the first two columns of x1. My actual data is much more complicated, but this replicates the idea...

It really depends on what your data really looks like. As it looks right now, you only need one column to sort, and that is easily done by:

x2[order(match(x2[,1],x1[,1])),]

If you need more than one column, this becomes a bit trickier. You will have to specify which one you want to sort first on, and which one second, eg :

x1 <- data.frame(a=rep(1:3,2), b=rep(letters[2:4],each=2), c=rnorm(6))
x2 <- data.frame(a=c(3,3,2), b=c("c", "d", "b"), d=rnorm(3))


x2[order(match(
  paste(x2[,1],x2[,2]),
  paste(x1[,1],x1[,2]))
),]

This sorts on the first column first, and then on the second. You have to keep in mind that you need all combinations in x2 also in x1. T

This can be done precisely using plyr . Joris' answer would work fairly well, but potentially could missort when combining strings:

> paste ("A A","B")
[1] "A A B"
> paste ("A","A B")
[1] "A A B"

You can get an exact answer using join.keys and match :

x1 <- data.frame(a=rep(1:3,2), b=rep(letters[2:4],each=2), c=rnorm(6))
x2 <- data.frame(a=c(3,3,2), b=c("c", "d", "b"), d=rnorm(3))

library(plyr)
keys<-join.keys(x1,x2,c("a","b"))
matches<-match(keys$y,keys$x,nomatch=(keys$n+1))
x2[order(matches),]

This should handle most edge cases, mismatched list sizes, etc. Items without a match in both the index columns are put at the end of the list.

Attach a rank column to the relevant columns of x1:

len <- dim(x1)[1]
x1. <- cbind(x1[,1:2], rank=1:len)

Merge into x2 (this is like a SQL join; see the merge documentation for how to specify what happens if there are ambiguities such as multiple matches or no matches):

x2. <- merge(x2, x1.)

Sort:

x2.[order(x2.[,'rank']),]

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM