Match column and row names to column and values in another data frame

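I have two data frames as follows:

x <- data.frame("Trait1" = c(1, 1, 0, 1), "Trait2" = c(1, NA, 1, 1),
                "Trait3" = c(0, 1, 0, 1), row.names = c("A", "B", "C", "D"))

y <- matrix(c("A", "A", "B", "C", "D", "C"),
            nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("individual1", "individual2"), c("Trait1", "Trait2", "Trait3")))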

Such that:


  Trait1 Trait2 Trait3
A      1      1      0
B      1     NA      1
C      0      1      0
D      1      1      1


            Trait1 Trait2 Trait3
individual1 "A"    "A"    "B"
individual2 "C"    "D"    "C"

I need to match the row names of x with the values in y, and the column names in both data frames to get values for each individual as follows:

            Trait1 Trait2 Trait3
individual1      1      1      1
individual2      0      1      0

Any suggestions would be much appreciated. Thanks.

A possible solution with tidyverse: it is just a matter of joining the two tables on the trait name and the letter code, so the first step is to transform (gather) both data sets into a common long form in which the trait name and the letter are ordinary columns, not column names or row names.


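library(tidyverse)

# Long form of x: v = row name (A-D), k = trait name, w = value
x %>% mutate(v = rownames(.)) %>%
  gather(k, w, -v) -> x1

# Long form of y, joined to x1 on trait (k) and letter (v), then spread back to wide
y %>% as.data.frame(stringsAsFactors = FALSE) %>% mutate(ID = rownames(.)) %>%
  gather(k, v, -ID) %>%
  inner_join(x1, by = c("k", "v")) %>%
  select(-v) %>% spread(k, w)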

#           ID Trait1 Trait2 Trait3
#1 individual1      1      1      1
#2 individual2      0      1      0
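
In current tidyr releases, gather() and spread() are superseded by pivot_longer() and pivot_wider(). The same join can be sketched with those functions as below. The column names value, trait and score are illustrative choices, not names from the answer above, and the data are re-created so the block runs on its own.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Example data as described in the question
x <- data.frame(Trait1 = c(1, 1, 0, 1), Trait2 = c(1, NA, 1, 1),
                Trait3 = c(0, 1, 0, 1), row.names = c("A", "B", "C", "D"))
y <- matrix(c("A", "A", "B", "C", "D", "C"), nrow = 2, byrow = TRUE,
            dimnames = list(c("individual1", "individual2"),
                            c("Trait1", "Trait2", "Trait3")))

# Long form of x: one row per (letter, trait) pair
x_long <- x %>%
  rownames_to_column("value") %>%
  pivot_longer(-value, names_to = "trait", values_to = "score")

# Long form of y, joined on trait and letter, then reshaped back to wide
result <- y %>%
  as.data.frame(stringsAsFactors = FALSE) %>%
  rownames_to_column("ID") %>%
  pivot_longer(-ID, names_to = "trait", values_to = "value") %>%
  inner_join(x_long, by = c("trait", "value")) %>%
  select(-value) %>%
  pivot_wider(names_from = trait, values_from = score)

result
```

The result is a tibble with an ID column followed by one column per trait.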

I don't really like my solution, and I feel there should be a better way to do this, but for the time being this should work (it's not that nice, though).

Here's the code:

t(sapply(1:nrow(y),function(i) sapply(1:ncol(y),function(j) x[match(y[i,],rownames(x))[j],j])))


     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    0    1    0



match(y[i,],rownames(x))

The above matches each column of the i'th row of y to the row names of x. For i=1 the result is:

[1] 1 1 2

Each element in this vector is the row of x we'll use. Now we just need to pair it with the columns of y. (The order of the vector corresponds to the columns of y, i.e. element 1 corresponds to column 1 (Trait1) and element 2 to column 2 (Trait2).) So we apply over the columns of y as follows:

For i=1:

i <- 1
sapply(1:ncol(y),function(j) x[match(y[i,],rownames(x))[j],j])
#[1] 1 1 1

This is the first row of your new matrix. Now we just apply this to each row of y to get the other rows of the new matrix:

t(sapply(1:nrow(y),function(i) sapply(1:ncol(y),function(j) x[match(y[i,],rownames(x))[j],j])))

*Note: I take the transpose since sapply returns the result for each row of y as a column.

Anyway, this works, and you can give the new matrix the same row and column names as y, but the solution is a bit complicated for something I feel should be simpler, so check whether you can improve the code. Maybe sapply isn't needed if you can use the following statement a bit better:
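
The statement referred to is not preserved here. As a separate sketch, not the author's code, one way to drop the nested sapply is base R matrix indexing: build a two-column matrix of (row, column) positions and index x with it in one step. The names x_mat, row_idx and col_idx are illustrative, and the example data are re-created from the question.

```r
# Example data as described in the question
x <- data.frame(Trait1 = c(1, 1, 0, 1), Trait2 = c(1, NA, 1, 1),
                Trait3 = c(0, 1, 0, 1), row.names = c("A", "B", "C", "D"))
y <- matrix(c("A", "A", "B", "C", "D", "C"), nrow = 2, byrow = TRUE,
            dimnames = list(c("individual1", "individual2"),
                            c("Trait1", "Trait2", "Trait3")))

x_mat <- as.matrix(x)

# Row position in x for every cell of y (column-major order)
row_idx <- match(y, rownames(x_mat))
# Column position in x for every cell of y, matched by trait name
col_idx <- match(colnames(y)[col(y)], colnames(x_mat))

# Index x with a two-column (row, column) matrix, then restore y's shape
result <- matrix(x_mat[cbind(row_idx, col_idx)],
                 nrow = nrow(y), dimnames = dimnames(y))
result
#             Trait1 Trait2 Trait3
# individual1      1      1      1
# individual2      0      1      0
```

Because the columns are matched by name, this version does not rely on x and y listing the traits in the same order.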


Here's a proposal:

#make table of row coordinates

#make table of col coordinates

#Use coordinates in mapply function to produce the final table.
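
The code for these three steps is not preserved here. A minimal sketch of what they could look like in base R follows. The names row_coord and col_coord are hypothetical, the data are re-created from the question, and this is an illustration of the outline rather than the answerer's original code.

```r
# Example data as described in the question
x <- data.frame(Trait1 = c(1, 1, 0, 1), Trait2 = c(1, NA, 1, 1),
                Trait3 = c(0, 1, 0, 1), row.names = c("A", "B", "C", "D"))
y <- matrix(c("A", "A", "B", "C", "D", "C"), nrow = 2, byrow = TRUE,
            dimnames = list(c("individual1", "individual2"),
                            c("Trait1", "Trait2", "Trait3")))

# Table of row coordinates: which row of x each cell of y points to
row_coord <- matrix(match(y, rownames(x)),
                    nrow = nrow(y), dimnames = dimnames(y))

# Table of column coordinates: which column of x each cell of y belongs to
col_coord <- matrix(match(colnames(y)[col(y)], colnames(x)),
                    nrow = nrow(y), dimnames = dimnames(y))

# Look up x[row, column] cell by cell, then restore y's shape
values <- mapply(function(r, k) x[r, k], row_coord, col_coord)
result <- matrix(values, nrow = nrow(y), dimnames = dimnames(y))
result
#             Trait1 Trait2 Trait3
# individual1      1      1      1
# individual2      0      1      0
```

mapply walks both coordinate tables in the same column-major order, so rebuilding the matrix with nrow = nrow(y) puts every value back in its original cell.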


