Match column and row names to column and values in another data frame

I have two data frames as follows:

x<-data.frame("Trait1" =c(1,1,0,1),
          "Trait2"=c(1,NA,1,1),
          "Trait3" =c(0,1,0,1))
rownames(x)<-c("A","B","C","D") 

y <- matrix(c("A","A","B","C","D","C"), 
           nrow = 2, ncol = 3, byrow = TRUE, dimnames = list(c("individual1", "individual2"), 
                                                             c("Trait1","Trait2","Trait3")))

Such that:

x

  Trait1 Trait2 Trait3
A      1      1      0
B      1     NA      1
C      0      1      0
D      1      1      1

y

         Trait1 Trait2 Trait3
individual1 "A"    "A"    "B"   
individual2 "C"    "D"    "C"  

I need to match the row names of x with the values in y, and the column names in both data frames, to get values for each individual as follows:

         Trait1 Trait2 Trait3
individual1   1      1      1   
individual2   0      1      0  

Any suggestions would be much appreciated. Thanks.

A possible solution with tidyverse: this is just a matter of joining tables on the trait name and the row label (A to D), so the first step is to transform (gather) the two data sets into a common long form where the trait name and the row label are both columns, not column names or row names.

library(dplyr)
library(tidyr)

x %>% mutate(v=rownames(.)) %>% 
  gather(k,w,-v) -> x1
y %>% as.data.frame(stringsAsFactors=FALSE) %>% mutate(ID=rownames(.)) %>%
  gather(k,v,-ID) %>% 
  inner_join(x1,by=c("k","v")) %>% 
  select(-v) %>% spread(k,w)

#           ID Trait1 Trait2 Trait3
#1 individual1      1      1      1
#2 individual2      0      1      0
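In current tidyr releases, gather and spread are superseded by pivot_longer and pivot_wider. The same join can be sketched with the newer verbs as below. The intermediate names x_long, y_df, y_long and result are made up for this illustration and are not part of the answer above; the sketch assumes dplyr and a tidyr version that provides the pivot functions.

```r
library(dplyr)
library(tidyr)

# Data from the question
x <- data.frame(Trait1 = c(1, 1, 0, 1),
                Trait2 = c(1, NA, 1, 1),
                Trait3 = c(0, 1, 0, 1))
rownames(x) <- c("A", "B", "C", "D")

y <- matrix(c("A", "A", "B", "C", "D", "C"),
            nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("individual1", "individual2"),
                            c("Trait1", "Trait2", "Trait3")))

# Long form of x: one row per (row label v, trait k) with its value w
x_long <- x %>%
  mutate(v = rownames(x)) %>%
  pivot_longer(-v, names_to = "k", values_to = "w")

# Long form of y: one row per (individual ID, trait k) with the row label v
y_df <- as.data.frame(y, stringsAsFactors = FALSE)
y_df$ID <- rownames(y_df)
y_long <- pivot_longer(y_df, -ID, names_to = "k", values_to = "v")

# Join on trait and row label, then go back to one column per trait
result <- y_long %>%
  inner_join(x_long, by = c("k", "v")) %>%
  select(-v) %>%
  pivot_wider(names_from = k, values_from = w)

result
```

The result is a tibble with an ID column followed by one column per trait, so the individual names end up in a column rather than in row names.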

I don't really like my solution and I feel there should be a better way to do this, but for the time being this should work (it's not that nice, though).

Here's the code:

t(sapply(1:nrow(y), function(i) sapply(1:ncol(y), function(j)
    x[match(y[i,], rownames(x))[j], j])))

Output:

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    0    1    0

Explanation:

match(y[i,],rownames(x))

The above matches each column of the i'th row of y to the rownames of x. For i=1 the result is:

[1] 1 1 2

Each element in this vector is the index of a row of x that we'll use. Now we just need to pair it with the columns of y (the order of the vector corresponds to the columns of y, i.e. element 1 corresponds to column 1 (Trait1) and element 2 to column 2 (Trait2)). So we apply this to each column of y as follows:

For i=1

sapply(1:ncol(y),function(j) x[match(y[i,],rownames(x))[j],j])
#[1] 1 1 1

This is the first row of your new matrix. Now we just apply this to each row of y to get the other rows of the new matrix:

t(sapply(1:nrow(y),function(i) sapply(1:ncol(y),function(j)                                                                
    x[match(y[i,],rownames(x))[j],j])))

*Note: I take the transpose since sapply returns each result as a column.
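A small stand-alone illustration of why the transpose is needed. The toy function and the name per_row are made up for the example. When every call returns a vector of the same length, sapply binds those vectors together as columns:

```r
# Each call returns a length-3 vector, playing the role of one row of the result
per_row <- sapply(1:2, function(i) c(i, i * 10, i * 100))

dim(per_row)     # 3 2: one column per call
dim(t(per_row))  # 2 3: after transposing, one row per call
```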

Anyway, this works, and you can give the new matrix the same dimnames as y, but the solution is a bit complicated for something I feel should be simpler, so check if you can improve the code. Maybe sapply isn't needed if you can make better use of the following statement:

i=1
match(y[i,],rownames(x))
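One way that statement might be pushed further, as a sketch rather than a definitive answer: match accepts the whole matrix y at once, and a two-column matrix of (row, column) positions can index a matrix directly, so both sapply loops can be dropped. The names rows, cols and result are made up for this illustration.

```r
# Data from the question
x <- data.frame(Trait1 = c(1, 1, 0, 1),
                Trait2 = c(1, NA, 1, 1),
                Trait3 = c(0, 1, 0, 1))
rownames(x) <- c("A", "B", "C", "D")

y <- matrix(c("A", "A", "B", "C", "D", "C"),
            nrow = 2, ncol = 3, byrow = TRUE,
            dimnames = list(c("individual1", "individual2"),
                            c("Trait1", "Trait2", "Trait3")))

# Row of x for every cell of y (column-major order)
rows <- match(y, rownames(x))

# Column of x for every cell of y, looked up by trait name
# so it does not rely on the two objects sharing a column order
cols <- match(colnames(y)[col(y)], colnames(x))

# Index x with (row, column) pairs and restore y's shape and names
result <- matrix(as.matrix(x)[cbind(rows, cols)],
                 nrow = nrow(y), dimnames = dimnames(y))
result
#             Trait1 Trait2 Trait3
# individual1      1      1      1
# individual2      0      1      0
```

A value in y that is not a row name of x gives NA in rows, and therefore NA in the corresponding cell of the result.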

Here's a proposal:

#make table of row coordinates
coordinaterow<-y

#make table of col coordinates
coordinatecol<-matrix(colnames(y), 
                      nrow=nrow(y), 
                      ncol=ncol(y), 
                      byrow=TRUE)

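#Use coordinates in mapply function to produce the final table.
#Start from a numeric template with y's dimnames: filling a copy of y
#would coerce the looked-up values to character.
finalresult<-matrix(NA_real_, nrow=nrow(y), ncol=ncol(y), dimnames=dimnames(y))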

finalresult[]<-mapply(function(r,c)
  x[r,c],
  coordinaterow,
  coordinatecol,
  SIMPLIFY = TRUE)
