R indexing a data frame using values in the column of another

I have a data frame, and two of its columns are indices into another data frame. I want to add a column to the first by indexing the second, but just calling the column names isn't working. For example, if the first data frame is:

...  Gene    CellLine ...
     KRAS    HELA     ...
     BRCA1   T24      ...

and my second data frame looks like

        KRAS   BRCA1 ...
HELA    5      3
T24     2      1
...

I want the output to look like

...  Gene   CellLine   Dependency ...
     KRAS   HELA       5          ...
     BRCA1  T24        1          ...

without having to loop through the rows, because the first data frame is massive. That is, is there any function or package that would do the equivalent of

for (i in rownames(table1)){
  gene <- table1[i, "Gene"]
  line <- table1[i, "CellLine"]
  # table2 has cell lines as row names and genes as column names
  table1[i, "Dependency"] <- if (gene %in% colnames(table2) && line %in% rownames(table2)) table2[line, gene] else NA
}

but faster?

Thanks!

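The following code is vectorized: it builds an index matrix from the two columns of df1 and uses it to extract the required values from df2. The column order matters: CellLine comes first because the cell lines are the row names of df2, and Gene comes second because the genes are its column names.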

inx <- as.matrix(df1[c("CellLine", "Gene")])
df1$Dependency <- df2[inx]

df1
#   Gene CellLine Dependency
#1  KRAS     HELA          5
#2 BRCA1      T24          1

Data

df1 <- read.table(text = "
Gene    CellLine 
KRAS    HELA
BRCA1   T24 
", header = TRUE)

df2 <- read.table(text = "
        KRAS   BRCA1
HELA    5      3
T24     2      1
", header = TRUE)
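
One caveat with the character index matrix above: as far as I know, a Gene or CellLine value that is missing from df2 makes the lookup stop with a "subscript out of bounds" error, whereas the loop in the question returns NA in that case. A minimal sketch of an NA-tolerant variant follows. It translates the names to positions with match() first, relying on the documented behaviour that a row of a numeric index matrix containing NA yields NA. The TP53 row is made up for illustration and is not part of the original data.

```r
# Example data; the TP53 row is hypothetical and has no match in df2
df1 <- data.frame(Gene = c("KRAS", "BRCA1", "TP53"),
                  CellLine = c("HELA", "T24", "HELA"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(KRAS = c(5L, 2L), BRCA1 = c(3L, 1L),
                  row.names = c("HELA", "T24"))

m <- as.matrix(df2)

# Translate names to positions; match() returns NA for unknown names
row_pos <- match(df1$CellLine, rownames(m))
col_pos <- match(df1$Gene, colnames(m))

# Each row of the two-column index matrix picks one cell of m;
# a row containing NA produces NA in the result
df1$Dependency <- m[cbind(row_pos, col_pos)]
df1
```

This stays fully vectorized, so it should scale to a large df1 in the same way as the original two-line answer.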

You can try this approach. The data used is:

#Data
df1 <- structure(list(Gene = c("KRAS", "BRCA1"), CellLine = c("HELA", 
"T24")), class = "data.frame", row.names = c(NA, -2L))
df2 <- structure(list(id = c("HELA", "T24"), KRAS = c(5L, 2L), BRCA1 = c(3L, 
1L)), class = "data.frame", row.names = c(NA, -2L))

Then melt df2 into long format and merge it with df1:

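library(reshape)
# Melt df2 into long format: one row per (id, gene) pair
Melted <- melt(df2, id.vars = 'id')
# Merge on gene and cell line, keeping every row of df1
Merged <- merge(df1, Melted, by.x = c('Gene', 'CellLine'), by.y = c('variable', 'id'), all.x = TRUE)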

The result is:

   Gene CellLine value
1 BRCA1      T24     1
2  KRAS     HELA     5
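
The melt-and-merge answer assumes the cell lines sit in an id column, while the question shows them as row names. As a rough sketch, assuming df2 is laid out as in the question, the same reshape can be done in base R without the reshape package by going through as.table(). The TP53 row and the names long and merged are made up for illustration.

```r
# Example data; the TP53 row is hypothetical and has no match in df2
df1 <- data.frame(Gene = c("KRAS", "BRCA1", "TP53"),
                  CellLine = c("HELA", "T24", "HELA"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(KRAS = c(5L, 2L), BRCA1 = c(3L, 1L),
                  row.names = c("HELA", "T24"))

# Wide to long: one row per (CellLine, Gene) pair
long <- as.data.frame(as.table(as.matrix(df2)),
                      responseName = "Dependency",
                      stringsAsFactors = FALSE)
names(long)[1:2] <- c("CellLine", "Gene")

# Left join: all.x = TRUE keeps df1 rows without a match (Dependency is NA)
merged <- merge(df1, long, by = c("Gene", "CellLine"), all.x = TRUE)
merged
```

Note that merge() sorts the result by the key columns, so the row order of df1 is not preserved. If the original order matters, add a row-number column to df1 before merging and sort by it afterwards.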
