
Check if cells in a data frame are identical to another column

I would like to check whether the names in columns "Pred1" and "Pred2" are identical to the name in column "Expected" in the same row. If the names are identical, the result should be TRUE; otherwise it should be FALSE. I tried the identical() function, but I am not sure how to apply it to each cell.

Input:

Expected        Pred1           Pred2
Bacteroides     Bacillus        Bacteroides
Bifidobacterium Bifidobacterium  Escherichia

Output:

Expected        Pred1         Pred2
Bacteroides      FALSE         TRUE
Bifidobacterium  TRUE          FALSE

You could use outer().

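# x is a row index, y is a column index; compare cell [x, y] with Expected in row x
fun <- Vectorize(function(x, y) identical(d[x, 1], d[x, y]))
# rows 1:2 crossed with prediction columns 2:3
cbind(d[1], Pred=outer(1:2, 2:3, fun))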
#          Expected Pred.1 Pred.2
# 1     Bacteroides  FALSE   TRUE
# 2 Bifidobacterium   TRUE  FALSE
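
Vectorize() is needed above because identical() compares two whole objects and returns a single TRUE or FALSE, whereas == compares element by element. A minimal sketch of the difference, using two made-up vectors x and y (illustrative names, not part of the original data):

```r
x <- c("Bacteroides", "Bifidobacterium")
y <- c("Bacteroides", "Escherichia")

# identical() gives one answer for the two vectors as a whole
whole <- identical(x, y)

# == gives one answer per element
per_element <- x == y

# Element-wise identical() via mapply(); unlike ==, identical() treats
# NA as equal to NA instead of returning NA
per_element_identical <- mapply(identical, x, y, USE.NAMES = FALSE)
```

Here whole is a single logical value, while per_element and per_element_identical each hold one value per row.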

Or do it with ==.

# sapply() returns one column per row of d, so transpose to get one row per row of d
t(sapply(1:2, function(x) d[x, 1] == d[x, 2:3]))
#       [,1]  [,2]
# [1,] FALSE  TRUE
# [2,]  TRUE FALSE
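
Since == is already vectorized, the comparison can probably be written without looping over rows at all. A minimal sketch, assuming the same data as d (redefined here so the snippet runs on its own):

```r
d <- data.frame(
  Expected = c("Bacteroides", "Bifidobacterium"),
  Pred1    = c("Bacillus", "Bifidobacterium"),
  Pred2    = c("Bacteroides", "Escherichia"),
  stringsAsFactors = FALSE
)

# Comparing a data frame with a vector recycles the vector down each column,
# so every Pred cell is compared with the Expected value of its own row.
res <- d[, 2:3] == d[, 1]

# res is a logical matrix with columns Pred1 and Pred2; bind it to Expected
out <- cbind(d[1], res)
out
```

This relies on the Expected column having exactly one value per row, which holds for any data frame column.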

Data

d <- structure(list(Expected = c("Bacteroides", "Bifidobacterium"), 
    Pred1 = c("Bacillus", "Bifidobacterium"), Pred2 = c("Bacteroides", 
    "Escherichia")), row.names = c(NA, -2L), class = "data.frame")

Solution using a for loop:

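l <- list()
for (i in 2:ncol(df)) {
  # compare column i with the first (Expected) column, element by element;
  # storing by name keeps the Pred1/Pred2 column names in the result
  l[[names(df)[i]]] <- df[, 1] == df[, i]
}
df1 <- as.data.frame(do.call(cbind, l))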

Data:

df <- data.frame(Expected = c("Bacteroides", "Bifidobacterium"),
                 Pred1 = c("Bacillus", "Bifidobacterium"),
                 Pred2 = c("Bacteroides", "Escherichia"),
                 stringsAsFactors = FALSE)

lapply() will loop through all of the columns that you want to check. The function applied, ==, tests each column for equality against the right-hand side, d[, 'Expected'].

lapply(d[, c('Pred1', 'Pred2')], '==', d[, 'Expected'])
#equivalent to
lapply(d[, c('Pred1', 'Pred2')], function(x) x == d[, 'Expected'])

$Pred1
[1] FALSE  TRUE

$Pred2
[1]  TRUE FALSE

To get the result into the desired format, assign it back to the original columns. Note that I made a copy, but you can just as easily assign the results to the original data.frame.

d_copy <- d

d_copy[, c('Pred1', 'Pred2')] <- lapply(d[, c('Pred1', 'Pred2')], '==', d[, 'Expected'])

d_copy
         Expected Pred1 Pred2
1     Bacteroides FALSE  TRUE
2 Bifidobacterium  TRUE FALSE
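
If the number of prediction columns is not known in advance, the same idea can be wrapped in a small function. This is only a sketch: the name match_expected and its arguments are hypothetical, and it assumes every column other than the expected one is a prediction column.

```r
# Hypothetical helper (not from the original answers): replace every column
# except `expected` with TRUE/FALSE flags for a match within the same row.
match_expected <- function(data, expected = "Expected") {
  pred_cols <- setdiff(names(data), expected)
  data[pred_cols] <- lapply(data[pred_cols], `==`, data[[expected]])
  data
}

d <- data.frame(
  Expected = c("Bacteroides", "Bifidobacterium"),
  Pred1    = c("Bacillus", "Bifidobacterium"),
  Pred2    = c("Bacteroides", "Escherichia"),
  stringsAsFactors = FALSE
)

out <- match_expected(d)
out
```

Because the function works on its own copy of the argument, the original d is left unchanged.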
