Change multiple cell values based on a single cell value

Question

I have a data frame:

a = c("yes", "yes", "no", "yes", "no")
b = c("brown", "grey", "white", "grey", NA)
c = c(7, 6, NA, 10, 8)
d = c("male", "female", "female", "male", "female")
Zoo = cbind.data.frame(a, b, c, d)
colnames(Zoo) = c("animal", "colour", "age", "gender")    

   animal colour  age  gender
    yes    brown   7   male
    yes    grey    6 female
    no     white  NA female
    yes    grey   10   male
    no     NA      8 female

If the value of 'animal' is "no", I would like to change every non-NA value in the other columns of that row to "NL" (for non-logical). I can do this one column at a time as follows:

Zoo$colour = as.character(Zoo$colour)

Zoo$colour = 
  ifelse(Zoo$animal == "no" & !is.na(Zoo$colour), "NL", Zoo$colour)

and eventually arrive at this:

   animal colour  age  gender
    yes    brown   7   male
    yes    grey    6 female
    no     NL     NA     NL
    yes    grey   10   male
    no     NA     NL     NL

I'm sure there is a more efficient way of doing this. Is there? Thank you!

Answer 1

Here is another way. Notice that I create the data.frame with stringsAsFactors = FALSE, because working with factor levels in this setting is tedious. You can freely convert the character columns to factors once you're done.

Basically, this code goes through each row, finds the columns that hold non-NA values, and inserts "NL" in their place.

a = c("yes", "yes", "no", "yes", "no")
b = c("brown", "grey", "white", "grey", NA)
c = c(7, 6, NA, 10, 8)
d = c("male", "female", "female", "male", "female")
zoo <- data.frame(animal = a, color = b, age = c, gender = d, stringsAsFactors = FALSE)

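for (i in seq_len(nrow(zoo))) {
  if (zoo[i, "animal"] == "no") {
    # logical mask over the non-"animal" columns: TRUE where row i is not NA
    find.el <- !is.na(zoo[i, which(colnames(zoo) != "animal")])
    # overwrite those cells; a numeric column becomes character as a result
    zoo[, 2:ncol(zoo)][i, find.el] <- "NL"
  }
}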

  animal color  age gender
1    yes brown    7   male
2    yes  grey    6 female
3     no    NL <NA>     NL
4    yes  grey   10   male
5     no  <NA>   NL     NL
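
The row-by-row loop above can also be written column by column, which avoids indexing the data frame once per row. What follows is a minimal base R sketch added for illustration, not part of the original answer; the names is_no and other_cols are made up for the example.

```r
# Same example data as above, rebuilt so the snippet runs on its own.
zoo <- data.frame(
  animal = c("yes", "yes", "no", "yes", "no"),
  color  = c("brown", "grey", "white", "grey", NA),
  age    = c(7, 6, NA, 10, 8),
  gender = c("male", "female", "female", "male", "female"),
  stringsAsFactors = FALSE
)

# Rows whose cells should be overwritten, computed once for the whole data frame.
is_no <- zoo$animal == "no"

# Every column except the key column (hypothetical helper name).
other_cols <- setdiff(names(zoo), "animal")

# Convert each column to character, then overwrite the non-NA cells of "no" rows.
zoo[other_cols] <- lapply(zoo[other_cols], function(col) {
  col <- as.character(col)
  col[is_no & !is.na(col)] <- "NL"
  col
})

print(zoo)
```

Because every affected column is converted with as.character() first, age ends up as a character column here too, just as it does in the loop version.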

Answer 2

For multiple columns, we can use the efficient set() approach from data.table:

library(data.table)
setDT(Zoo)
for(nm in names(Zoo)[-1]) {
  set(Zoo, i = NULL, j = nm, as.character(Zoo[[nm]]))
  set(Zoo, i = which(Zoo[['animal']]=='no' & !is.na(Zoo[[nm]])),
      j = nm, value = "NL")
}

Zoo
#   animal colour age gender
#1:    yes  brown   7   male
#2:    yes   grey   6 female
#3:     no     NL  NA     NL
#4:    yes   grey  10   male
#5:     no     NA  NL     NL

NOTE: This should be very efficient, because set() updates the columns by reference.

Alternatively, we can use the elegant tidyverse syntax:

library(dplyr)
Zoo %>% 
   mutate_at(2:4, funs(replace(., Zoo[['animal']]== 'no' & !is.na(.), 'NL')))
#   animal colour  age gender
#1    yes  brown    7   male
#2    yes   grey    6 female
#3     no     NL <NA>     NL
#4    yes   grey   10   male
#5     no   <NA>   NL     NL
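
On current dplyr releases, funs() is deprecated (since 0.8.0) and mutate_at() has been superseded by across() (since 1.0.0). The same replacement can be sketched with across() as below. This is an illustrative rewrite that assumes dplyr 1.0.0 or later; is_no and Zoo_nl are names made up for the example.

```r
library(dplyr)

# Same example data as in the "data" section, rebuilt so the snippet runs on its own.
Zoo <- data.frame(
  animal = c("yes", "yes", "no", "yes", "no"),
  colour = c("brown", "grey", "white", "grey", NA),
  age    = c(7, 6, NA, 10, 8),
  gender = c("male", "female", "female", "male", "female"),
  stringsAsFactors = FALSE
)

# Row mask computed outside the pipeline (valid because Zoo is not grouped).
is_no <- Zoo$animal == "no"

# For every column except animal: convert to character,
# then replace the non-NA cells of "no" rows with "NL".
Zoo_nl <- Zoo %>%
  mutate(across(-animal, ~ replace(as.character(.x), is_no & !is.na(.x), "NL")))

print(Zoo_nl)
```

Converting with as.character() up front makes the type change of the numeric age column explicit rather than leaving it to implicit coercion inside replace().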

Benchmarks

Zoo1 <- Zoo[rep(1:nrow(Zoo), 1e5),]
Zoo2 <- copy(Zoo1)
Zoo3 <- copy(Zoo2)

system.time({
setDT(Zoo2)
for(nm in names(Zoo2)[-1]) {
  set(Zoo2, i = NULL, j = nm, as.character(Zoo2[[nm]]))
  set(Zoo2, i = which(Zoo2[['animal']]=='no' & !is.na(Zoo2[[nm]])),
      j = nm, value = "NL")
}
})
# user  system elapsed 
#   0.40    0.01    0.42 

system.time({
  Zoo3 %>% 
   mutate_at(2:4, funs(replace(., Zoo3[['animal']]== 'no' & !is.na(.), 'NL')))
 })
 #user  system elapsed 
 #  0.42    0.03    0.46 


system.time({
 for (i in 1:nrow(Zoo1)) {
  if (Zoo1[i, "animal"] == "no") {
    find.el <- !is.na(Zoo1[i, which(colnames(Zoo1) != "animal")])
    Zoo1[, 2:ncol(Zoo1)][i, find.el] <- "NL"
  }
}
})
#     user  system elapsed 
#  2086.49  577.51 2686.83

data

Zoo <- data.frame(animal = a, colour = b, age = c, gender = d, stringsAsFactors=FALSE)


Answer 1
3 2017-07-13 06:34:59

Answer 2
0 2017-07-13 06:24:51
