R: more efficient solution than this for-loop
I wrote a functioning for loop, but it's slow over thousands of rows and I'm looking for a more efficient alternative. Thanks in advance!
The task:
If column a matches column b, column d becomes NA.
If a does not match b, but b matches c, then column e becomes NA.
The for loop:
for (i in 1:nrow(data)) {
  if (data$a[i] == data$b[i]) {data$d[i] <- NA}
  if (!(data$a[i] == data$b[i]) & data$b[i] == data$c[i]) {data$e[i] <- NA}
}
An example:
a b c d e
F G G 1 10
F G F 5 10
F F F 2 8
Would become:
a b c d e
F G G 1 NA
F G F 5 10
F F F NA 8
If you're concerned about speed and efficiency, I'd recommend data.table (though vectorizing a normal data.frame, as recommended by @parfait, would probably speed things up more than enough).
library(data.table)
DT <- fread("a b c d e
F G G 1 10
F G F 5 10
F F F 2 8")
print(DT)
# a b c d e
# 1: F G G 1 10
# 2: F G F 5 10
# 3: F F F 2 8
DT[a == b, d := NA]
DT[!a == b & b == c, e := NA]
print(DT)
# a b c d e
# 1: F G G 1 NA
# 2: F G F 5 10
# 3: F F F NA 8
Suppose df is your data; then:
ab <- with(df, a==b)
bc <- with(df, b==c)
df$d[ab] <- NA
df$e[!ab & bc] <- NA
which would result in
# a b c d e
# 1 F G G 1 NA
# 2 F G F 5 10
# 3 F F F NA 8
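To check that the vectorized version really does the same thing as the loop, and to get a rough feel for the speed difference, something like the sketch below could be used. The helper names null_with_loop and null_vectorized and the synthetic data are assumptions made for illustration; they are not from the question. The data contain no missing values, since the loop's if() would fail on NA comparisons.

```r
# Hypothetical helpers: one wraps the original loop, one the vectorized answer.
null_with_loop <- function(data) {
  for (i in seq_len(nrow(data))) {
    if (data$a[i] == data$b[i]) data$d[i] <- NA
    if (data$a[i] != data$b[i] && data$b[i] == data$c[i]) data$e[i] <- NA
  }
  data
}

null_vectorized <- function(data) {
  ab <- data$a == data$b   # rows where a matches b
  bc <- data$b == data$c   # rows where b matches c
  data$d[ab] <- NA
  data$e[!ab & bc] <- NA
  data
}

# Synthetic data (an assumption): 10,000 rows, character columns, no NAs.
set.seed(1)
n <- 1e4
big <- data.frame(
  a = sample(c("F", "G"), n, replace = TRUE),
  b = sample(c("F", "G"), n, replace = TRUE),
  c = sample(c("F", "G"), n, replace = TRUE),
  d = seq_len(n),
  e = seq_len(n),
  stringsAsFactors = FALSE
)

# Compare the two results for equality.
same_result <- identical(null_with_loop(big), null_vectorized(big))
print(same_result)

# Rough timings; the actual numbers depend on the machine.
print(system.time(null_with_loop(big)))
print(system.time(null_vectorized(big)))
```

The columns are kept as character here (stringsAsFactors = FALSE) because comparing factors with different level sets using == raises an error.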
We could create a list of quosures and evaluate it
library(tidyverse)
qs <- setNames(quos(d*NA^(a == b), e*NA^((!(a ==b) & (b == c)))), c("d", "e"))
df1 %>%
mutate(!!! qs)
# a b c d e
#1 F G G 1 NA
#2 F G F 5 10
#3 F F F NA 8
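If the quosure approach feels heavier than needed, a plainer dplyr variant might look like the sketch below. It is an alternative rather than the method above: it uses base replace() inside mutate(), and df1 is rebuilt from the example in the question so the snippet runs on its own.

```r
library(dplyr)

# Example data from the question, with character and integer columns.
df1 <- data.frame(
  a = c("F", "F", "F"),
  b = c("G", "G", "F"),
  c = c("G", "F", "F"),
  d = c(1L, 5L, 2L),
  e = c(10L, 10L, 8L),
  stringsAsFactors = FALSE
)

result <- df1 %>%
  mutate(
    d = replace(d, a == b, NA),          # a matches b: blank out d
    e = replace(e, a != b & b == c, NA)  # a differs from b, b matches c: blank out e
  )

print(result)
```

Both conditions depend only on a, b and c, so the order of the two mutate() expressions does not matter here.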