Replacing NA in for/if loop in R
I'm running into an unexpected challenge in R. In my dataset, there are NAs in certain columns. Some of these NAs SHOULD be present (the values are truly missing), while others should be replaced with 0s. I used code like the following:
df1 <- data.frame(x = c(1, 2, 3, 4, 5), y = c(10, 10, NA, NA, 12), z = c(9, 9, 9, 9, 9))
for (i in nrow(df1)){
if(df1$x[i] > 3){
df1$y[i] = 0
df1$z[i] = 0
}
}
And obtained this output:
x y z
1 1 10 9
2 2 10 9
3 3 NA 9
4 4 NA 9
5 5 0 0
The NA SHOULD be preserved in row 3, but the NA in row 4 should have been replaced with 0. Further, the z value in row 4 did not update. Any ideas as to what is happening?
You've used for (i in nrow(df1)), which evaluates to for (i in 5), so the loop body runs only once, for row 5. I'm guessing you meant to use for (i in 1:nrow(df1)), which would evaluate to for (i in 1:5) and include all rows.
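As a minimal sketch of that fix, the loop below iterates over every row index. It uses seq_len(nrow(df1)) rather than 1:nrow(df1); that substitution is an assumption beyond the original answer, chosen because seq_len() gives an empty sequence for a zero-row data frame, whereas 1:0 counts down through 1 and 0.

```r
df1 <- data.frame(x = c(1, 2, 3, 4, 5),
                  y = c(10, 10, NA, NA, 12),
                  z = c(9, 9, 9, 9, 9))

# nrow(df1) is the single number 5, so `for (i in nrow(df1))` visits only row 5.
# seq_len(nrow(df1)) produces 1, 2, 3, 4, 5 and visits every row.
for (i in seq_len(nrow(df1))) {
  if (df1$x[i] > 3) {
    df1$y[i] <- 0
    df1$z[i] <- 0
  }
}
df1
```

With this change rows 4 and 5 are both set to 0, and the NA in row 3 is left alone because x is not greater than 3 there.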
Don't do it this way. R is not Python; you get vectorized operations out of the box:
df1[df1$x > 3, c('y', 'z')] <- 0
df1
# x y z
# 1 1 10 9
# 2 2 10 9
# 3 3 NA 9
# 4 4 0 0
# 5 5 0 0
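One caveat, sketched here under an assumption that does not hold for the question's data: if x itself could contain NA, the comparison x > 3 yields NA for those rows, and data frame assignment does not accept NA in the row index. Wrapping the condition in which() keeps only the positions where the comparison is TRUE. The data frame df2 and the variable rows are hypothetical names introduced for illustration.

```r
# Hypothetical variant of the question's data with a missing value in x.
df2 <- data.frame(x = c(1, NA, 3, 4, 5),
                  y = c(10, 10, NA, NA, 12),
                  z = c(9, 9, 9, 9, 9))

# which() drops positions where the comparison is NA or FALSE,
# leaving only the row numbers where x > 3 is TRUE.
rows <- which(df2$x > 3)

# Assign 0 to both columns for the selected rows in one vectorized step.
df2[rows, c("y", "z")] <- 0
df2
```

Rows where x is NA are left unchanged, which matches the goal of preserving values that are truly missing.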