Replace a cell with NA according to the value in another cell in R
I have a dataset from which I made a reproducible example:
set.seed(1)
Data <- data.frame(
A = sample(0:5),
B = sample(0:5),
C = sample(0:5),
D = sample(0:5),
corr_A.B = sample(0:5),
corr_A.C = sample(0:5),
corr_A.D = sample(0:5))
> Data
  A B C D corr_A.B corr_A.C corr_A.D
1 1 5 4 2        1        2        4
2 5 3 1 3        5        5        0
3 2 2 3 4        0        1        2
4 3 0 5 0        4        0        1
5 0 4 2 1        2        3        3
6 4 1 0 5        3        4        5
For each of the columns B, C and D, I would like to check whether a cell is equal to 0 and, if so, replace the cell on the same row of the corresponding corr_A column with NA. For instance, since Data$B[4] is equal to 0, I would like Data$corr_A.B[4] to be replaced by NA.
I am looking to obtain the following result:
> Data
  A B C D corr_A.B corr_A.C corr_A.D
1 1 5 4 2        1        2        4
2 5 3 1 3        5        5        0
3 2 2 3 4        0        1        2
4 3 0 5 0       NA        0       NA
5 0 4 2 1        2        3        3
6 4 1 0 5        3       NA        5
I have tried different approaches using for loops, but I am struggling a lot. Also, in the dataset I am working on, there are many other columns that do not need to be checked for that condition, so I would like to be able to designate specifically which columns I am checking for 0 values.
Would someone be kind enough to give it a try? Many thanks.
A one-liner using the replacement function is.na<-:
is.na(Data[5:7]) <- Data[2:4] == 0
Data
#  A B C D corr_A.B corr_A.C corr_A.D
#1 1 5 4 2        1        2        4
#2 5 3 1 3        5        5        0
#3 2 2 3 4        0        1        2
#4 3 0 5 0       NA        0       NA
#5 0 4 2 1        2        3        3
#6 4 1 0 5        3       NA        5
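Since the question asks for a way to designate which columns are checked, the same replacement can probably be written with column names instead of positions. The following is a minimal sketch, assuming each target column is named by prefixing the checked column with "corr_A." as in the example data; check_cols and target_cols are illustrative names that do not appear in the original answer.

```r
# Example data, typed out explicitly so the result does not depend on set.seed()
Data <- data.frame(
  A = c(1, 5, 2, 3, 0, 4),
  B = c(5, 3, 2, 0, 4, 1),
  C = c(4, 1, 3, 5, 2, 0),
  D = c(2, 3, 4, 0, 1, 5),
  corr_A.B = c(1, 5, 0, 4, 2, 3),
  corr_A.C = c(2, 5, 1, 0, 3, 4),
  corr_A.D = c(4, 0, 2, 1, 3, 5)
)

# Columns to inspect for zeros (hypothetical helper variable)
check_cols <- c("B", "C", "D")
# Assumed naming rule: the paired column is "corr_A." followed by the checked name
target_cols <- paste0("corr_A.", check_cols)

# Data[check_cols] == 0 is a logical matrix; is.na<- sets the matching cells to NA
is.na(Data[target_cols]) <- Data[check_cols] == 0
Data
```

Other columns in the data frame are left alone because only the names listed in check_cols and target_cols are touched.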
For a base R solution, we can just use ifelse here:
Data$corr_A.B <- ifelse(Data$B == 0, NA, Data$corr_A.B)
Data$corr_A.C <- ifelse(Data$C == 0, NA, Data$corr_A.C)
Data$corr_A.D <- ifelse(Data$D == 0, NA, Data$corr_A.D)
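When many column pairs are involved, the three ifelse lines above could be folded into a loop over column names. This is only a sketch under the assumption that each checked column has a partner named "corr_A." plus its own name; check_cols and target are hypothetical names introduced for illustration.

```r
# Example data, typed out explicitly so the result does not depend on set.seed()
Data <- data.frame(
  A = c(1, 5, 2, 3, 0, 4),
  B = c(5, 3, 2, 0, 4, 1),
  C = c(4, 1, 3, 5, 2, 0),
  D = c(2, 3, 4, 0, 1, 5),
  corr_A.B = c(1, 5, 0, 4, 2, 3),
  corr_A.C = c(2, 5, 1, 0, 3, 4),
  corr_A.D = c(4, 0, 2, 1, 3, 5)
)

# Columns to inspect for zeros (hypothetical helper variable)
check_cols <- c("B", "C", "D")

for (col in check_cols) {
  # Assumed naming rule for the paired column
  target <- paste0("corr_A.", col)
  # Keep the existing value unless the checked column is 0 on that row
  Data[[target]] <- ifelse(Data[[col]] == 0, NA, Data[[target]])
}
Data
```

Using [[ with a character name rather than $ is what allows the column to be chosen by a variable inside the loop.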
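# A dplyr solution using case_when
library(dplyr)

df <- data.frame(A = c(1, 5, 2, 3, 0, 4),
                 B = c(5, 3, 2, 0, 4, 1),
                 C = c(4, 1, 3, 5, 2, 0),
                 D = c(2, 3, 4, 0, 1, 5),
                 corr_A.B = c(1, 5, 0, 4, 2, 3),
                 corr_A.C = c(2, 5, 1, 0, 3, 4),
                 corr_A.D = c(4, 0, 2, 1, 3, 5))

# NA_real_ keeps both branches of each case_when the same (double) type
df %>% mutate(corr_A.B = case_when(B == 0 ~ NA_real_,
                                   TRUE ~ corr_A.B),
              corr_A.C = case_when(C == 0 ~ NA_real_,
                                   TRUE ~ corr_A.C),
              corr_A.D = case_when(D == 0 ~ NA_real_,
                                   TRUE ~ corr_A.D))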
  A B C D corr_A.B corr_A.C corr_A.D
1 1 5 4 2        1        2        4
2 5 3 1 3        5        5        0
3 2 2 3 4        0        1        2
4 3 0 5 0       NA        0       NA
5 0 4 2 1        2        3        3
6 4 1 0 5        3       NA        5
A base R, one-line, vectorized but convoluted solution:
Data[t(t(which(Data[,2:4]==0,arr.ind=TRUE))+c(0,4))]<-NA
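The one-liner above is dense, so it may help to unpack it. which(..., arr.ind = TRUE) returns a two-column matrix of (row, col) positions of the zeros within columns 2 to 4, and adding 4 to the col entry shifts each position onto the matching corr_A column. Below is a sketch of the same idea in separate steps, looking the target positions up by name instead of relying on the fixed offset; check_cols, target_pos and idx are illustrative names, and the "corr_A." naming rule is an assumption taken from the example data.

```r
# Example data, typed out explicitly so the result does not depend on set.seed()
Data <- data.frame(
  A = c(1, 5, 2, 3, 0, 4),
  B = c(5, 3, 2, 0, 4, 1),
  C = c(4, 1, 3, 5, 2, 0),
  D = c(2, 3, 4, 0, 1, 5),
  corr_A.B = c(1, 5, 0, 4, 2, 3),
  corr_A.C = c(2, 5, 1, 0, 3, 4),
  corr_A.D = c(4, 0, 2, 1, 3, 5)
)

# Columns to inspect for zeros (hypothetical helper variable)
check_cols <- c("B", "C", "D")
# Position in Data of the column paired with each checked column
target_pos <- match(paste0("corr_A.", check_cols), names(Data))

# (row, col) positions of the zeros; col counts within check_cols
idx <- which(Data[check_cols] == 0, arr.ind = TRUE)
# Translate each col entry into the position of its paired corr_A column
idx[, "col"] <- target_pos[idx[, "col"]]

# A two-column numeric matrix indexes individual cells of the data frame
Data[idx] <- NA
Data
```

For the example data the zeros sit at B[4], C[6] and D[4], so idx ends up pointing at rows 4, 6 and 4 of columns 5, 6 and 7.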
Using apply(), you could do:
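# Flag the zeros in B, C and D as a logical matrix, then use it to set the
# matching cells of the corr_A columns (columns 5 to 7) to NA
zero <- apply(Data[c("B","C","D")],2,function(x) x==0)
Data[5:7][zero] <- NA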