
Vectorized IF statement in R?

x <- seq(0.1,10,0.1)
y <- if (x < 5) 1 else 2

I want the if to operate on every single element instead of on the whole vector. What do I have to change?

x <- seq(0.1,10,0.1)

> x
  [1]  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0  1.1  1.2  1.3  1.4  1.5
 [16]  1.6  1.7  1.8  1.9  2.0  2.1  2.2  2.3  2.4  2.5  2.6  2.7  2.8  2.9  3.0
 [31]  3.1  3.2  3.3  3.4  3.5  3.6  3.7  3.8  3.9  4.0  4.1  4.2  4.3  4.4  4.5
 [46]  4.6  4.7  4.8  4.9  5.0  5.1  5.2  5.3  5.4  5.5  5.6  5.7  5.8  5.9  6.0
 [61]  6.1  6.2  6.3  6.4  6.5  6.6  6.7  6.8  6.9  7.0  7.1  7.2  7.3  7.4  7.5
 [76]  7.6  7.7  7.8  7.9  8.0  8.1  8.2  8.3  8.4  8.5  8.6  8.7  8.8  8.9  9.0
 [91]  9.1  9.2  9.3  9.4  9.5  9.6  9.7  9.8  9.9 10.0

> ifelse(x < 5, 1, 2)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [38] 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

y <- if (x < 5) 1 else 2 does not operate on the whole vector: the warning you receive tells you that only the first element of the condition will be used (since R 4.2.0, a condition of length greater than one is an error rather than a warning). You want ifelse:

y <- ifelse(x < 5, 1, 2)

ifelse operates on the whole logical vector, element by element. if accepts only a single logical value. See ?"if" and ?ifelse.
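To make the difference concrete, here is a minimal sketch on a short made-up vector (the values are illustrative, not from the question). It assumes that when one decision about the whole vector is wanted, the condition is first collapsed to a single value with all() or any().

```r
# Illustrative vector, chosen so that both branches occur
x <- c(1, 4.9, 5, 7)

# ifelse() tests every element and returns a vector of the same length
y <- ifelse(x < 5, 1, 2)
print(y)
# [1] 1 1 2 2

# if() needs exactly one TRUE/FALSE, so reduce the condition first
z <- if (all(x < 5)) "all below 5" else "some at or above 5"
print(z)
# [1] "some at or above 5"
```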

For completeness: with big vectors, you can use indices to speed things up (we often do that in simulations, where functions typically run 1000 to 10000 times). But as long as it isn't necessary, just use ifelse. It reads a lot easier.

> set.seed(100)
> x <- runif(1000,1,10)

> system.time(replicate(10000,{
+     y <- ifelse(x < 5,1,2)
+ }))
   user  system elapsed 
   2.56    0.08    2.64 

> system.time(replicate(10000,{
+   y <- rep(2,length(x))
+   y[x < 5]<- 1
+ }))
   user  system elapsed 
   0.48    0.00    0.48 
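One difference worth checking before swapping ifelse for index assignment is how missing values are treated. The sketch below uses a made-up vector containing an NA; wrapping the condition in which() is my own choice here so that the NA is simply skipped, it is not part of the timing code above.

```r
# Illustrative input with a missing value
x <- c(0.5, NA, 7, 3)

# ifelse() propagates the NA from the condition into the result
a <- ifelse(x < 5, 1, 2)
print(a)
# [1]  1 NA  2  1

# Index assignment: which() drops the NA, so that position keeps the default 2
b <- rep(2, length(x))
b[which(x < 5)] <- 1
print(b)
# [1] 1 2 2 1
```

If NA should stay NA in the result, ifelse already does that; with the index version it has to be restored explicitly, for example with b[is.na(x)] <- NA.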

You could also just create a logical vector and add 1 to it:

x <- seq(0.1, 10, 0.1) # Your data set   
(x >= 5) + 1
#  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
# [92] 2 2 2 2 2 2 2 2 2
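This works because TRUE and FALSE are coerced to 1 and 0 in arithmetic. The same trick can serve as a lookup index when the two outcomes are not 1 and 2; the sketch below is only an illustration, and the names idx and lab as well as the "low"/"high" values are made up.

```r
# Illustrative input
x <- c(0.1, 4.9, 5, 10)

# FALSE + 1 is 1 and TRUE + 1 is 2, so this is a valid position index
idx <- (x >= 5) + 1
print(idx)
# [1] 1 1 2 2

# Use the index to pick from a vector holding the two possible outcomes
lab <- c("low", "high")[idx]
print(lab)
# [1] "low"  "low"  "high" "high"
```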

If you would like to compare performance, adding 1 to the logical vector is the fastest of the three solutions:

set.seed(100)
x <- runif(1e6, 1, 10)

RL <- function(x) y <- ifelse(x < 5,1,2)
JM <- function(x) {y <- rep(2, length(x)); y[x < 5] <- 1}
DA <- function(x) y <- (x >= 5) + 1

library(microbenchmark)
microbenchmark(RL(x),
               JM(x),
               DA(x))

# Unit: milliseconds
#  expr       min        lq      mean    median        uq       max neval
# RL(x) 331.83448 366.52940 378.89182 374.99741 381.08659 609.21218   100
# JM(x)  38.72894  42.18745  44.36493  43.25086  44.09626  82.76168   100
# DA(x)  10.01644  11.96482  14.21593  13.17825  14.12930  53.76923   100
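Before relying on timings, it may be worth confirming that the three approaches agree. The sketch below uses base R only and hypothetical function names (by_ifelse, by_index, by_arith). Note that JM as written above ends with an assignment, so it returns the assigned value rather than y; that does not affect the timing, but the index version here returns y explicitly so the results can be compared.

```r
set.seed(100)
x <- runif(1000, 1, 10)

# Hypothetical names; each function returns the full result vector
by_ifelse <- function(x) ifelse(x < 5, 1, 2)
by_index  <- function(x) {
  y <- rep(2, length(x))
  y[x < 5] <- 1
  y                      # return the vector, not the assigned value
}
by_arith  <- function(x) (x >= 5) + 1

# All three should produce exactly the same numeric vector
same_index <- identical(by_ifelse(x), by_index(x))
same_arith <- identical(by_ifelse(x), by_arith(x))
print(c(same_index, same_arith))
```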

Following the post above, you can even use and modify the elements of the vector that satisfy the condition. In my opinion, if the faster approach costs no extra effort, one should always use it.

x = seq(0.1,10,0.1)
y <- rep(2,length(x))
y[x<5] <- x[x<5]*2
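For comparison, the same result can be written with ifelse. The sketch below stores the condition once in a variable (small, a name I made up) and checks that both versions agree; it assumes x contains no missing values.

```r
x <- seq(0.1, 10, 0.1)

# Index version: compute the condition once and reuse it
small <- x < 5
y <- rep(2, length(x))
y[small] <- x[small] * 2

# Equivalent ifelse version
y2 <- ifelse(x < 5, x * 2, 2)

print(identical(y, y2))
# [1] TRUE
```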

The code in the previous post answers the question best. But if I had to use the code above, I would write:

x = seq(0.1,10,0.1)
y <- rep(2,length(x))
y[x<5] <- x[x<5]*0 +1
nzMean <- function(x) { mean(x[x!=-1],na.rm=TRUE)}

nzMin <- function(x) {min(x[x!=-1],na.rm=TRUE)}

nzMax <- function(x) { max(x[x!=-1],na.rm=TRUE)}

nzRange<-function(x) {nzMax(x)-nzMin(x)}

nzSD <- function(x) { sd(x[x!=-1],na.rm=TRUE)}

#following function works
nzN1<- function(x) {ifelse(x!=-1,(x-nzMin(x))/nzRange(x) ,x) }

#following is bad: the yes argument has only 4 elements (the -1 is dropped), so ifelse recycles it and the values are misaligned
nzN2<- function(x) {ifelse(x!=-1,(x[x!=-1]-nzMin(x))/nzRange(x) ,x) }

#following is bad as it returns 5 elements of vector but not the correct answer (same misalignment)
nzN3<- function(x) {ifelse(x!=-1,(x[x!=-1]-nzMin(x))/nzRange(x) ,-1) }

y<-c(1,-1,-20,2,4)
a<-nzMean(y)
b<-nzMin(y)
c<-nzMax(y)
d<-nzRange(y)
# test the working function
z<-nzN1(y)

print(z)
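The misalignment in nzN2 and nzN3 comes from mixing a subsetted vector with ifelse. One way to avoid it is to assign into the kept positions directly. The sketch below is only an illustration: nzNorm and its sentinel argument are hypothetical names, and it assumes -1 marks the values to leave untouched, as in the functions above.

```r
# Hypothetical helper: min-max scale all values except the sentinel
nzNorm <- function(x, sentinel = -1) {
  keep <- !is.na(x) & x != sentinel   # positions to rescale
  out  <- x
  vals <- x[keep]
  # Note: if all kept values are equal, the range is 0 and this yields NaN
  out[keep] <- (vals - min(vals)) / (max(vals) - min(vals))
  out
}

y  <- c(1, -1, -20, 2, 4)
z2 <- nzNorm(y)
print(z2)
```

For this y the kept values are 1, -20, 2 and 4, so the minimum is -20 and the range is 24; the -1 in the second position is returned unchanged.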
