
ifelse didn't work in dataframe in R

I have a question about ifelse on a data.frame in R. I checked several SO posts about it, but unfortunately none of the solutions fit my case.

I am trying to do a conditional calculation in a data frame, but R warns that the condition has length > 1 and only the first element will be used, even though I use the ifelse function, which should work according to the SO posts I checked.

Here is my sample code:

library(scales)
head(temp[, 2:3])
  previous current
1        0      10
2       50      57
3       92     177
4       84     153
5       30      68
6      162     341
temp$change = ifelse(temp$previous > 0, rate(temp$previous, temp$current), temp$current)
rate = function(yest, tod){
  value = tod/yest
  if(value>1){
    return(paste("+", percent(value-1), sep = ""))
  }
  else{
    return(paste("-", percent(1-value), sep = ""))
  }
}

When I run the ifelse line, I get the following result:

head(temp[, 2:4])
  previous current change
1        0      10     10
2       50      57  +NaN%
3       92     177  +NaN%
4       84     153  +NaN%
5       30      68  +NaN%
6      162     341  +NaN%

So my question is: how should I deal with this? I tried assigning 0 to the last column before running ifelse, but it still failed.

Many thanks in advance!

Here's another way to do the same thing:

# 1: load dplyr
#if needed install.packages("dplyr")
library(dplyr)

# 2: recreate your data
your_dataframe = tibble(previous = c(0, 50, 92, 84, 30, 162),
                        current = c(10, 57, 177, 153, 68, 341))

# 3: obtain the change using your conditions
your_dataframe %>% 
  mutate(change = ifelse(previous > 0,
                         ifelse(current/previous > 1,
                                paste0("+%", (current/previous-1)*100),
                                # use 1 - ratio so a decrease does not print "-%-..."
                                paste0("-%", (1-current/previous)*100)), 
                         current))

Result:

# A tibble: 6 x 3
  previous current             change
     <dbl>   <dbl>              <chr>
1        0      10                 10
2       50      57               +%14
3       92     177 +%92.3913043478261
4       84     153 +%82.1428571428571
5       30      68 +%126.666666666667
6      162     341 +%110.493827160494

Try the following two segments; both should do what you want. The second one may be closer to what you are looking for.

library(scales)
set.seed(1)
temp <- data.frame(previous = rnorm(5), current = rnorm(5))
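rate <- function(i) {
  yest <- temp$previous[i]
  tod <- temp$current[i]
  # when previous is not positive, fall back to the current value
  if (yest <= 0)
    return(tod)
  value <- tod/yest
  if (value > 1) {
    return(paste("+", percent(value-1), sep = ""))
  } else {
    return(paste("-", percent(1-value), sep = ""))
  }
}

# apply rate to each row index; unlist coerces mixed results to character
temp$change <- unlist(lapply(seq_len(nrow(temp)), rate))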

Second:

ind <- which(temp$previous > 0)
temp$change <- temp$current
temp$change[ind] <- unlist(lapply(ind, 
                      function(i)  rate(temp$previous[i], temp$current[i])))

In the second segment, the function rate is the same as you coded it.
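The index-based loop in the second segment could also be written with mapply, which walks two vectors in parallel. This is only a sketch: the sample values are made up, and the rate below is a stand-in that uses sprintf instead of scales::percent so that it runs without extra packages.

```r
# Scalar helper in the spirit of the question's rate();
# sprintf() is an assumption standing in for scales::percent()
rate <- function(yest, tod) {
  value <- tod / yest
  if (value > 1) {
    sprintf("+%.1f%%", (value - 1) * 100)
  } else {
    sprintf("-%.1f%%", (1 - value) * 100)
  }
}

# Illustrative data, not the asker's full data frame
temp <- data.frame(previous = c(0, 50, 84), current = c(10, 57, 42))

ind <- temp$previous > 0
# start from the current values, stored as text so the column has one type
temp$change <- as.character(temp$current)
# mapply() calls rate() once per pair, so if() always sees a single value
temp$change[ind] <- mapply(rate, temp$previous[ind], temp$current[ind])
```

Because rate is called with one pair at a time, the if inside it never receives a vector condition.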

Inside rate, if evaluates only the first element of value, so which branch rate takes depends solely on the first row of temp.
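One way around this is to make the helper itself vectorized by replacing if with ifelse. The following is a minimal sketch, not the asker's final code: the name rate_vec and the one-decimal rounding are assumptions, and paste0 with round stands in for scales::percent.

```r
# Hypothetical vectorized variant of rate(); rate_vec is an illustrative name
rate_vec <- function(yest, tod) {
  value <- tod / yest
  # ifelse() tests every element, whereas if() accepts a single condition
  ifelse(value > 1,
         paste0("+", round((value - 1) * 100, 1), "%"),
         paste0("-", round((1 - value) * 100, 1), "%"))
}

# First three rows of the question's data
temp <- data.frame(previous = c(0, 50, 92), current = c(10, 57, 177))

# rows with previous == 0 keep the current value, as in the question
temp$change <- ifelse(temp$previous > 0,
                      rate_vec(temp$previous, temp$current),
                      temp$current)
```

Note that the outer ifelse still evaluates rate_vec for every row, including the one where previous is 0; that result is simply discarded.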

Following the advice I received from warm-hearted SO users, I vectorized my function and it worked! A toast to the SO community!

Here is the solution:

temp$rate = ifelse(temp$previous > 0, ifelse(temp$current/temp$previous > 1, 
                                             temp$current/temp$previous - 1, 
                                             1 - temp$current/temp$previous), 
                   temp$current)

This may display rate in scientific notation. If "regular" notation is needed, here is an update:

temp$rate = format(temp$rate, scientific = FALSE)
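One thing to keep in mind is that format returns character strings, so the column can no longer be used in arithmetic afterwards. If the column should stay numeric, rounding is a possible alternative. A small sketch with made-up values:

```r
# Illustrative values resembling the rate column
rate <- c(10, 0.14, 0.923913043478261)

# format() returns character strings padded to a common width
formatted <- format(rate, scientific = FALSE)
is.character(formatted)

# round() keeps the values numeric, so they can still be used in calculations
rounded <- round(rate, 2)
is.numeric(rounded)
```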
