
How to use mutate and ifelse in a loop?

I create dummy variables that indicate whether a continuous variable is at or above a given threshold (1) or below it (0). I currently do this with several repetitive mutate calls, which I would like to replace with a loop.

# load tidyverse
library(tidyverse)

# create data
data <- data.frame(x = runif(100, min = 0, max = 100))

# What I do
data <- data %>%
  mutate(x20 = ifelse(x >= 20, 1, 0)) %>%
  mutate(x40 = ifelse(x >= 40, 1, 0)) %>%
  mutate(x60 = ifelse(x >= 60, 1, 0)) %>%
  mutate(x80 = ifelse(x >= 80, 1, 0)) 
  
# What I would like to do
for (i in seq(from=0, to=100, by=20)){
  data %>% mutate(paste(x,i) = ifelse(x >= i, 1,0))
}

Thank you.

You can use map_dfc here:

library(dplyr)
library(purrr)
breaks <- seq(from=0, to=100, by=20)
# build one dummy column per break (>= as in the question), then bind them to data
bind_cols(data, map_dfc(breaks, ~
          data %>% transmute(!!paste0('x', .x) := as.integer(x >= .x))))

#             x x0 x20 x40 x60 x80 x100
#1    6.2772517  1   0   0   0   0    0
#2   16.3520358  1   0   0   0   0    0
#3   25.8958212  1   1   0   0   0    0
#4   78.9354970  1   1   1   1   0    0
#5   35.7731737  1   1   0   0   0    0
#6    5.7395139  1   0   0   0   0    0
#7   49.7069551  1   1   1   0   0    0
#8   53.5134559  1   1   1   0   0    0
#...
#....     

That said, I think it is much simpler in base R:

# one new column per break; b is the current threshold
data[paste0('x', breaks)] <- lapply(breaks, function(b) as.integer(data$x >= b))
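
A variation on the base R idea, in case it is useful: outer() can compare every value of x with every break in one call. This is only a sketch under my own assumptions; the seed and the names dummies and result are made up for illustration and are not part of the answer above.

```r
# Toy data standing in for the question's `data` (seed chosen arbitrarily)
set.seed(1)
data <- data.frame(x = runif(100, min = 0, max = 100))
breaks <- seq(from = 0, to = 100, by = 20)

# outer() builds a logical matrix: rows are observations, columns are breaks.
# The unary `+` converts TRUE/FALSE to 1/0 while keeping the matrix shape.
dummies <- +outer(data$x, breaks, ">=")
colnames(dummies) <- paste0("x", breaks)

# Attach the dummy columns to the original data
result <- cbind(data, dummies)
head(result)
```

Whether this reads better than the lapply() version is a matter of taste; it mainly avoids writing the anonymous function.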

You can use reduce() from purrr.

library(dplyr)
library(purrr)

reduce(seq(0, 100, by = 20), .init = data,
       ~ mutate(.x, !!paste0('x', .y) := as.integer(x >= .y)))

#             x x0 x20 x40 x60 x80 x100
# 1   61.080545  1   1   1   1   0    0
# 2   63.036673  1   1   1   1   0    0
# 3   71.064322  1   1   1   1   0    0
# 4    1.821416  1   0   0   0   0    0
# 5   24.721454  1   1   0   0   0    0

The base R equivalent uses Reduce():

Reduce(function(df, y){ df[paste0('x', y)] <- as.integer(df$x >= y); df },
       seq(0, 100, by = 20), data)
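
To make it clearer what the Reduce() call above is doing, here is a small sketch showing that folding over the thresholds is the same as nesting the calls by hand. The helper name add_dummy and the two thresholds are my own choices for illustration.

```r
# Toy data standing in for the question's `data` (seed chosen arbitrarily)
set.seed(1)
data <- data.frame(x = runif(100, min = 0, max = 100))

# add_dummy() is a hypothetical helper: one step of the fold.
# It takes the data frame built so far and adds one dummy column.
add_dummy <- function(df, y) {
  df[paste0("x", y)] <- as.integer(df$x >= y)
  df
}

# Reduce() starts from `data` and applies add_dummy() once per threshold ...
folded <- Reduce(add_dummy, c(20, 40), data)

# ... which is the same as nesting the calls manually
unrolled <- add_dummy(add_dummy(data, 20), 40)

identical(folded, unrolled)
```

Each step receives the data frame returned by the previous step, so every new column is added to the accumulated result rather than to the original data.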

Ronak's base R answer is probably the best, but for completeness, here is another way that stays close to your original attempt, using dplyr:

for (i in seq(from=0, to=100, by=20)){
    var <- paste0('x',i)
    data <- mutate(data, !!var := ifelse(x >= i, 1,0))
}

#           x x0 x20 x40 x60 x80 x100
# 1 99.735037  1   1   1   1   1    0
# 2  9.075226  1   0   0   0   0    0
# 3 73.786282  1   1   1   1   0    0
# 4 89.744719  1   1   1   1   1    0
# 5 34.139207  1   1   0   0   0    0
# 6 88.138611  1   1   1   1   1    0
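
For what it's worth, the loop in the question fails for two reasons: paste(x, i) cannot be used as an argument name on the left of =, and the result of mutate() is never assigned back to data. The loop above fixes both. Below is a sketch of the same loop that writes the column name as a glue-style string instead of using !!. It assumes a reasonably recent dplyr and rlang (dplyr 1.0 or later, as far as I know); the seed and the 20 to 80 range are my own choices, picked to match the hand-written mutate calls in the question.

```r
library(dplyr)

# Toy data standing in for the question's `data` (seed chosen arbitrarily)
set.seed(42)
data <- data.frame(x = runif(100, min = 0, max = 100))

for (i in seq(from = 20, to = 80, by = 20)) {
  # "x{i}" is expanded to "x20", "x40", ... when the column is created;
  # the result is assigned back so each pass keeps the earlier columns
  data <- mutate(data, "x{i}" := as.integer(x >= i))
}

names(data)
```

This yields the same four columns as the repetitive mutate calls, stored as integers rather than doubles.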

