
mutate_at (or across) and ifelse statement

Similar to this question, given tmpp:

library(data.table)
library(tidyverse)
tmpp <- data.table(
  "ID" = c(1,1,1,2,2), 
  "Date" = c(1,2,3,1,2), 
  "total_neg" = c(1,1,0,0,2),
  "total_pos" = c(4,5,2,4,5),
  "H1" = c(5,4,0,5,-5),
  "H2" = c(5,-10,5,5,-5),
  "H3" = c(-10,6,5,0,10)
)
tmpp
#    ID Date total_neg total_pos H1  H2  H3
# 1:  1    1         1         4  5   5 -10
# 2:  1    2         1         5  4 -10   6
# 3:  1    3         0         2  0   5   5
# 4:  2    1         0         4  5   5   0
# 5:  2    2         2         5 -5  -5  10

I want to replace the values in all variables starting with H with NA in the rows where total_neg == 1:

#    ID Date total_neg total_pos H1  H2  H3
# 1:  1    1         1         4  NA NA NA
# 2:  1    2         1         5  NA NA NA
# 3:  1    3         0         2  0   5   5
# 4:  2    1         0         4  5   5   0
# 5:  2    2         2         5 -5  -5  10

Why don't these work?

tmpp %>%
  mutate_at(vars(matches("H")), ~ifelse( .$total_neg == 1, NA, .))

tmpp %>%
  mutate_at(vars(matches("H"),
            .funs = list(~ ifelse(.$total_neg == 1, NA, .))))
# I'm guessing the first dot in the ifelse() calls above refers to the H columns, so I tried:
tmpp %>%
  mutate_at(vars(matches("H"),
                 .funs = list(~ ifelse(tmpp$total_neg == 1, NA, .))))

Happy to see an across() version too, thanks.

A simple data.table solution that updates all the H columns at once, in place, and only for the rows where total_neg == 1:

tmpp[total_neg == 1, grep("^H", names(tmpp)) := NA]
tmpp
#    ID Date total_neg total_pos H1 H2 H3
# 1:  1    1         1         4 NA NA NA
# 2:  1    2         1         5 NA NA NA
# 3:  1    3         0         2  0  5  5
# 4:  2    1         0         4  5  5  0
# 5:  2    2         2         5 -5 -5 10
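Because := modifies the table by reference, the original tmpp is changed by the call above. If the original should stay untouched, one option is to work on a copy. The following is a minimal sketch under that assumption; dt, h_cols and out are hypothetical names on a small stand-in table, not part of the original answer.

```r
library(data.table)

# Small stand-in table (hypothetical, not the original tmpp)
dt <- data.table(total_neg = c(1, 0, 2), H1 = c(5, 4, -5), H2 = c(5, -10, 5))

# Collect the names of the columns starting with "H" once
h_cols <- grep("^H", names(dt), value = TRUE)

# copy() first, so that := modifies the copy rather than dt itself
out <- copy(dt)[total_neg == 1, (h_cols) := NA]

out  # H1 and H2 are NA in the row where total_neg == 1
dt   # the original table is unchanged
```

Wrapping h_cols in parentheses tells data.table to treat it as a vector of column names rather than as a new column called h_cols.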

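You don't need $ inside a dplyr pipe. In mutate_at() / across(), the dot refers to the values of the current column, and other columns such as total_neg can be referenced directly by name. Try: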

library(dplyr)
tmpp %>% mutate(across(starts_with('H'), ~replace(., total_neg == 1, NA)))

#   ID Date total_neg total_pos H1 H2 H3
#1:  1    1         1         4 NA NA NA
#2:  1    2         1         5 NA NA NA
#3:  1    3         0         2  0  5  5
#4:  2    1         0         4  5  5  0
#5:  2    2         2         5 -5 -5 10
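For plain numeric columns like these, replace() and ifelse() should give the same result. A small base R sketch to check that, using stand-in vectors x and total_neg (hypothetical copies of one H column and the condition column):

```r
# Stand-in vectors (hypothetical): one H column and the condition column
x <- c(5, 4, 0, 5, -5)
total_neg <- c(1, 1, 0, 0, 2)

# replace() overwrites the positions where the condition is TRUE
r1 <- replace(x, total_neg == 1, NA)

# ifelse() builds a new vector, choosing NA or x element by element
r2 <- ifelse(total_neg == 1, NA, x)

identical(r1, r2)
```

One difference worth keeping in mind: replace() modifies a copy of x and so keeps its class and attributes, whereas ifelse() is known to drop attributes such as the Date class, so replace() may be the safer choice for non-numeric columns.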

Your guess is correct: inside the purrr-style anonymous function (after your ~), the dot refers to the function argument, which is a single column, not the data frame you piped in. A column is a plain vector, so .$total_neg has nothing to extract. The solution is to simplify by removing the .$ and referring to total_neg by name.

tmpp %>%
  mutate_at(vars(matches("H")), ~ifelse(total_neg == 1, NA, .))
#    ID Date total_neg total_pos H1 H2 H3
# 1:  1    1         1         4 NA NA NA
# 2:  1    2         1         5 NA NA NA
# 3:  1    3         0         2  0  5  5
# 4:  2    1         0         4  5  5  0
# 5:  2    2         2         5 -5 -5 10
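To make this concrete, here is a minimal sketch of what the lambda actually receives. It uses a small stand-in data frame df (hypothetical, not the original tmpp), and err and fixed are illustrative names only.

```r
library(dplyr)

# Small stand-in data frame (hypothetical, for illustration only)
df <- data.frame(total_neg = c(1, 0, 2), H1 = c(5, 4, -5), H2 = c(5, -10, 5))

# Inside the lambda, the dot is one column at a time: a plain numeric vector
df %>% summarise(across(starts_with("H"), ~ class(.)))

# $ is not defined for atomic vectors, which is why .$total_neg fails
err <- tryCatch(df$H1$total_neg, error = function(e) conditionMessage(e))
err

# Bare column names are looked up in the data frame, so this works
fixed <- df %>%
  mutate(across(starts_with("H"), ~ ifelse(total_neg == 1, NA, .)))
fixed
```

Separately, in the second and third attempts from the question, the closing parenthesis of vars( sits at the very end of the call, so .funs is passed to vars() rather than to mutate_at(). Closing vars() right after matches("H") would likely be needed there as well.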

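If you want to modify "all variables starting with H", I'd strongly suggest using starts_with("H") rather than matches("H"). matches() treats its argument as a regular expression, so it selects every column whose name contains the pattern anywhere, not only at the start.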
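The difference does not show up in tmpp, because no other column name contains an H. A short sketch with a hypothetical extra column called weight (not in the original data) shows where the two helpers diverge:

```r
library(dplyr)

# Stand-in data frame with a hypothetical extra column, weight,
# whose name contains an "h" but does not start with one
df <- data.frame(ID = 1:2, H1 = c(5, 4), H2 = c(-1, 2), weight = c(70, 80))

# matches() is a regex match anywhere in the name, case-insensitive by default
m <- names(select(df, matches("H")))
m

# starts_with() only looks at the beginning of the name
s <- names(select(df, starts_with("H")))
s
```

With matches("H"), the weight column would be picked up too and silently overwritten with NA in the matching rows.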

You can also use starts_with() inside across(). Here is the code:

library(data.table)
library(tidyverse)
tmpp <- data.table(
  "ID" = c(1,1,1,2,2), 
  "Date" = c(1,2,3,1,2), 
  "total_neg" = c(1,1,0,0,2),
  "total_pos" = c(4,5,2,4,5),
  "H1" = c(5,4,0,5,-5),
  "H2" = c(5,-10,5,5,-5),
  "H3" = c(-10,6,5,0,10)
)
#Code
tmpp %>% 
  mutate(across(starts_with('H'),~ifelse(total_neg==1,NA,.)))

Output:

   ID Date total_neg total_pos H1 H2 H3
1:  1    1         1         4 NA NA NA
2:  1    2         1         5 NA NA NA
3:  1    3         0         2  0  5  5
4:  2    1         0         4  5  5  0
5:  2    2         2         5 -5 -5 10
