
Mutate 3 columns based on 2 conditions

Does anyone know a more efficient way to run this code, which turns values in 3 columns into NA depending on a condition tied to a specific companion column? For example, with mutate_at instead of mutate.

Data = DATA %>%
  # use a real NA (not the string "NA") so the Temperature columns stay numeric
  mutate(Temperature1 = ifelse(Temperature1 < 19 & Cyclon1 == "f", NA, Temperature1)) %>%
  mutate(Temperature2 = ifelse(Temperature2 < 19 & Cyclon2 == "f", NA, Temperature2)) %>%
  mutate(Temperature3 = ifelse(Temperature3 < 19 & Cyclon3 == "f", NA, Temperature3))

Thanks in advance

It's not so straightforward, because you need to match Temperature1 with Cyclon1, Temperature2 with Cyclon2, and so on. If you want to stick to dplyr (plus tidyr for the pivots), the way out is to pivot longer first, mutate, and pivot back. For example, if your data is like this:

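library(dplyr)   # %>% and mutate
library(tidyr)   # pivot_longer, pivot_wider
library(tibble)  # rownames_to_column

set.seed(111)
DATA = data.frame(Temperature1 = runif(100, min = 0, max = 100),
                  Temperature2 = runif(100, min = 0, max = 100),
                  Temperature3 = runif(100, min = 0, max = 100),
                  Cyclon1 = sample(c("t", "f"), 100, replace = TRUE),
                  Cyclon2 = sample(c("t", "f"), 100, replace = TRUE),
                  Cyclon3 = sample(c("t", "f"), 100, replace = TRUE))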

Then we do:

DATA %>%  rownames_to_column("id") %>% 
pivot_longer(-id,names_to=c(".value","set"),names_pattern="([^0-9]*)([0-9])")

# A tibble: 300 x 4
   id    set   Temperature Cyclon
   <chr> <chr>       <dbl> <fct> 
 1 1     1            59.3 t     
 2 1     2            57.6 f     
 3 1     3            72.6 t     
 4 2     1            72.6 t     
 5 2     2            13.6 t     
 6 2     3            92.0 f  
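
How names_pattern splits the column names may be easier to see on a tiny example. The sketch below uses a hypothetical two-row data frame called toy (its values are made up for illustration, and it has only two column sets). The first regex group feeds ".value", so it becomes the name of an output column, and the second group ends up in the set column:

```r
library(tidyr)

# Hypothetical toy data: two Temperature/Cyclon pairs per row
toy <- data.frame(
  id = c("1", "2"),
  Temperature1 = c(10, 30),
  Temperature2 = c(25, 5),
  Cyclon1 = c("f", "t"),
  Cyclon2 = c("t", "f"),
  stringsAsFactors = FALSE
)

# "Temperature1" is split into "Temperature" (.value) and "1" (set)
long <- pivot_longer(
  toy, -id,
  names_to = c(".value", "set"),
  names_pattern = "([^0-9]*)([0-9])"
)

names(long)  # "id" "set" "Temperature" "Cyclon"
nrow(long)   # 4: one row per id and set
```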

At this step, every id has one row per set (1-3) holding the matching Cyclon and Temperature. What remains is to mutate and pivot wide again:

data1 = DATA %>%  rownames_to_column("id") %>% 
pivot_longer(-id,names_to=c(".value","set"),names_pattern="([^0-9]*)([0-9])") %>%   
mutate(Temperature=replace(Temperature,Temperature < 19 & Cyclon== "f",NA)) %>%
pivot_wider(values_from=c(Temperature,Cyclon),names_from=set)

We can check the values:

head(DATA[DATA$Temperature1 < 19 & DATA$Cyclon1=="f",])
   Temperature1 Temperature2 Temperature3 Cyclon1 Cyclon2 Cyclon3
7      1.065785     64.00623     58.11568       f       t       t
10     9.368152     96.53025     53.62925       f       t       t
14     4.754785     90.39043     47.44193       f       f       f
15    15.620252     96.45305     72.74062       f       t       f
17    17.144369     54.89127     95.85764       f       t       f
31     5.859646     35.14933     44.92498       f       f       t

head(data1[DATA$Temperature1 < 19 & DATA$Cyclon1=="f",])
# A tibble: 6 x 7
  id    Temperature_1 Temperature_2 Temperature_3 Cyclon_1 Cyclon_2 Cyclon_3
  <chr>         <dbl>         <dbl>         <dbl> <fct>    <fct>    <fct>   
1 7                NA          64.0          58.1 f        t        t       
2 10               NA          96.5          53.6 f        t        t       
3 14               NA          90.4          47.4 f        f        f       
4 15               NA          96.5          72.7 f        t        f       
5 17               NA          54.9          95.9 f        t        f       
6 31               NA          35.1          44.9 f        f        t       
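
If reshaping feels heavy for three columns, a plain loop over the suffixes is another option. This is a base R sketch, not the approach from the answer above; toy and its values are assumptions for illustration, and it relies on the columns being named Temperature<i> and Cyclon<i>:

```r
# Hypothetical toy data with the same column layout as DATA
toy <- data.frame(
  Temperature1 = c(10, 25, 5),
  Temperature2 = c(30, 12, 18),
  Temperature3 = c(18, 18, 40),
  Cyclon1 = c("f", "f", "t"),
  Cyclon2 = c("f", "f", "f"),
  Cyclon3 = c("t", "f", "f"),
  stringsAsFactors = FALSE
)

result <- toy
for (i in 1:3) {
  temp_col <- paste0("Temperature", i)
  cyc_col  <- paste0("Cyclon", i)
  # which() drops any NA in the condition, so only definite matches are replaced
  hit <- which(result[[temp_col]] < 19 & result[[cyc_col]] == "f")
  result[[temp_col]][hit] <- NA
}

result$Temperature1  # NA 25  5
result$Temperature2  # 30 NA NA
result$Temperature3  # 18 NA 40
```

The loop keeps the wide layout and the original column names, at the cost of hard-coding the naming scheme.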

I assumed some data:

DATA <- tibble(Record = LETTERS[1:6],
                Temperature1 = c(17:22),
                Cyclon1 = rep(c("f", "g"), 3),
                Temperature2 = c(17:22),
                Cyclon2 = rep(c("f", "g"), 3),
                Temperature3 = c(17:22),
                Cyclon3 = rep(c("f", "g"), 3))

I gathered, then mutated (because my R installation doesn't have pivot_longer yet):

LONGDATA <- DATA %>% 
  gather("Cyclon", "cValue", starts_with("Cyclon")) %>% 
  gather("Temperature", "tValue", starts_with("Temperature")) %>% 
    # Here's where the logic is.
  mutate(tValue = ifelse(tValue < 19 & cValue == "f", "NA", tValue ))

LONGDATA
# A tibble: 54 x 5
   Record Cyclon  cValue Temperature  tValue
   <chr>  <chr>   <chr>  <chr>        <chr> 
 1 A      Cyclon1 f      Temperature1 NA    
 2 B      Cyclon1 g      Temperature1 18    
 3 C      Cyclon1 f      Temperature1 19    
 4 D      Cyclon1 g      Temperature1 20    
 5 E      Cyclon1 f      Temperature1 21    
 6 F      Cyclon1 g      Temperature1 22    
 7 A      Cyclon2 f      Temperature1 NA    
 8 B      Cyclon2 g      Temperature1 18    
 9 C      Cyclon2 f      Temperature1 19    
10 D      Cyclon2 g      Temperature1 20   
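
The tValue column prints as <chr> because ifelse() in the mutate step was given the string "NA", which turns the whole result into character and leaves a literal "NA" that is.na() does not recognise. A minimal sketch of the difference, using made-up vectors temps and flag:

```r
temps <- c(17, 18, 19)
flag  <- c("f", "g", "f")

# String "NA": the result is coerced to character
as_string <- ifelse(temps < 19 & flag == "f", "NA", temps)

# Real NA: the result stays numeric
as_missing <- ifelse(temps < 19 & flag == "f", NA, temps)

class(as_string)   # "character"
is.na(as_string)   # FALSE FALSE FALSE
class(as_missing)  # "numeric"
is.na(as_missing)  # TRUE FALSE FALSE
```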

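Personally I'd leave it in LONGDATA form. Note, though, that the two gather calls pair every Cyclon column with every Temperature column (hence 54 rows rather than 18), which is harmless here only because the three column sets are identical. If you really want your wide style back: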

NEWDATA <- LONGDATA %>% 
  spread(key = Cyclon, value = cValue) %>% 
  spread(key = Temperature, value = tValue)

NEWDATA
# A tibble: 6 x 7
  Record Cyclon1 Cyclon2 Cyclon3 Temperature1 Temperature2 Temperature3
  <chr>  <chr>   <chr>   <chr>   <chr>        <chr>        <chr>       
1 A      f       f       f       NA           NA           NA          
2 B      g       g       g       18           18           18          
3 C      f       f       f       19           19           19          
4 D      g       g       g       20           20           20          
5 E      f       f       f       21           21           21          
6 F      g       g       g       22           22           22          


 