Mutate 3 columns based on 2 conditions

Question

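Does anyone know a more efficient way to run this code, which sets values to NA in 3 columns based on a condition tied to a specific companion column? For example, using mutate_at instead of mutate.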

Data = DATA %>%
  # use NA rather than the string "NA" so the columns stay numeric
  mutate(Temperature1 = ifelse(Temperature1 < 19 & Cyclon1 == "f", NA, Temperature1)) %>%
  mutate(Temperature2 = ifelse(Temperature2 < 19 & Cyclon2 == "f", NA, Temperature2)) %>%
  mutate(Temperature3 = ifelse(Temperature3 < 19 & Cyclon3 == "f", NA, Temperature3))

Thanks in advance.

Answer 1

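This is not that straightforward, because you need to match Temperature1 with Cyclon1, Temperature2 with Cyclon2, and so on. If you want to stick with dplyr, one way out is to pivot to long format first, mutate, and then pivot back to wide. For example, if your data looks like this: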

library(dplyr)
library(tidyr)
library(tibble)  # for rownames_to_column()

set.seed(111)
DATA = data.frame(Temperature1 = runif(100, min = 0, max = 100),
                  Temperature2 = runif(100, min = 0, max = 100),
                  Temperature3 = runif(100, min = 0, max = 100),
                  Cyclon1 = sample(c("t", "f"), 100, replace = TRUE),
                  Cyclon2 = sample(c("t", "f"), 100, replace = TRUE),
                  Cyclon3 = sample(c("t", "f"), 100, replace = TRUE))

Then we do this:

DATA %>%
  rownames_to_column("id") %>%
  pivot_longer(-id, names_to = c(".value", "set"), names_pattern = "([^0-9]*)([0-9])")

# A tibble: 300 x 4
   id    set   Temperature Cyclon
   <chr> <chr>       <dbl> <fct> 
 1 1     1            59.3 t     
 2 1     2            57.6 f     
 3 1     3            72.6 t     
 4 2     1            72.6 t     
 5 2     2            13.6 t     
 6 2     3            92.0 f
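
The names_pattern regular expression is what splits each column name into a variable part (which becomes the new column via ".value") and a set number. A minimal base R sketch of that split, using sub() on a few of the column names; the names cols, value_part and set_part are introduced here only for illustration:

```r
# A few of the wide column names from the example data
cols <- c("Temperature1", "Temperature2", "Cyclon3")

# Same pattern as in pivot_longer(): group 1 = non-digits, group 2 = one digit
pattern <- "([^0-9]*)([0-9])"

value_part <- sub(pattern, "\\1", cols)  # becomes the new column name
set_part   <- sub(pattern, "\\2", cols)  # becomes the value of "set"

data.frame(column = cols, value = value_part, set = set_part)
```

Because the second group matches a single digit, this pattern assumes there are at most nine sets.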

At this step, for each set (1-3) you have a corresponding Cyclon and Temperature in the same row. All that remains is to mutate and pivot back to wide:

data1 = DATA %>%
  rownames_to_column("id") %>%
  pivot_longer(-id, names_to = c(".value", "set"), names_pattern = "([^0-9]*)([0-9])") %>%
  mutate(Temperature = replace(Temperature, Temperature < 19 & Cyclon == "f", NA)) %>%
  pivot_wider(values_from = c(Temperature, Cyclon), names_from = set)

We can check the values:

head(DATA[DATA$Temperature1 < 19 & DATA$Cyclon1=="f",])
   Temperature1 Temperature2 Temperature3 Cyclon1 Cyclon2 Cyclon3
7      1.065785     64.00623     58.11568       f       t       t
10     9.368152     96.53025     53.62925       f       t       t
14     4.754785     90.39043     47.44193       f       f       f
15    15.620252     96.45305     72.74062       f       t       f
17    17.144369     54.89127     95.85764       f       t       f
31     5.859646     35.14933     44.92498       f       f       t

head(data1[DATA$Temperature1 < 19 & DATA$Cyclon1=="f",])
# A tibble: 6 x 7
  id    Temperature_1 Temperature_2 Temperature_3 Cyclon_1 Cyclon_2 Cyclon_3
  <chr>         <dbl>         <dbl>         <dbl> <fct>    <fct>    <fct>   
1 7                NA          64.0          58.1 f        t        t       
2 10               NA          96.5          53.6 f        t        t       
3 14               NA          90.4          47.4 f        f        f       
4 15               NA          96.5          72.7 f        t        f       
5 17               NA          54.9          95.9 f        t        f       
6 31               NA          35.1          44.9 f        f        t
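
For comparison, the same Temperature/Cyclon pairing can be done without reshaping if you are willing to step outside dplyr. The following is only a sketch in base R, assuming the columns are named Temperature1..3 and Cyclon1..3 as above; data2, temp_col, cyc_col and hit are names introduced here:

```r
set.seed(111)
DATA <- data.frame(
  Temperature1 = runif(100, min = 0, max = 100),
  Temperature2 = runif(100, min = 0, max = 100),
  Temperature3 = runif(100, min = 0, max = 100),
  Cyclon1 = sample(c("t", "f"), 100, replace = TRUE),
  Cyclon2 = sample(c("t", "f"), 100, replace = TRUE),
  Cyclon3 = sample(c("t", "f"), 100, replace = TRUE)
)

data2 <- DATA
for (i in 1:3) {
  # Build the matching pair of column names for set i
  temp_col <- paste0("Temperature", i)
  cyc_col  <- paste0("Cyclon", i)

  # Rows where the temperature is below 19 and the paired Cyclon is "f"
  hit <- data2[[temp_col]] < 19 & data2[[cyc_col]] == "f"
  data2[[temp_col]][hit] <- NA
}

# Cross-check the first column against the ifelse() logic from the question
expected <- ifelse(DATA$Temperature1 < 19 & DATA$Cyclon1 == "f", NA, DATA$Temperature1)
identical(data2$Temperature1, expected)
```

This keeps the original column names and row order, at the cost of an explicit loop over the set numbers.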

Answer 2

I assumed some data:

DATA <- tibble(Record = LETTERS[1:6],
                Temperature1 = c(17:22),
                Cyclon1 = rep(c("f", "g"), 3),
                Temperature2 = c(17:22),
                Cyclon2 = rep(c("f", "g"), 3),
                Temperature3 = c(17:22),
                Cyclon3 = rep(c("f", "g"), 3))

I gathered and then mutated (because my R installation does not have pivot_longer yet):

LONGDATA <- DATA %>% 
  gather("Cyclon", "cValue", starts_with("Cyclon")) %>% 
  gather("Temperature", "tValue", starts_with("Temperature")) %>% 
    # Here's where the logic is.
  mutate(tValue = ifelse(tValue < 19 & cValue == "f", "NA", tValue ))

LONGDATA
# A tibble: 54 x 5
   Record Cyclon  cValue Temperature  tValue
   <chr>  <chr>   <chr>  <chr>        <chr> 
 1 A      Cyclon1 f      Temperature1 NA    
 2 B      Cyclon1 g      Temperature1 18    
 3 C      Cyclon1 f      Temperature1 19    
 4 D      Cyclon1 g      Temperature1 20    
 5 E      Cyclon1 f      Temperature1 21    
 6 F      Cyclon1 g      Temperature1 22    
 7 A      Cyclon2 f      Temperature1 NA    
 8 B      Cyclon2 g      Temperature1 18    
 9 C      Cyclon2 f      Temperature1 19    
10 D      Cyclon2 g      Temperature1 20
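
Note that two consecutive gather() calls pair every Cyclon column with every Temperature column, which is why the result has 54 rows (6 records x 3 x 3) and why rows 7-10 above combine Cyclon2 with Temperature1. The sample data hides this because all three column pairs are identical. A sketch of one way to keep only the matching pairs, assuming the column names end in the set number (LONG and matched are names introduced here):

```r
library(dplyr)
library(tidyr)

DATA <- tibble(Record = LETTERS[1:6],
               Temperature1 = c(17:22),
               Cyclon1 = rep(c("f", "g"), 3),
               Temperature2 = c(17:22),
               Cyclon2 = rep(c("f", "g"), 3),
               Temperature3 = c(17:22),
               Cyclon3 = rep(c("f", "g"), 3))

# Same double gather as above, before any mutate
LONG <- DATA %>%
  gather("Cyclon", "cValue", starts_with("Cyclon")) %>%
  gather("Temperature", "tValue", starts_with("Temperature"))

nrow(LONG)  # 54: every Cyclon column crossed with every Temperature column

# Keep only rows where the set number of both columns agrees
matched <- LONG %>%
  filter(sub("Cyclon", "", Cyclon) == sub("Temperature", "", Temperature))

nrow(matched)  # 18: one row per record and set
```

With real data where the three pairs differ, applying the condition to the filtered rows avoids testing, say, Temperature1 against Cyclon2.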

Personally, I would keep it in the LONGDATA form. But if you really want your wide format back...

NEWDATA <- LONGDATA %>% 
  spread(key = Cyclon, value = cValue) %>% 
  spread(key = Temperature, value = tValue)

 NEWDATA
# A tibble: 6 x 7
  Record Cyclon1 Cyclon2 Cyclon3 Temperature1 Temperature2 Temperature3
  <chr>  <chr>   <chr>   <chr>   <chr>        <chr>        <chr>       
1 A      f       f       f       NA           NA           NA          
2 B      g       g       g       18           18           18          
3 C      f       f       f       19           19           19          
4 D      g       g       g       20           20           20          
5 E      f       f       f       21           21           21          
6 F      g       g       g       22           22           22
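
The tValue and Temperature columns print as <chr> because ifelse() was given the string "NA" rather than the missing value NA, which turns the whole column into text. A small sketch of the difference, assuming an integer vector like the sample temperatures (x, as_text and as_missing are names introduced here):

```r
x <- 17:22

as_text    <- ifelse(x < 19, "NA", x)  # string "NA" forces character
as_missing <- ifelse(x < 19, NA, x)    # real missing value keeps integer

class(as_text)          # "character"
class(as_missing)       # "integer"
sum(is.na(as_text))     # 0, the text "NA" is not a missing value
sum(is.na(as_missing))  # 2
```

Using NA (or replace(), as in Answer 1) keeps the column numeric, so functions such as is.na() and mean(na.rm = TRUE) behave as expected.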

Answer 1
3 Accepted 2020-05-16 16:34:29

Answer 2
1 2020-05-16 16:48:17
