Mutate 3 columns based on 2 conditions

Question

Does anyone know a more efficient way to write this code, which sets values in 3 columns to NA depending on a condition tied to a specific companion column? For example, with mutate_at instead of mutate.

Data = DATA %>%  
  mutate(Temperature1 = ifelse(Temperature1 < 19 & Cyclon1== "f","NA",Temperature1 )) %>% 
  mutate(Temperature2 = ifelse(Temperature2 < 19 & Cyclon2== "f","NA",Temperature2 )) %>%
  mutate(Temperature3 = ifelse(Temperature3 < 19 & Cyclon3== "f","NA",Temperature3 ))

Thanks in advance
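
A side note on the snippet above, as a minimal sketch with made-up values: the quoted "NA" is an ordinary string, so ifelse() returns a character vector and is.na() does not flag those entries, while an unquoted NA keeps the result numeric. The names temp and cyc below are hypothetical and stand in for one Temperature/Cyclon pair.

```r
# Hypothetical values, for illustration only
temp <- c(10, 25, 15)
cyc  <- c("f", "f", "t")

# Quoted "NA" is a string: the whole result becomes character
as_string <- ifelse(temp < 19 & cyc == "f", "NA", temp)

# Unquoted NA is a real missing value: the result stays numeric
as_missing <- ifelse(temp < 19 & cyc == "f", NA, temp)

class(as_string)
class(as_missing)
is.na(as_string)
is.na(as_missing)
```

If the goal is a true missing value that functions such as mean(..., na.rm = TRUE) can skip, the unquoted form is probably the one wanted.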

Answer 1

It's not so straightforward because you need to match Temperature1 with Cyclon1, and so on. If you want to stick to dplyr, the way out is to pivot longer first, mutate, and pivot back. For example, if your data is like this:

library(dplyr)
library(tidyr)
library(tibble)  # for rownames_to_column()

set.seed(111)
DATA = data.frame(Temperature1=runif(100,min=0,max=100),
Temperature2=runif(100,min=0,max=100),
Temperature3=runif(100,min=0,max=100),
Cyclon1 = sample(c("t","f"),100,replace=TRUE),
Cyclon2 = sample(c("t","f"),100,replace=TRUE),
Cyclon3 = sample(c("t","f"),100,replace=TRUE))

Then we do:

DATA %>%  rownames_to_column("id") %>% 
pivot_longer(-id,names_to=c(".value","set"),names_pattern="([^0-9]*)([0-9])")

# A tibble: 300 x 4
   id    set   Temperature Cyclon
   <chr> <chr>       <dbl> <fct> 
 1 1     1            59.3 t     
 2 1     2            57.6 f     
 3 1     3            72.6 t     
 4 2     1            72.6 t     
 5 2     2            13.6 t     
 6 2     3            92.0 f
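
To see how the names_pattern regular expression splits each column name, here is a small sketch in base R. It uses sub() only to mimic the two capture groups; it is not how pivot_longer() is implemented, and the vector cols is just a sample of the column names above.

```r
# Column names as in the example data
cols <- c("Temperature1", "Cyclon2", "Temperature3")
pattern <- "([^0-9]*)([0-9])"

# First capture group: the non-digit prefix, which names the output column (".value")
value_part <- sub(pattern, "\\1", cols)

# Second capture group: the single digit, which fills the "set" column
set_part <- sub(pattern, "\\2", cols)

data.frame(cols, value_part, set_part)
```

Because the second group matches exactly one digit, this pattern presumably needs adjusting (for example to [0-9]+) if there were ten or more column pairs.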

At this step, for every set (1-3) you have a corresponding Cyclon and Temperature. What remains is to mutate and pivot wide again:

data1 = DATA %>%  rownames_to_column("id") %>% 
pivot_longer(-id,names_to=c(".value","set"),names_pattern="([^0-9]*)([0-9])") %>%   
mutate(Temperature=replace(Temperature,Temperature < 19 & Cyclon== "f",NA)) %>%
pivot_wider(values_from=c(Temperature,Cyclon),names_from=set)

We can check the values:

head(DATA[DATA$Temperature1 < 19 & DATA$Cyclon1=="f",])
   Temperature1 Temperature2 Temperature3 Cyclon1 Cyclon2 Cyclon3
7      1.065785     64.00623     58.11568       f       t       t
10     9.368152     96.53025     53.62925       f       t       t
14     4.754785     90.39043     47.44193       f       f       f
15    15.620252     96.45305     72.74062       f       t       f
17    17.144369     54.89127     95.85764       f       t       f
31     5.859646     35.14933     44.92498       f       f       t

head(data1[DATA$Temperature1 < 19 & DATA$Cyclon1=="f",])
# A tibble: 6 x 7
  id    Temperature_1 Temperature_2 Temperature_3 Cyclon_1 Cyclon_2 Cyclon_3
  <chr>         <dbl>         <dbl>         <dbl> <fct>    <fct>    <fct>   
1 7                NA          64.0          58.1 f        t        t       
2 10               NA          96.5          53.6 f        t        t       
3 14               NA          90.4          47.4 f        f        f       
4 15               NA          96.5          72.7 f        t        f       
5 17               NA          54.9          95.9 f        t        f       
6 31               NA          35.1          44.9 f        f        t
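
As a cross-check on the pivot approach, the same rule can be sketched in base R with a loop over the column suffixes. This is an alternative, not the method used above, and the small data frame is hypothetical; it only mirrors the Temperature1-3 / Cyclon1-3 layout from the question.

```r
# Small hypothetical data frame with the same column layout as the question
DATA <- data.frame(
  Temperature1 = c(10, 30, 15),
  Temperature2 = c(18, 12, 40),
  Temperature3 = c(50, 5, 17),
  Cyclon1 = c("f", "f", "t"),
  Cyclon2 = c("t", "f", "f"),
  Cyclon3 = c("f", "f", "f"),
  stringsAsFactors = FALSE
)

result <- DATA
for (i in 1:3) {
  t_col <- paste0("Temperature", i)
  c_col <- paste0("Cyclon", i)
  # Rows where the paired columns meet both conditions
  hit <- result[[t_col]] < 19 & result[[c_col]] == "f"
  result[[t_col]][hit] <- NA
}

result
```

The loop keeps the wide layout and the original column names, which may be convenient when the reshaped names (Temperature_1 and so on) are not wanted.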

Answer 2

I assumed some data:

DATA <- tibble(Record = LETTERS[1:6],
                Temperature1 = c(17:22),
                Cyclon1 = rep(c("f", "g"), 3),
                Temperature2 = c(17:22),
                Cyclon2 = rep(c("f", "g"), 3),
                Temperature3 = c(17:22),
                Cyclon3 = rep(c("f", "g"), 3))

I gathered, then mutated (because my R installation doesn't have pivot_longer yet):

LONGDATA <- DATA %>% 
  gather("Cyclon", "cValue", starts_with("Cyclon")) %>% 
  gather("Temperature", "tValue", starts_with("Temperature")) %>% 
    # Here's where the logic is.
  mutate(tValue = ifelse(tValue < 19 & cValue == "f", "NA", tValue ))

LONGDATA
# A tibble: 54 x 5
   Record Cyclon  cValue Temperature  tValue
   <chr>  <chr>   <chr>  <chr>        <chr> 
 1 A      Cyclon1 f      Temperature1 NA    
 2 B      Cyclon1 g      Temperature1 18    
 3 C      Cyclon1 f      Temperature1 19    
 4 D      Cyclon1 g      Temperature1 20    
 5 E      Cyclon1 f      Temperature1 21    
 6 F      Cyclon1 g      Temperature1 22    
 7 A      Cyclon2 f      Temperature1 NA    
 8 B      Cyclon2 g      Temperature1 18    
 9 C      Cyclon2 f      Temperature1 19    
10 D      Cyclon2 g      Temperature1 20
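
Note that the 54 rows are 6 records × 3 Cyclon columns × 3 Temperature columns: two successive gather() calls pair every Cyclon column with every Temperature column (row 7 above pairs Cyclon2 with Temperature1). If only same-numbered pairs are wanted, one possible sketch is to filter on the trailing digit before mutating. The name PAIRED is hypothetical, and the unquoted NA is a deliberate change from the code above so that tValue stays an integer column.

```r
library(dplyr)
library(tidyr)

DATA <- tibble(Record = LETTERS[1:6],
               Temperature1 = 17:22, Cyclon1 = rep(c("f", "g"), 3),
               Temperature2 = 17:22, Cyclon2 = rep(c("f", "g"), 3),
               Temperature3 = 17:22, Cyclon3 = rep(c("f", "g"), 3))

PAIRED <- DATA %>%
  gather("Cyclon", "cValue", starts_with("Cyclon")) %>%
  gather("Temperature", "tValue", starts_with("Temperature")) %>%
  # Keep only rows whose trailing digit matches, e.g. Cyclon1 with Temperature1
  filter(sub("Cyclon", "", Cyclon) == sub("Temperature", "", Temperature)) %>%
  # Unquoted NA keeps tValue as an integer column
  mutate(tValue = replace(tValue, tValue < 19 & cValue == "f", NA))

nrow(PAIRED)
```

With the assumed data every Cyclon column is identical, so the unfiltered version happens to give the same answer; the filter only matters when the Cyclon columns differ.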

Personally I'd leave it in LONGDATA form. But if you really want your wide style back...

NEWDATA <- LONGDATA %>% 
  spread(key = Cyclon, value = cValue) %>% 
  spread(key = Temperature, value = tValue)

 NEWDATA
# A tibble: 6 x 7
  Record Cyclon1 Cyclon2 Cyclon3 Temperature1 Temperature2 Temperature3
  <chr>  <chr>   <chr>   <chr>   <chr>        <chr>        <chr>       
1 A      f       f       f       NA           NA           NA          
2 B      g       g       g       18           18           18          
3 C      f       f       f       19           19           19          
4 D      g       g       g       20           20           20          
5 E      f       f       f       21           21           21          
6 F      g       g       g       22           22           22


Answer 1: 3 (accepted), 2020-05-16 16:34:29

Answer 2: 1, 2020-05-16 16:48:17
