Mutate 3 columns based on 2 conditions
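Does anyone know a more efficient way to write this code, which sets values in 3 columns to NA based on a condition involving a related column? For example, using mutate_at instead of mutate.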
Data = DATA %>%
  mutate(Temperature1 = ifelse(Temperature1 < 19 & Cyclon1 == "f", "NA", Temperature1)) %>%
  mutate(Temperature2 = ifelse(Temperature2 < 19 & Cyclon2 == "f", "NA", Temperature2)) %>%
  mutate(Temperature3 = ifelse(Temperature3 < 19 & Cyclon3 == "f", "NA", Temperature3))
Thanks in advance.
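It's not that straightforward, because you need to match Temperature1 with Cyclon1, and so on. If you want to stick with dplyr, one way out is to pivot longer first, mutate, and then pivot back to wide. For example, if your data looks like this: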
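library(dplyr)   # %>% and mutate
library(tidyr)   # pivot_longer and pivot_wider
library(tibble)  # rownames_to_column

set.seed(111)
DATA = data.frame(Temperature1 = runif(100, min = 0, max = 100),
                  Temperature2 = runif(100, min = 0, max = 100),
                  Temperature3 = runif(100, min = 0, max = 100),
                  Cyclon1 = sample(c("t", "f"), 100, replace = TRUE),
                  Cyclon2 = sample(c("t", "f"), 100, replace = TRUE),
                  Cyclon3 = sample(c("t", "f"), 100, replace = TRUE))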
Then we do this:
DATA %>%
  rownames_to_column("id") %>%
  pivot_longer(-id, names_to = c(".value", "set"), names_pattern = "([^0-9]*)([0-9])")
# A tibble: 300 x 4
  id    set   Temperature Cyclon
  <chr> <chr>       <dbl> <fct>
1 1     1            59.3 t
2 1     2            57.6 f
3 1     3            72.6 t
4 2     1            72.6 t
5 2     2            13.6 t
6 2     3            92.0 f
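To see how the names_pattern regex divides each column name, here is a minimal base R sketch. It only mimics the split with sub(); it is not how tidyr implements it, and the vector cols is an illustrative subset of the column names:

```r
# A few of the column names from the example data (illustrative subset)
cols <- c("Temperature1", "Temperature2", "Cyclon1", "Cyclon3")
pattern <- "([^0-9]*)([0-9])"

# First capture group: the non-digit prefix, which ".value" turns into a column name
prefix <- sub(pattern, "\\1", cols)
# Second capture group: a single digit, which ends up in the "set" column
set <- sub(pattern, "\\2", cols)

data.frame(cols, prefix, set)
```

Because the second group matches exactly one digit, data with ten or more sets would probably need a pattern such as "([^0-9]*)([0-9]+)".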
At this point, each row holds the Cyclon and the Temperature for one set (1-3), so all that's left is to mutate and pivot wider again:
data1 = DATA %>%
  rownames_to_column("id") %>%
  pivot_longer(-id, names_to = c(".value", "set"), names_pattern = "([^0-9]*)([0-9])") %>%
  mutate(Temperature = replace(Temperature, Temperature < 19 & Cyclon == "f", NA)) %>%
  pivot_wider(values_from = c(Temperature, Cyclon), names_from = set)
We can check the values:
head(DATA[DATA$Temperature1 < 19 & DATA$Cyclon1=="f",])
   Temperature1 Temperature2 Temperature3 Cyclon1 Cyclon2 Cyclon3
7      1.065785     64.00623     58.11568       f       t       t
10     9.368152     96.53025     53.62925       f       t       t
14     4.754785     90.39043     47.44193       f       f       f
15    15.620252     96.45305     72.74062       f       t       f
17    17.144369     54.89127     95.85764       f       t       f
31     5.859646     35.14933     44.92498       f       f       t
head(data1[DATA$Temperature1 < 19 & DATA$Cyclon1=="f",])
# A tibble: 6 x 7
  id    Temperature_1 Temperature_2 Temperature_3 Cyclon_1 Cyclon_2 Cyclon_3
  <chr>         <dbl>         <dbl>         <dbl> <fct>    <fct>    <fct>
1 7                NA          64.0          58.1 f        t        t
2 10               NA          96.5          53.6 f        t        t
3 14               NA          90.4          47.4 f        f        f
4 15               NA          96.5          72.7 f        t        f
5 17               NA          54.9          95.9 f        t        f
6 31               NA          35.1          44.9 f        f        t
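As a cross-check, the same rule can also be applied without reshaping by looping over the numeric suffixes in base R. This is a different technique from the pivot approach above, shown only as a rough sketch; the data frame toy and the names out, temp_col, cyc_col and hit are made up for illustration:

```r
# Small hand-made data set (illustrative, not the random data above)
toy <- data.frame(
  Temperature1 = c(10, 25, 15), Cyclon1 = c("f", "f", "t"),
  Temperature2 = c(30, 5, 18),  Cyclon2 = c("t", "f", "f"),
  Temperature3 = c(12, 40, 50), Cyclon3 = c("t", "t", "f"),
  stringsAsFactors = FALSE
)

out <- toy
for (i in 1:3) {
  temp_col <- paste0("Temperature", i)
  cyc_col  <- paste0("Cyclon", i)
  # Rows where this set's temperature is below 19 and its own Cyclon flag is "f"
  hit <- toy[[temp_col]] < 19 & toy[[cyc_col]] == "f"
  out[[temp_col]][hit] <- NA
}

out
```

The loop builds each pair of column names from the shared suffix, so Temperature1 is only ever compared with Cyclon1, and the Cyclon columns are left untouched.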
I made up some data:
library(dplyr)  # tibble, mutate and %>%
library(tidyr)  # gather and spread

DATA <- tibble(Record = LETTERS[1:6],
               Temperature1 = c(17:22),
               Cyclon1 = rep(c("f", "g"), 3),
               Temperature2 = c(17:22),
               Cyclon2 = rep(c("f", "g"), 3),
               Temperature3 = c(17:22),
               Cyclon3 = rep(c("f", "g"), 3))
I gathered and then mutated (because my R installation doesn't have pivot_longer yet).
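LONGDATA <- DATA %>%
  # The two gathers pair every Cyclon column with every Temperature column
  # (6 records x 3 x 3 = 54 rows).
  gather("Cyclon", "cValue", starts_with("Cyclon")) %>%
  gather("Temperature", "tValue", starts_with("Temperature")) %>%
  # Here's where the logic is.
  # Note: "NA" in quotes is a string, which is why tValue prints as <chr>.
  mutate(tValue = ifelse(tValue < 19 & cValue == "f", "NA", tValue))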
LONGDATA
# A tibble: 54 x 5
   Record Cyclon  cValue Temperature  tValue
   <chr>  <chr>   <chr>  <chr>        <chr>
 1 A      Cyclon1 f      Temperature1 NA
 2 B      Cyclon1 g      Temperature1 18
 3 C      Cyclon1 f      Temperature1 19
 4 D      Cyclon1 g      Temperature1 20
 5 E      Cyclon1 f      Temperature1 21
 6 F      Cyclon1 g      Temperature1 22
 7 A      Cyclon2 f      Temperature1 NA
 8 B      Cyclon2 g      Temperature1 18
 9 C      Cyclon2 f      Temperature1 19
10 D      Cyclon2 g      Temperature1 20
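The 54 rows above come from pairing every Cyclon column with every Temperature column. If each TemperatureN should only be compared with its own CyclonN, one option on older tidyr is to gather the two groups separately and join them on the numeric suffix. A rough sketch, assuming the same toy data; the names temps, cycs, name, set and PAIRED are illustrative and not part of the answer:

```r
library(dplyr)
library(tidyr)

# Same toy data as in the answer above
DATA <- tibble(Record = LETTERS[1:6],
               Temperature1 = 17:22, Cyclon1 = rep(c("f", "g"), 3),
               Temperature2 = 17:22, Cyclon2 = rep(c("f", "g"), 3),
               Temperature3 = 17:22, Cyclon3 = rep(c("f", "g"), 3))

# Long form of the temperatures, keyed by Record and the numeric suffix
temps <- DATA %>%
  select(Record, starts_with("Temperature")) %>%
  gather("name", "tValue", -Record) %>%
  mutate(set = sub("Temperature", "", name)) %>%
  select(-name)

# Long form of the Cyclon flags, keyed the same way
cycs <- DATA %>%
  select(Record, starts_with("Cyclon")) %>%
  gather("name", "cValue", -Record) %>%
  mutate(set = sub("Cyclon", "", name)) %>%
  select(-name)

# One row per Record and set; unquoted NA keeps tValue an integer column
PAIRED <- inner_join(temps, cycs, by = c("Record", "set")) %>%
  mutate(tValue = replace(tValue, tValue < 19 & cValue == "f", NA))

PAIRED
```

This gives 18 rows (6 records x 3 sets) instead of 54, and the condition is evaluated only on matching pairs.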
Personally, I would keep it in the LONGDATA form. But if you really want your wide layout back...
NEWDATA <- LONGDATA %>%
  spread(key = Cyclon, value = cValue) %>%
  spread(key = Temperature, value = tValue)
NEWDATA
# A tibble: 6 x 7
  Record Cyclon1 Cyclon2 Cyclon3 Temperature1 Temperature2 Temperature3
  <chr>  <chr>   <chr>   <chr>   <chr>        <chr>        <chr>
1 A      f       f       f       NA           NA           NA
2 B      g       g       g       18           18           18
3 C      f       f       f       19           19           19
4 D      g       g       g       20           20           20
5 E      f       f       f       21           21           21
6 F      g       g       g       22           22           22
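The Temperature columns above print as <chr> because ifelse() was given the string "NA" rather than the missing value NA. A small base R sketch of the difference, with made-up vectors temps and cyc:

```r
# Illustrative inputs (not from the thread)
temps <- c(17, 18, 19)
cyc   <- c("f", "g", "f")

# Quoted "NA": the result is coerced to character and nothing is actually missing
quoted <- ifelse(temps < 19 & cyc == "f", "NA", temps)
# Unquoted NA: the result stays numeric and the first element is truly missing
real <- ifelse(temps < 19 & cyc == "f", NA, temps)

class(quoted)
is.na(quoted)
class(real)
is.na(real)
```

With the unquoted form, functions such as mean(x, na.rm = TRUE) can skip the masked values, which is usually what is wanted when blanking out readings.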