r dplyr mutate_if multiple conditions
I realise this question has been asked previously, but I can't seem to get the code to work.
Here is my data:
structure(list(ph503_3 = c(-1, -1, -1, 0, -1, -1), gripstrength = c(33,
40, 26, 30, 49, 31), IPAQmetminutes = c(5196, 198, 1674, 642,
11724, 1155), tugtimesec = c(8, 7, 7, 17, 9, 8), MHcesd = c(1,
0, 1, 12, 0, 9), id = c("292221", "334262", "075822", "40642",
"274222", "245801"), age = c(58, 68, 54, 64, 52, 58), COGmmse = c(30,
27, 29, 27, 30, 29), DISconverse1 = c("None", "None", "None",
"None", "None", "None"), MDantidepressant = c("No", "No", "No",
"Yes", "No", "No"), MDantipark = c(0, 0, 0, 0, 0, 0), MDpolypharmacy = c(0,
0, 0, 1, 0, 0), W2socialclass = c("Skilled", "Semi-skilled",
"Managerial & Technical", "Non-Manual", "Skilled", "Managerial & Technical"
), bh201 = c("Would never doze", "Would never doze", "Slight chance of dozing",
"Would never doze", "High chance of dozing", "Would never doze"
), fl001_01 = c("NOT Walking 100 metres (100 yards)", "NOT Walking 100 metres (100 yards)",
"NOT Walking 100 metres (100 yards)", "Walking 100 metres (100 yards)",
"NOT Walking 100 metres (100 yards)", "NOT Walking 100 metres (100 yards)"
), fl001_02 = c("NOT Running or jogging about 1.5 kilometres (1 mile)",
"Running or jogging about 1.5 kilometres (1 mile)", "NOT Running or jogging about 1.5 kilometres (1 mile)",
"Running or jogging about 1.5 kilometres (1 mile)", "Running or jogging about 1.5 kilometres (1 mile)",
"NOT Running or jogging about 1.5 kilometres (1 mile)"), fl001_04 = c("NOT Getting up from a chair after sitting for long periods",
"NOT Getting up from a chair after sitting for long periods",
"NOT Getting up from a chair after sitting for long periods",
"Getting up from a chair after sitting for long periods", "NOT Getting up from a chair after sitting for long periods",
"NOT Getting up from a chair after sitting for long periods"),
fl001_05 = c("NOT Climbing several flights of stairs without resting",
"NOT Climbing several flights of stairs without resting",
"NOT Climbing several flights of stairs without resting",
"Climbing several flights of stairs without resting", "NOT Climbing several flights of stairs without resting",
"NOT Climbing several flights of stairs without resting"),
fl001_06 = c("NOT Climbing one flight of stairs without resting",
"NOT Climbing one flight of stairs without resting", "NOT Climbing one flight of stairs without resting",
"Climbing one flight of stairs without resting", "NOT Climbing one flight of stairs without resting",
"NOT Climbing one flight of stairs without resting"), fl001_07 = c("NOT Stooping, kneeling, or crouching",
"NOT Stooping, kneeling, or crouching", "NOT Stooping, kneeling, or crouching",
"Stooping, kneeling, or crouching", "NOT Stooping, kneeling, or crouching",
"NOT Stooping, kneeling, or crouching")), row.names = c(NA,
-6L), class = c("tbl_df", "tbl", "data.frame"))
My code:
mydata <- mydata %>%
mutate_if(class(.)=="character" & str_detect(colnames(.), "^fl\\d|^ph\\d"), ~if_else(grepl("NOT ", .), 0, 1))
The code runs, but nothing happens, and I get the following message when I knit the markdown:
Warning message:
In class(.) == "character" & str_detect(colnames(.), "^fl\\d|^ph\\d") :
longer object length is not a multiple of shorter object length
The conditional needs to return a result for multiple columns all at once (one TRUE/FALSE per column), but reading class(.) == "character" makes me believe you are checking one column at a time. The . is replaced internally with the whole frame, not individual columns/vectors. The second half of your conditional is correct, but the first is not. You can see what . holds by passing it to a function that stops in browser():
myfunc <- function(...) { browser(); TRUE; }
mydata %>% mutate_if(myfunc(.), ~ 1)
# Browse[2]>
list(...)
# [[1]]
# # A tibble: 6 x 20
# ph503_3 gripstrength IPAQmetminutes tugtimesec MHcesd id age COGmmse DISconverse1
# <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> <dbl> <chr>
# 1 -1 33 5196 8 1 2922~ 58 30 None
# 2 -1 40 198 7 0 3342~ 68 27 None
# 3 -1 26 1674 7 1 0758~ 54 29 None
# 4 0 30 642 17 12 40642 64 27 None
# 5 -1 49 11724 9 0 2742~ 52 30 None
# 6 -1 31 1155 8 9 2458~ 58 29 None
# # ... with 11 more variables: MDantidepressant <chr>, MDantipark <dbl>, MDpolypharmacy <dbl>,
# # W2socialclass <chr>, bh201 <chr>, fl001_01 <chr>, fl001_02 <chr>, fl001_04 <chr>,
# # fl001_05 <chr>, fl001_06 <chr>, fl001_07 <chr>
In that context, class(whole_data_frame) == "character" does not make sense (by itself).
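To make the warning concrete, here is a minimal base-R sketch of what that comparison does. The toy data frame is invented (only the column names come from the question), and the class vector is set by hand as an assumption to mimic a tibble without loading the tibble package.

```r
# Toy data; column names borrowed from the question, values invented.
df <- data.frame(
  ph503_3  = c(-1, 0),
  fl001_01 = c("NOT Walking 100 metres", "Walking 100 metres"),
  id       = c("292221", "40642"),
  age      = c(58, 64),
  stringsAsFactors = FALSE
)
# Assumption: mimic a tibble's class vector without loading tibble.
class(df) <- c("tbl_df", "tbl", "data.frame")

by_class <- class(df) == "character"               # length 3: one per class name
by_name  <- grepl("^fl\\d|^ph\\d", colnames(df))   # length 4: one per column

# Combining lengths 3 and 4 recycles the shorter vector and warns.
combined <- withCallingHandlers(
  by_class & by_name,
  warning = function(w) {
    message("warning: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

any(combined)  # no column is selected, so mutate_if has nothing to change
```

With the real data the lengths are 3 and 20, which is the same mismatch. Since none of the class names equals "character", every element of the combined condition is FALSE, which would explain why the code runs but nothing happens.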
You can look for character columns using sapply(., is.character) (or one of purrr's equivalents):
mydata %>%
mutate_if(sapply(., is.character) &
stringr::str_detect(colnames(.), "^fl\\d|^ph\\d"),
~ +(!grepl("NOT ", .))) %>%
str(.)
# Classes 'tbl_df', 'tbl' and 'data.frame': 6 obs. of 20 variables:
# $ ph503_3 : num -1 -1 -1 0 -1 -1
# $ gripstrength : num 33 40 26 30 49 31
# $ IPAQmetminutes : num 5196 198 1674 642 11724 ...
# $ tugtimesec : num 8 7 7 17 9 8
# $ MHcesd : num 1 0 1 12 0 9
# $ id : chr "292221" "334262" "075822" "40642" ...
# $ age : num 58 68 54 64 52 58
# $ COGmmse : num 30 27 29 27 30 29
# $ DISconverse1 : chr "None" "None" "None" "None" ...
# $ MDantidepressant: chr "No" "No" "No" "Yes" ...
# $ MDantipark : num 0 0 0 0 0 0
# $ MDpolypharmacy : num 0 0 0 1 0 0
# $ W2socialclass : chr "Skilled" "Semi-skilled" "Managerial & Technical" "Non-Manual" ...
# $ bh201 : chr "Would never doze" "Would never doze" "Slight chance of dozing" "Would never doze" ...
# $ fl001_01 : int 0 0 0 1 0 0
# $ fl001_02 : int 0 1 0 1 1 0
# $ fl001_04 : int 0 0 0 1 0 0
# $ fl001_05 : int 0 0 0 1 0 0
# $ fl001_06 : int 0 0 0 1 0 0
# $ fl001_07 : int 0 0 0 1 0 0
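The same per-column predicate can be sketched in base R, without dplyr, to show that it is simply a logical vector with one entry per column. This is only an illustration under assumptions: the toy data frame and the helper names is_chr, name_ok and target are made up here, and base grepl stands in for stringr::str_detect.

```r
# Toy data; column names borrowed from the question, values shortened.
df <- data.frame(
  ph503_3  = c(-1, 0, -1),
  id       = c("292221", "40642", "274222"),
  fl001_01 = c("NOT Walking", "Walking", "NOT Walking"),
  fl001_02 = c("NOT Running", "Running", "Running"),
  stringsAsFactors = FALSE
)

is_chr  <- sapply(df, is.character)            # one TRUE/FALSE per column
name_ok <- grepl("^fl\\d|^ph\\d", names(df))   # one TRUE/FALSE per column
target  <- is_chr & name_ok                    # same length, so no recycling

# Recode only the selected columns: 0 if the text contains "NOT ", else 1.
df[target] <- lapply(df[target], function(x) +(!grepl("NOT ", x)))
str(df)
```

Note that ph503_3 matches the name pattern but is numeric, so it is left alone, and id is character but does not match the pattern, so it is left alone too.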
(I shortened your if_else(grepl("NOT ", .), 0, 1) to be just +(!grepl("NOT ", .)), a little for code-golf, a little because I think using ifelse/if_else there is a little more than necessary. It's not wrong, and if your future needs are a little more complex than just 0/1, then if_else is certainly good. My trick of +(...) is a way to quickly convert logical to integer; try +TRUE.)
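A small sketch comparing the two spellings on an invented vector. Base ifelse is used here as a stand-in for dplyr's if_else so that the snippet needs no packages; the variable names are hypothetical.

```r
x <- c("NOT Walking", "Walking", "NOT Walking")  # invented example values

via_plus   <- +(!grepl("NOT ", x))          # unary plus turns logical into integer
via_ifelse <- ifelse(grepl("NOT ", x), 0, 1)  # base stand-in for if_else(..., 0, 1)

typeof(via_plus)     # "integer"
typeof(via_ifelse)   # "double"
all(via_plus == via_ifelse)  # same values, different storage type
```

The values agree; the only difference is that +(...) yields integers while the 0/1 literals yield doubles, which matches the int columns in the str() output above.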