
r dplyr mutate_if multiple conditions

I realise this question has been asked previously, but I can't seem to get the code to work.

Here is my data:

structure(list(ph503_3 = c(-1, -1, -1, 0, -1, -1), gripstrength = c(33, 
40, 26, 30, 49, 31), IPAQmetminutes = c(5196, 198, 1674, 642, 
11724, 1155), tugtimesec = c(8, 7, 7, 17, 9, 8), MHcesd = c(1, 
0, 1, 12, 0, 9), id = c("292221", "334262", "075822", "40642", 
"274222", "245801"), age = c(58, 68, 54, 64, 52, 58), COGmmse = c(30, 
27, 29, 27, 30, 29), DISconverse1 = c("None", "None", "None", 
"None", "None", "None"), MDantidepressant = c("No", "No", "No", 
"Yes", "No", "No"), MDantipark = c(0, 0, 0, 0, 0, 0), MDpolypharmacy = c(0, 
0, 0, 1, 0, 0), W2socialclass = c("Skilled", "Semi-skilled", 
"Managerial & Technical", "Non-Manual", "Skilled", "Managerial & Technical"
), bh201 = c("Would never doze", "Would never doze", "Slight chance of dozing", 
"Would never doze", "High chance of dozing", "Would never doze"
), fl001_01 = c("NOT Walking 100 metres (100 yards)", "NOT Walking 100 metres (100 yards)", 
"NOT Walking 100 metres (100 yards)", "Walking 100 metres (100 yards)", 
"NOT Walking 100 metres (100 yards)", "NOT Walking 100 metres (100 yards)"
), fl001_02 = c("NOT Running or jogging about 1.5 kilometres (1 mile)", 
"Running or jogging about 1.5 kilometres (1 mile)", "NOT Running or jogging about 1.5 kilometres (1 mile)", 
"Running or jogging about 1.5 kilometres (1 mile)", "Running or jogging about 1.5 kilometres (1 mile)", 
"NOT Running or jogging about 1.5 kilometres (1 mile)"), fl001_04 = c("NOT Getting up from a chair after sitting for long periods", 
"NOT Getting up from a chair after sitting for long periods", 
"NOT Getting up from a chair after sitting for long periods", 
"Getting up from a chair after sitting for long periods", "NOT Getting up from a chair after sitting for long periods", 
"NOT Getting up from a chair after sitting for long periods"), 
    fl001_05 = c("NOT Climbing several flights of stairs without resting", 
    "NOT Climbing several flights of stairs without resting", 
    "NOT Climbing several flights of stairs without resting", 
    "Climbing several flights of stairs without resting", "NOT Climbing several flights of stairs without resting", 
    "NOT Climbing several flights of stairs without resting"), 
    fl001_06 = c("NOT Climbing one flight of stairs without resting", 
    "NOT Climbing one flight of stairs without resting", "NOT Climbing one flight of stairs without resting", 
    "Climbing one flight of stairs without resting", "NOT Climbing one flight of stairs without resting", 
    "NOT Climbing one flight of stairs without resting"), fl001_07 = c("NOT Stooping, kneeling, or crouching", 
    "NOT Stooping, kneeling, or crouching", "NOT Stooping, kneeling, or crouching", 
    "Stooping, kneeling, or crouching", "NOT Stooping, kneeling, or crouching", 
    "NOT Stooping, kneeling, or crouching")), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

My code:

mydata <- mydata %>% 
  mutate_if(class(.)=="character" & str_detect(colnames(.), "^fl\\d|^ph\\d"), ~if_else(grepl("NOT ", .), 0, 1)) 

The code runs, but nothing happens, and I get the following message when I knit the markdown:

Warning message:
In class(.) == "character" & str_detect(colnames(.), "^fl\\d|^ph\\d") :
  longer object length is not a multiple of shorter object length

The conditional needs to return one value per column, for all columns at once, but reading class(.) == "character" makes me believe you expect it to be checked one column at a time. Inside mutate_if(), the . is replaced with the whole frame, not with individual columns/vectors. The second half of your conditional is correct, but the first half is not, as a quick look in the debugger shows:

myfunc <- function(...) { browser(); TRUE; }
mydata %>% mutate_if(myfunc(.), ~ 1)
# Browse[2]>
list(...)
# [[1]]
# # A tibble: 6 x 20
#   ph503_3 gripstrength IPAQmetminutes tugtimesec MHcesd id      age COGmmse DISconverse1
#     <dbl>        <dbl>          <dbl>      <dbl>  <dbl> <chr> <dbl>   <dbl> <chr>       
# 1      -1           33           5196          8      1 2922~    58      30 None        
# 2      -1           40            198          7      0 3342~    68      27 None        
# 3      -1           26           1674          7      1 0758~    54      29 None        
# 4       0           30            642         17     12 40642    64      27 None        
# 5      -1           49          11724          9      0 2742~    52      30 None        
# 6      -1           31           1155          8      9 2458~    58      29 None        
# # ... with 11 more variables: MDantidepressant <chr>, MDantipark <dbl>, MDpolypharmacy <dbl>,
# #   W2socialclass <chr>, bh201 <chr>, fl001_01 <chr>, fl001_02 <chr>, fl001_04 <chr>,
# #   fl001_05 <chr>, fl001_06 <chr>, fl001_07 <chr>

In that context, class(whole_data_frame) == "character" does not make sense (by itself).
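To make the failure concrete, here is a minimal base-R sketch of what the original predicate evaluates to. The vector name_hit is a made-up stand-in for the str_detect() result over the 20 column names; the class vector is the one shown in the str() output of a tibble.

```r
# class() of a tibble is a length-3 character vector, not one entry per column
cls <- c("tbl_df", "tbl", "data.frame")
is_chr <- cls == "character"            # FALSE FALSE FALSE

# Hypothetical stand-in for str_detect(colnames(.), ...): one value per column (20)
name_hit <- rep(c(TRUE, FALSE), 10)

# Combining length 3 with length 20 recycles the shorter vector and warns,
# because 20 is not a multiple of 3
warn_msg <- tryCatch(is_chr & name_hit, warning = function(w) conditionMessage(w))
res <- suppressWarnings(is_chr & name_hit)

print(warn_msg)
print(any(res))   # no column is selected
```

Since every element of the combined predicate is FALSE, mutate_if() has no columns to modify, which would explain why the code runs but nothing changes.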

You can look for character columns using sapply(., is.character) (or one of purrr's equivalents):

mydata %>% 
  mutate_if(sapply(., is.character) &
              stringr::str_detect(colnames(.), "^fl\\d|^ph\\d"),
            ~ +(!grepl("NOT ", .))) %>%
  str(.)
# Classes 'tbl_df', 'tbl' and 'data.frame': 6 obs. of  20 variables:
#  $ ph503_3         : num  -1 -1 -1 0 -1 -1
#  $ gripstrength    : num  33 40 26 30 49 31
#  $ IPAQmetminutes  : num  5196 198 1674 642 11724 ...
#  $ tugtimesec      : num  8 7 7 17 9 8
#  $ MHcesd          : num  1 0 1 12 0 9
#  $ id              : chr  "292221" "334262" "075822" "40642" ...
#  $ age             : num  58 68 54 64 52 58
#  $ COGmmse         : num  30 27 29 27 30 29
#  $ DISconverse1    : chr  "None" "None" "None" "None" ...
#  $ MDantidepressant: chr  "No" "No" "No" "Yes" ...
#  $ MDantipark      : num  0 0 0 0 0 0
#  $ MDpolypharmacy  : num  0 0 0 1 0 0
#  $ W2socialclass   : chr  "Skilled" "Semi-skilled" "Managerial & Technical" "Non-Manual" ...
#  $ bh201           : chr  "Would never doze" "Would never doze" "Slight chance of dozing" "Would never doze" ...
#  $ fl001_01        : int  0 0 0 1 0 0
#  $ fl001_02        : int  0 1 0 1 1 0
#  $ fl001_04        : int  0 0 0 1 0 0
#  $ fl001_05        : int  0 0 0 1 0 0
#  $ fl001_06        : int  0 0 0 1 0 0
#  $ fl001_07        : int  0 0 0 1 0 0
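As an aside that is not part of the approach above: with dplyr 1.0.0 or later, the same two conditions can probably be written with across() and tidyselect helpers, which apply is.character per column without any sapply(). This is a sketch on a small made-up tibble (toy), not the full data set.

```r
library(dplyr)

# Hypothetical three-column stand-in for mydata
toy <- tibble(
  id       = c("292221", "40642"),
  ph503_3  = c(-1, 0),
  fl001_01 = c("NOT Walking 100 metres (100 yards)",
               "Walking 100 metres (100 yards)")
)

# Select columns that are character AND whose names match the pattern,
# then recode "NOT ..." to 0 and everything else to 1
out <- toy %>%
  mutate(across(where(is.character) & matches("^fl\\d|^ph\\d"),
                ~ +(!grepl("NOT ", .x))))

str(out)
```

Here id stays character because its name does not match, and ph503_3 stays numeric because it is not a character column.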

(I shortened your if_else(grepl("NOT ", .), 0, 1) to just +(!grepl("NOT ", .)), partly for code golf and partly because I think using ifelse / if_else there is a little more than necessary. It's not wrong, and if your future needs are a little more complex than just 0 / 1, then if_else is certainly good. My +(...) trick is a quick way to convert logical to integer; try +TRUE.)
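A small base-R illustration of that shortcut, using a made-up two-element vector x. It uses base ifelse() rather than dplyr's if_else() only to avoid loading a package.

```r
x <- c("NOT Walking 100 metres (100 yards)", "Walking 100 metres (100 yards)")

# Unary plus coerces logical to integer
plus_true <- +TRUE

# Short form: negate the match, then coerce to 0/1 integers
short <- +(!grepl("NOT ", x))

# Long form: same values, but stored as double because 0 and 1 are doubles
long <- ifelse(grepl("NOT ", x), 0, 1)

print(short)
print(typeof(short))
print(typeof(long))
```

The two forms agree in value and differ only in storage type, which is why the str() output reports int for the recoded columns.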
