Convert a character column and then split it into multiple new boolean columns using R mutate

I am trying to split a flags column into multiple new columns in R using mutate_at followed by separate. I have simplified and cleaned up my attempt, shown below, but I get a warning indicating that the entire column is being passed into my function rather than each row individually. Is this normal behaviour, meaning I have to loop over each element of x inside my function? Or am I calling mutate_at incorrectly?

example data:

dataVariable <- data.frame(c_flags = c(".q.q.q","y..i.o","0x5a",".lll.."))

functions:

dataVariable <- read_csv("...",
  col_types = cols(
    c_date = col_datetime(format = ""),
    c_dbl = col_double(),
    c_flags = col_character(),
    c_class = col_factor(c("a", "b", "c")),
    c_skip = col_skip()
))


funTranslateXForNewColumn <- function(x){
    binary = ""
    if(startsWith(x, "0x")){
        binary=hex2bin(x)
    } else {
        binary = c(0,0,0,0,0,0)
        splitFlag = strsplit(x, "")[[1]]
        for(i in splitFlag){
          flagVal = 1
          if(i=="."){
            flagVal = 0
          }
          binary=append(binary, flagVal)
        }
    }
    return(paste(binary[4:12], collapse='' ))
}



mutate_at(dataVariable, vars(c_flags), funs(funTranslateXForNewColumn(.)))

separate(dataVariable, c_flags, c(NA, "flag_1","flag_2","flag_3","flag_4","flag_5","flag_6","flag_7","flag_8","flag_9"), sep="")

The error I am receiving is:

Warning messages:
1: Problem with `mutate()` input `c_flags`.
i the condition has length > 1 and only the first element will be used

After translating the string into an appropriate binary representation of the flags, I will then use the separate function to split it into new columns.

I was able to get the outcome I desired by replacing the mutate_at function with:

dataVariable$binFlags <- mapply(funTranslateXForNewColumn, dataVariable$c_flags)

However I want to know how to use the mutate_at function correctly.

Credit to: https://datascience.stackexchange.com/questions/41964/mutate-with-custom-function-in-r-does-not-work

The link above also includes the solution: vectorize the function so that it is applied to each element of the column rather than to the column as a whole:

v_funTranslateXForNewColumn <- Vectorize(funTranslateXForNewColumn)
# funs() is deprecated since dplyr 0.8.0, so the function is passed directly
mutate_at(dataVariable, vars(c_flags), v_funTranslateXForNewColumn)
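To make the difference concrete, here is a minimal base R sketch of what Vectorize changes. The helper `flag_to_bits` is a simplified, hypothetical stand-in for the function in the question (no hex handling, no padding), used only for illustration.

```r
# Hypothetical scalar helper: written for ONE string at a time
flag_to_bits <- function(x) {
  chars <- strsplit(x, "")[[1]]            # [[1]] silently drops all but the first string
  paste(as.integer(chars != "."), collapse = "")
}

flags <- c(".q.q.q", "y..i.o", ".lll..")

# Called on the whole vector (which is what mutate_at passes in),
# only the first element is translated
whole_column <- flag_to_bits(flags)

# Vectorize() wraps the helper so it is called once per element
v_flag_to_bits <- Vectorize(flag_to_bits, USE.NAMES = FALSE)
per_element <- v_flag_to_bits(flags)

print(whole_column)
print(per_element)
```

The wrapped version returns one string per input element, which is the shape mutate expects for a new column.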

Similar to OP's logic, but perhaps shorter:

dataVariable$binFlags <- sapply(strsplit(dataVariable$c_flags, ''), function(x)
                                 paste(as.integer(x != '.'), collapse = ''))

If you want to do this with dplyr, the same logic can be written as:

library(dplyr)

dataVariable %>%
  mutate(binFlags = purrr::map_chr(strsplit(c_flags, ''), 
                     ~paste(as.integer(. != '.'), collapse = '')))

#  c_flags binFlags
#1  .q.q.q   010101
#2  y..i.o   100101
#3  .lll..   011100
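Note that these one-liners mark every character other than "." as 1, so a hex value such as "0x5a" from the question's example data would be turned into 1111 rather than decoded. The sketch below handles both cases in base R. The helper names are hypothetical, and the choice of 8 bits with the most significant bit first is an assumption, since the question does not define hex2bin.

```r
# Hypothetical stand-in for the undefined hex2bin():
# assumes 8 bits, most significant bit first
hex_to_bits <- function(x) {
  n <- strtoi(x, 16L)                      # base 16 accepts an optional "0x" prefix
  paste(rev(as.integer(intToBits(n))[1:8]), collapse = "")
}

# Dot-style flags: "." becomes 0, any other character becomes 1
dots_to_bits <- function(x) {
  paste(as.integer(strsplit(x, "")[[1]] != "."), collapse = "")
}

flags <- c(".q.q.q", "y..i.o", "0x5a", ".lll..")

# vapply() calls the function once per element, so if() sees a single value
bin_flags <- vapply(flags, function(x) {
  if (startsWith(x, "0x")) hex_to_bits(x) else dots_to_bits(x)
}, character(1), USE.NAMES = FALSE)

print(bin_flags)
```

Because the two encodings produce strings of different lengths here, they would need padding to a common width before being split into columns.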

mutate_at / across is meant for applying a function to multiple columns. Moreover, as far as I can see, your code creates only one new binary column, not multiple new columns as mentioned in your post.
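For the multiple boolean columns mentioned in the title, one possible sketch is to split the binary string by character position and then convert every resulting column with across. It assumes dplyr (1.0 or later) and tidyr are installed, uses the example data without the hex row, and the column names flag_1 to flag_6 are chosen for illustration.

```r
library(dplyr)
library(tidyr)

# Example data without the hex row, so every value has six characters
dataVariable <- data.frame(c_flags = c(".q.q.q", "y..i.o", ".lll.."),
                           stringsAsFactors = FALSE)

result <- dataVariable %>%
  # One character per flag: "." becomes 0, anything else becomes 1
  mutate(binFlags = vapply(strsplit(c_flags, ""),
                           function(x) paste(as.integer(x != "."), collapse = ""),
                           character(1))) %>%
  # A numeric sep is read as character positions: split after positions 1 to 5
  separate(binFlags, into = paste0("flag_", 1:6), sep = 1:5) %>%
  # across() applies the same conversion to every flag column
  mutate(across(starts_with("flag_"), ~ .x == "1"))

print(result)
```

Splitting by position avoids the empty leading piece that the question works around with NA when using sep = "".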
