
Better way to create dummy variables with NA: trying to improve code I can already write poorly

I have a df with answers to survey questions, where df$Q57 is one of five answers:

  1. "" (<- blank is basically NA)
  2. I would never do this
  3. I will do this in five years
  4. I will do this in 10 years
  5. I will do this eventually

I want to create a dummy variable where:

  1. "" = NA
  2. I would never do this = 0
  3. I will do this in five years = 1
  4. I will do this in 10 years = 1
  5. I will do this eventually = 1

The best way I know how to do this is with a series of ifelse commands:

df$Q57_dummy <- ifelse(df$Q57 == "I would never do this", 0, 1)
df$Q57_dummy <- ifelse(df$Q57 == "", NA, df$Q57_dummy)
table(df$Q57_dummy, useNA = "always")

This works, but I feel like there are cleaner ways to do this, and I was wondering if anyone had suggestions, because I will have to recode survey answers that have more than 1,0,NA outcomes. Thanks!

tidyverse approach:

library(dplyr)

df <- df %>%
    mutate(Q57_dummy = case_when(
        Q57 == "" ~ NA, # blank answers become NA
        Q57 == "I would never do this" ~ FALSE,
        TRUE ~ TRUE # this is the else condition
    ))
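If you want numeric 0/1 as in the question rather than TRUE/FALSE, here is a minimal self-contained sketch of the same idea. The toy data frame `df` is an assumption made up for illustration; the answer strings are the ones listed in the question. `NA_real_` is used so every branch of `case_when` returns the same type.

```r
library(dplyr)

# Toy data (assumed for illustration); "" stands for a skipped question
df <- data.frame(
  Q57 = c("",
          "I would never do this",
          "I will do this in five years",
          "I will do this in 10 years",
          "I will do this eventually"),
  stringsAsFactors = FALSE
)

df <- df %>%
  mutate(Q57_dummy = case_when(
    Q57 == "" ~ NA_real_,                # blank becomes NA
    Q57 == "I would never do this" ~ 0,  # the single "no" answer
    TRUE ~ 1                             # every other answer
  ))

# Same check as in the question
table(df$Q57_dummy, useNA = "always")
```

The same pattern extends to more than two outcomes: add one `condition ~ value` line per answer.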

You could handle the else condition in a few different ways, depending on your preferred coding style. The above works, but you could also match the remaining answers explicitly, for example with stringr:

str_detect(Q57, "I will do this") ~ TRUE

or manually input the options:

Q57 %in% c("I will do this in five years",...) ~ TRUE
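Putting the explicit version together, the sketch below lists every expected answer and sends anything else to NA, so a typo or an unexpected response does not silently become 1. The answer strings come from the question; the toy vector `Q57`, the helper name `will_do`, and the extra "Maybe" response are assumptions for illustration only.

```r
library(dplyr)

# Answers that should be coded as 1 (taken from the question)
will_do <- c("I will do this in five years",
             "I will do this in 10 years",
             "I will do this eventually")

# Toy responses (assumed); "Maybe" simulates an unexpected answer
Q57 <- c("", "I would never do this", will_do, "Maybe")

Q57_dummy <- case_when(
  Q57 == "I would never do this" ~ 0,
  Q57 %in% will_do ~ 1,
  TRUE ~ NA_real_  # blanks and anything unexpected become NA
)
```

The trade-off is a little more typing in exchange for NA on anything you did not anticipate, which is easier to spot in a `table(..., useNA = "always")` check.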
