
How to create a set of dummy variables from given answer in R?

I'm really struggling with this issue. I have a dataset:

example <- data.frame(
  age = c(34, 19, 44, 22, 34, 12, 54, 63, 23),
  wash.hands = c("Before eating", "Before eating, on public transportation", "Before eating, After eating",
                 "After eating", "on public transportation, when I get home", "Before eating",
                 "When I get home", "When I get home, Before eating", "on public transportation"),
  stringsAsFactors = FALSE
)

It looks like this:

# age                                wash.hands
#  34                             Before eating
#  19   Before eating, on public transportation
#  44               Before eating, After eating
#  22                              After eating
#  34 on public transportation, when I get home
#  12                             Before eating
#  54                           When I get home
#  63            When I get home, Before eating
#  23                  on public transportation

It contains each respondent's age and the occasions on which he washes his hands. I would like to have a set of 4 dummy variables (Before eating, After eating, On public transportation, When I get home), each coded 1 if the respondent washes his hands on that occasion and 0 otherwise. How do I do that? Any help would be appreciated. Thank you! :)

I would use str_detect() to indicate whether a particular string appears in the variable or not.

library(tidyverse)

mutate(example,
  before_eating = str_detect(wash.hands, "Before eating"),
  after_eating = str_detect(wash.hands, "After eating"),
  public_trans = str_detect(wash.hands, "public transportation"),
  get_home = str_detect(wash.hands, "get home"))

This will return 4 logical (TRUE/FALSE) variables. R treats TRUE as 1 and FALSE as 0 in arithmetic, so this should work with whatever analysis you plan to do with them.
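As a small illustration of that coercion, here is a base R sketch. The vector `before_eating` is made-up toy data standing in for one of the columns created above, not the real result.

```r
# Toy logical vector, standing in for a column such as before_eating (assumed data)
before_eating <- c(TRUE, TRUE, FALSE, TRUE)

# In arithmetic, TRUE counts as 1 and FALSE as 0
n_yes <- sum(before_eating)        # number of respondents answering yes
share_yes <- mean(before_eating)   # proportion answering yes

# If an explicit 0/1 column is preferred, convert it
before_eating_01 <- as.integer(before_eating)
```

If you want every new column stored as 0/1, wrapping each str_detect() call in as.integer() should do the same thing inside mutate().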

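You can also use psych or fastDummies. Note that, called like this, both create one dummy column per distinct value of wash.hands, so a combined answer such as "Before eating, After eating" gets its own column; split the answers first if you need one column per occasion.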

library(psych)
dummy.code(example$wash.hands)

library(fastDummies)
dummy_cols(example$wash.hands)
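To see what one dummy per distinct value means for comma-separated answers, here is a base R sketch that imitates that kind of coding on a shortened toy vector. It is not the output of psych or fastDummies, only an illustration of the idea.

```r
# Shortened toy version of the wash.hands column (assumed data)
wash.hands <- c("Before eating",
                "Before eating, After eating",
                "After eating",
                "Before eating")

# One indicator column per distinct string, combined answers included
per_value <- sapply(unique(wash.hands), function(v) as.integer(wash.hands == v))

# Three columns, although only two occasions are mentioned
n_columns <- ncol(per_value)
column_names <- colnames(per_value)
```

The combined answer becomes a category of its own, which is why the answers need to be split on the comma before dummy coding.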

Here's a base R approach.

times <- c("Before eating","on public transportation","After eating","When I get home")
# ignore.case = TRUE so that "when I get home" in row 5 is matched as well
result <- lapply(times,function(x){as.numeric(grepl(x,example$wash.hands,ignore.case = TRUE))})
names(result) <- times
cbind(example,do.call(cbind,result))
  age                                wash.hands Before eating on public transportation After eating When I get home
1  34                             Before eating             1                        0            0               0
2  19   Before eating, on public transportation             1                        1            0               0
3  44               Before eating, After eating             1                        0            1               0
4  22                              After eating             0                        0            1               0
5  34 on public transportation, when I get home             0                        1            0               1
6  12                             Before eating             1                        0            0               0
7  54                           When I get home             0                        0            0               1
8  63            When I get home, Before eating             1                        0            0               1
9  23                  on public transportation             0                        1            0               0
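If you would rather not type the occasions by hand, a variation is to split each answer on the comma and build the columns from whatever values appear. This is a minimal sketch on a three-row subset of the data; lower-casing the answers to merge "When I get home" with "when I get home" is my assumption about how you want to treat them.

```r
# Three rows of the example data, enough to show the idea
example <- data.frame(
  age = c(34, 19, 34),
  wash.hands = c("Before eating",
                 "Before eating, on public transportation",
                 "on public transportation, when I get home"),
  stringsAsFactors = FALSE
)

# Split each answer on commas, then trim whitespace and lower-case the pieces
answers <- lapply(strsplit(example$wash.hands, ",", fixed = TRUE),
                  function(x) tolower(trimws(x)))

# All distinct occasions found in the data
occasions <- sort(unique(unlist(answers)))

# One row per respondent, one 0/1 column per occasion
dummies <- do.call(rbind, lapply(answers, function(x) as.integer(occasions %in% x)))
colnames(dummies) <- occasions

result <- cbind(example, dummies)
```

Because the column names come from the data, a typo in an answer would silently produce an extra column, so it is worth checking `occasions` before relying on the result.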
