Convert factor into logical datatype

I have a two-level factor in my data that I want to convert to logical:

str(df$y)
 Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...

I used as.logical(df$y) to convert it to logical, but every value turned into NA:

a <- as.logical(df$y)
summary(a)

      Mode    NA's 
    logical  500000

At which point do I fail to convert the data?

I'd argue that you don't fail at any point; it's the function that is a bit odd and fails to understand the nature of your data.

If you read ?as.logical, you'll see that when the input is a factor, the levels (which are character strings) are used in the conversion. The only character strings it accepts are "T", "TRUE", "true" and "True" for TRUE, and "F", "FALSE", "false" and "False" for FALSE; everything else, including "0" and "1", returns NA. Given as numeric values, however, 0 and 1 are interpreted as FALSE and TRUE, respectively, so all of the following work:

y <- factor(c(0, 1, 1, 0))

as.logical(as.integer(levels(y)[y]))
as.logical(as.integer(y) - 1L)
as.logical(as.integer(as.character(y)))

A bit cumbersome, I know, but that's how it is.
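
To make the behaviour described above concrete, here is a small sketch comparing character, numeric and factor input to as.logical(). The variable names (chr, num, fac, codes) are only illustrative and not from the question.

```r
# Character input: only a fixed set of spellings is recognised;
# anything else ("0", "1", "yes") becomes NA.
chr <- as.logical(c("TRUE", "true", "T", "False", "0", "1", "yes"))

# Numeric input: zero is FALSE, any non-zero number is TRUE.
num <- as.logical(c(0, 1, 2))

# Factor input: the level labels "0" and "1" are used, so everything is NA.
y <- factor(c(0, 1, 1, 0))
fac <- as.logical(y)

# as.integer() on a factor returns the internal codes (1 and 2 here),
# which is why one of the variants above subtracts 1L.
codes <- as.integer(y)

print(chr)
print(num)
print(fac)
print(codes)
```

Note that the "- 1L" variant relies on the level for FALSE being sorted first, which holds for "0" and "1" but not necessarily for other labels.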

Indeed, there is a straightforward method.

Since your factor has two levels, relabel them to say which one is FALSE and which one is TRUE:

df <- data.frame(y=factor(sample(c("0","1"),10,replace = TRUE)))

str(df$y)
#  Factor w/ 2 levels "0","1": 2 2 2 1 1 2 2 2 2 2

levels(df$y) <- c(FALSE,TRUE)
df$y <- as.logical(df$y)

str(df$y)
# logi [1:10] TRUE TRUE TRUE FALSE FALSE TRUE ...
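
The relabelling approach should work for labels other than "0" and "1" as well, as long as the replacement vector follows the order shown by levels(). A minimal sketch with made-up labels "no" and "yes" (the names z and z_lgl are illustrative only):

```r
# A factor with arbitrary labels and one missing value.
z <- factor(c("no", "yes", "yes", NA))

# Check the level order first: the replacement below is positional.
print(levels(z))

# "no" is the first level, so it is mapped to FALSE; "yes" to TRUE.
levels(z) <- c(FALSE, TRUE)

# The levels are now the strings "FALSE" and "TRUE", which as.logical() accepts.
z_lgl <- as.logical(z)
print(z_lgl)
```

Missing values in the factor stay NA after the conversion.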

This is probably a little too late to be helpful, but I ran into a similar problem and found a fix:

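as.logical(as.integer(as.character(df$column)))

should do the trick. Go through as.character() first: as.integer() applied directly to a factor returns the internal level codes (1 and 2 here), which as.logical() would turn into TRUE for every element.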

You can use == to create TRUE and FALSE values:

y = factor(c(0, 1, NA))
y == "1"
# [1] FALSE  TRUE    NA
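
If this conversion is needed in several places, the comparison can be wrapped in a small function. The sketch below is only one possible way to do it; factor_to_logical and its true_level argument are hypothetical names, not part of base R.

```r
# Hypothetical helper: convert a factor to logical by naming
# explicitly which level should count as TRUE.
factor_to_logical <- function(x, true_level = "1") {
  # Fail early if the input is not a factor or the level does not exist.
  stopifnot(is.factor(x), true_level %in% levels(x))
  # Comparing a factor with a string compares against the level labels;
  # NA entries stay NA.
  x == true_level
}

y <- factor(c("0", "1", NA, "1"))
res <- factor_to_logical(y)
print(res)
```

Naming the TRUE level explicitly avoids depending on the order of the levels.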

