R concatenating two factors

This is making me feel dumb, but I am trying to produce a single vector/df/list/etc. (anything but a matrix) by concatenating two factors. Here's the scenario: I have a 100k-line dataset. I used the top half to predict the bottom half and vice versa using knn, so now I have 2 objects created by knn predict().

> head(pred11)
[1] 0 0 0 0 0 0
Levels: 0 1
> head(pred12)
[1] 0 1 1 0 0 0
Levels: 0 1
> class(pred11)
[1] "factor"
> class(pred12)
[1] "factor"

Here's where my problem starts:

> pred13 <- rbind(pred11, pred12)
> class(pred13)
[1] "matrix"

There are 2 problems. First, it changes the 0's and 1's to 1's and 2's, and second, it seems to create a huge matrix that eats all my memory. I've tried messing with as.numeric(), data.frame(), etc., but can't get it to just combine the two 50k factors into one 100k factor. Any suggestions?

@James presented one way, I'll chip in with another (shorter):

set.seed(42)
x1 <- factor(sample(0:1, 10, replace = TRUE))
x2 <- factor(sample(0:1, 10, replace = TRUE))

unlist(list(x1,x2))
# [1] 1 1 0 1 1 1 1 0 1 1 0 1 1 0 0 1 1 0 0 1
#Levels: 0 1

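...This might seem a bit like magic, but unlist has special support for factors for exactly this purpose: when every element of the list is a factor, the result is a factor whose levels are the union of the inputs' levels. All elements in the list must be factors for this to work.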
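As a small sketch of that special support (the factors `a` and `b` below are made up for illustration and are not from the question), unlist also seems to cope with factors whose level sets differ, returning a factor that carries every level seen in the inputs:

```r
# Hypothetical factors with different level sets (illustration only)
a <- factor(c("0", "1", "0"))
b <- factor(c("1", "2"))

# Every list element is a factor, so unlist() returns a factor
combined <- unlist(list(a, b))

class(combined)          # "factor"
levels(combined)         # "0" "1" "2"
as.character(combined)   # "0" "1" "0" "1" "2"
```

If even one list element is not a factor, this special handling no longer applies, so it is worth checking class() on the result.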

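rbind will create a 2 x 50000 matrix in your case, which isn't what you want. c is the correct function for combining 2 vectors into a single longer vector. However, rbind works on the underlying integer codes that map to the levels, which is why your 0's and 1's turn into 1's and 2's. In R versions before 4.1.0, c does the same when given factors (since R 4.1.0, c has a factor method and returns a factor with the combined levels). In general, on older versions you need to combine as character before refactoring: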

x1 <- factor(sample(0:1, 10, replace = TRUE))
x2 <- factor(sample(0:1, 10, replace = TRUE))

factor(c(as.character(x1),as.character(x2)))
 [1] 1 1 1 0 1 1 0 1 0 0 0 1 1 1 1 1 1 0 0 0
Levels: 0 1
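To make the integer-code issue concrete, here is a minimal sketch. It assumes two tiny stand-in factors, `p1` and `p2`, which are hypothetical and not the asker's actual pred11 and pred12. It shows what rbind produces and how going through character keeps the original labels:

```r
# Hypothetical stand-ins for the two prediction factors
p1 <- factor(c(0, 0, 1), levels = c(0, 1))
p2 <- factor(c(1, 0, 0), levels = c(0, 1))

# rbind() stacks the underlying integer codes into a 2 x 3 matrix,
# so label "0" shows up as 1 and label "1" shows up as 2
m <- rbind(p1, p2)
dim(m)           # 2 3
as.integer(p1)   # 1 1 2

# Converting to character first keeps the original labels
p_all <- factor(c(as.character(p1), as.character(p2)))
levels(p_all)    # "0" "1"
length(p_all)    # 6

# If numeric 0/1 values are needed afterwards, go through character,
# because as.numeric() on a factor returns the codes, not the labels
as.numeric(as.character(p_all))   # 0 0 1 1 0 0
```

The same as.numeric(as.character(...)) step is probably what the as.numeric() attempts in the question were missing.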
