
How do I code 2 separate categorical variables into a single one in R?

I have two continuous variables, each of which I dummy coded into a categorical variable with 2 levels. Each variable is coded 0 or 1 for the low and high level of that variable. Both variables were z-scored, so the coding shows whether a value fell below or above the mean.

MeanAboveAvo <- ifelse(Dataframeforstudy2$avo < 0, 0, 1)

MeanAboveAnx <- ifelse(Dataframeforstudy2$anx < 0, 0 , 1)

My question is: how do I dummy code these two variables together? I want to create a single variable with 4 levels from these two variables (MeanAboveAvo & MeanAboveAnx). It should be coded 1, 2, 3 or 4, where 1 is (0,0), 2 is (0,1), 3 is (1,0) and 4 is (1,1).

My code is this:

stats <- while(MeanAboveAnx = 0 || MeanAboveAvx = 1) {


   if(MeanAboveAnx = 0 & MeanAboveAvo = 0 ){
   1
} 

else if (MeanAboveAnx = 0 & MeanAboveAvo = 1){
 2
}

  else if(MeanAboveAnx = 1 & MeanAboveAvo = 0){
     3
 } 

else {
    4
   }}

It is not coding it at all and I am getting an error message. What can I do differently to get the results I want?

Thank you for your help in advance!

Base R has the function interaction() precisely for this type of problem. The code below could be a one-liner; I have left it as two steps to make it clearer.

f <- with(df, interaction(anx, avo, lex.order = TRUE))
as.integer(f)
# [1] 1 2 1 1 2 3 3 3 4 2

Edit.

I was using the data in TomasIsCoding's answer. Here is a solution closer to the question's problem, with anx and avo as z-scores. Thanks to @KonradRudolph for his comment.

f <- with(df, interaction(as.integer(anx < 0), 
                          as.integer(avo < 0), 
                          lex.order = TRUE))
f
# [1] 1.1 0.1 0.1 1.0 0.0 0.1 1.1 1.1 1.1 1.0
#Levels: 0.0 0.1 1.0 1.1

as.integer(f)
# [1] 4 2 2 3 1 2 4 4 4 3

Data.

set.seed(1234)
df <- data.frame(anx = rnorm(10), avo = rnorm(10))
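Note that the edit above flags values below the mean with 1, whereas the question codes values at or above the mean as 1. A minimal sketch of the same interaction() approach with the question's direction follows. The names above_anx, above_avo, f_above and code are made up for illustration, and the explicit levels = 0:1 is an added safeguard so that all four combinations exist as levels even if one indicator happens to be constant in the data.

```r
set.seed(1234)
df <- data.frame(anx = rnorm(10), avo = rnorm(10))

# 1 = at or above the mean (z-score >= 0), 0 = below, as in the question
above_anx <- factor(as.integer(df$anx >= 0), levels = 0:1)
above_avo <- factor(as.integer(df$avo >= 0), levels = 0:1)

# lex.order = TRUE orders the levels as 0.0, 0.1, 1.0, 1.1
f_above <- interaction(above_anx, above_avo, lex.order = TRUE)
code <- as.integer(f_above)  # 1 = (0,0), 2 = (0,1), 3 = (1,0), 4 = (1,1)
```

Keeping f_above as a factor is usually preferable for modelling; the integer codes are only needed if a numeric 1 to 4 variable is explicitly required.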

Categorical variables in R don't need to be numeric (and making them so has several drawbacks!), so there's no need for your ifelse:

# TRUE when the z-score is at or above the mean, matching the question's 0/1 coding
MeanAboveAvo <- Dataframeforstudy2$avo >= 0
MeanAboveAnx <- Dataframeforstudy2$anx >= 0

Next, the code using these encodings contains multiple mistakes:

  1. It's not clear what the while here is supposed to mean.
  2. All = signs need to be converted to == because you're performing comparisons.
  3. if, unlike ifelse, isn't vectorised, so you cannot use it to assign its result to a vector of length > 1.
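The three points above can be sketched as a vectorised rewrite of the question's attempt. This is only an illustration with made-up indicator vectors (one element per combination), not the asker's data: the while loop is dropped, = becomes ==, and nested ifelse() replaces if / else.

```r
# Hypothetical 0/1 indicators covering all four combinations
MeanAboveAnx <- c(0, 0, 1, 1)
MeanAboveAvo <- c(0, 1, 0, 1)

# ifelse() works element-wise; == compares and & combines element-wise
stats <- ifelse(MeanAboveAnx == 0 & MeanAboveAvo == 0, 1,
         ifelse(MeanAboveAnx == 0 & MeanAboveAvo == 1, 2,
         ifelse(MeanAboveAnx == 1 & MeanAboveAvo == 0, 3, 4)))
stats
# [1] 1 2 3 4
```

This reproduces the mapping requested in the question, although the approaches that avoid numeric codes altogether are more conventional in R.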

If I understand you correctly, then the following is one (canonical) way of encoding stats:

stats <- paste(MeanAboveAvo, MeanAboveAnx)

This converts the logical vectors into character vectors and concatenates them element-wise. Once again, it is unnecessary (and unconventional!) in R to convert these categories into a numeric variable, though it may make sense to convert the result to a factor via as.factor.
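As a rough sketch of the factor conversion mentioned above, the pasted strings can be given an explicit level order and readable labels. The indicator vectors and the label text here are invented for illustration, and TRUE is assumed to mean "at or above the mean".

```r
# Hypothetical logical indicators (TRUE = at or above the mean)
MeanAboveAvo <- c(FALSE, FALSE, TRUE, TRUE)
MeanAboveAnx <- c(FALSE, TRUE, FALSE, TRUE)

stats <- paste(MeanAboveAvo, MeanAboveAnx)  # e.g. "FALSE TRUE"

# Explicit levels fix the category order; labels make output readable
stats_f <- factor(
  stats,
  levels = c("FALSE FALSE", "FALSE TRUE", "TRUE FALSE", "TRUE TRUE"),
  labels = c("low avo, low anx", "low avo, high anx",
             "high avo, low anx", "high avo, high anx")
)
table(stats_f)
```

A labelled factor like this can be passed directly to modelling functions such as lm(), which create the contrast coding themselves.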

Given your mapping rule for anx and avo, you don't actually need a while loop: the mapping is just a binary-to-decimal conversion shifted by one. In this case, you can do it like this:

df <- within(df,code <- 2*anx + avo + 1)

such that

> df
   anx avo code
1    0   0    1
2    0   1    2
3    0   0    1
4    0   0    1
5    0   1    2
6    1   0    3
7    1   0    3
8    1   0    3
9    1   1    4
10   0   1    2

Dummy Data

df <- structure(list(anx = c(0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L
), avo = c(0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 1L, 1L)), class = "data.frame", row.names = c(NA, 
-10L))
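To see why the formula works, it may help to enumerate the four possible (anx, avo) pairs: reading the pair as a two-digit binary number gives 0 to 3, and adding 1 shifts that to 1 to 4. The data frame name combos below is just an illustrative choice.

```r
# All four combinations of the two 0/1 indicators, anx varying slowest
combos <- expand.grid(avo = 0:1, anx = 0:1)[, c("anx", "avo")]

# Binary (anx, avo) -> decimal 2*anx + avo, then shift by 1
combos$code <- 2 * combos$anx + combos$avo + 1
combos
#   anx avo code
# 1   0   0    1
# 2   0   1    2
# 3   1   0    3
# 4   1   1    4
```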

Try this:

as.integer(factor(paste0(MeanAboveAvo, MeanAboveAnx)))

For example:

set.seed(123)
x <- sample(0:1, 10, T) # [1] 0 0 0 1 0 1 1 1 0 0
y <- sample(0:1, 10, T) # [1] 1 1 1 0 1 0 1 0 0 0
as.integer(factor(paste0(x, y)))
# [1] 2 2 2 3 2 3 4 3 1 1
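One caveat with this approach, shown here on made-up data: factor() only creates levels for combinations that actually occur, so if a pair is absent the integer codes shift. Supplying the four levels explicitly is one way to keep the codes stable; the names naive and fixed are purely illustrative.

```r
# Hypothetical data in which the pair (1, 0) never occurs
x <- c(0, 0, 1)
y <- c(0, 1, 1)

naive <- as.integer(factor(paste0(x, y)))
naive
# [1] 1 2 3   <- "11" is coded 3 because "10" is missing

fixed <- as.integer(factor(paste0(x, y),
                           levels = c("00", "01", "10", "11")))
fixed
# [1] 1 2 4
```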
