How to recode variables in R

I am trying to recode variables in an R data frame. For example, variable X in my dataset contains 1s and 0s. I want to create another variable, Y, which recodes the 1s and 0s from X as Yes and No respectively.

I tried this to create the recoded Y variable:

w <- as.character()

for (i in seq_along(x))  {
    if (x[i] == 1)  {
        recode <- "Yes"
    } else if (x[i] == 0)  {
        recode <- "No"       
    }
    w <- cbind(w, recode)
}

Then I did this to line up X and Y together:

y <- c(x, w)

What I got back was this:

 y
 # [1] "1"   "1"   "0"   "1"   "0"   "0"   "1"   "1"   "0"   "1"   "0"   "0"   "Yes" "Yes" "No"  "Yes" "No"  "No" 

I was expecting a data frame with X and Y columns.

Questions:

  1. How do I get X and Y into a data frame?
  2. Is there a better way to recode variables in a data frame?

Recoding is generally about applying new labels to the levels of a factor (categorical variable).

In R, you do that like this:

w <- factor(x, levels = c(1,0), labels = c('yes', 'no'))
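As a minimal sketch of how this answers the first question too, assuming x is a numeric vector of 0s and 1s (the sample values below are made up for illustration), the factor can be placed next to x in a data frame:

```r
# Illustrative input; these values are assumed, not taken from the question's data.
x <- c(1, 1, 0, 1, 0, 0)

# levels lists the values found in x; labels gives the new name for each, in the same order.
w <- factor(x, levels = c(1, 0), labels = c("yes", "no"))

# data.frame() puts the two vectors side by side as columns.
df <- data.frame(x = x, y = w)

# as.character() converts the factor back to plain strings if that is what you need.
y_chr <- as.character(df$y)
```

Any value in x that is not listed in levels becomes NA in the result, which is a quick way to spot unexpected codes.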

Using the following data:

x  <- c(rep.int(0, 10), rep.int(1, 10))
df <- as.data.frame(x)
df
#    x
# 1  0
# 2  0
# 3  0
# ...

I'd create a new variable and recode it using logical indexing:

df$y[df$x == 1] <- "yes"
df$y[df$x == 0] <- "no"
df
#    x   y
# 1  0  no
# 2  0  no
# 3  0  no
# ...
# 11 1 yes
# 12 1 yes
# 13 1 yes
# ...

Note that for loops are not optimal in R, but your loop is basically correct. You need to replace w <- cbind(w, recode) with w <- rbind(w, recode) in the loop itself and, in the final step, cbind x and w:

w <- as.character()
for (i in seq_along(x))  {
  if (x[i] == 1)  {
    recode <- "Yes"
  } else if (x[i] == 0)  {
    recode <- "No"       
  }
  w <- rbind(w, recode)
}
y <- cbind(x, w)  # x and w side by side as columns
y

rbind() appends rows and cbind() appends columns, whereas c() concatenates its arguments into a single vector (coercing the numbers to character). That is why you were getting the two vectors joined end to end into one.
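The difference is easy to see on a small example. This is only a sketch; the three-element vectors are assumed for illustration:

```r
# Small assumed inputs: a numeric vector and its recoded labels.
x <- c(1, 0, 1)
w <- c("Yes", "No", "Yes")

v <- c(x, w)      # one character vector of length 6; the numbers become strings
m <- cbind(x, w)  # 3 x 2 character matrix with columns "x" and "w"
r <- rbind(x, w)  # 2 x 3 character matrix with rows "x" and "w"

# data.frame() keeps each column's own type, so x stays numeric.
d <- data.frame(x = x, y = w)
```

Note that cbind() and rbind() on a numeric and a character vector return a character matrix, so data.frame() is the call to use when a data frame with differently typed columns is wanted.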

This is one of the many cases where you really shouldn't use a loop in R.

Instead, use vectorisation, i.e. ifelse or indexing.

result = data.frame(x = x, y = ifelse(x == 1, 'yes', 'no'))

(This assumes that the input contains only 1s and 0s; if that isn't the case, you need a nested ifelse or a list containing the translations.)
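For inputs with more than two codes, both alternatives might look like the sketch below. The extra code 2, its label "maybe", and the name translations are hypothetical, chosen only to illustrate the idea:

```r
# Assumed input with a third code (2) and a missing value, for illustration only.
x <- c(1, 0, 2, NA, 1)

# Nested ifelse: anything that is neither 1 nor 0 becomes NA.
y_nested <- ifelse(x == 1, "yes", ifelse(x == 0, "no", NA_character_))

# Lookup: a named vector maps each code (as a string) to its label.
# "maybe" is a hypothetical label for the hypothetical code 2.
translations <- c("1" = "yes", "0" = "no", "2" = "maybe")
y_lookup <- unname(translations[as.character(x)])
```

The lookup version tends to scale better as the number of codes grows, since adding a code means adding one entry rather than another level of nesting.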
