
Recode values in R

I want to recode the values in a column: if x is > 1 and <= 2, it should be recoded as 1.

Here's my code:

neu$b <- lapply(neu$swl.y, function(x) ifelse(x>1 & x<=2, 1, x))

Is there something wrong?

 swl.y

  2.2
  1.2
  3.4
  5.6

I need to recode all the values actually:

  neu$c <- with(neu, ifelse(swl.y>1 & swl.y <=2, 1, swl.y))
  neu$c <- with(neu, ifelse(swl.y>2 & swl.y <=3, 2, swl.y))
  neu$c <- with(neu, ifelse(swl.y>3 & swl.y <=4, 3, swl.y))
  neu$c <- with(neu, ifelse(swl.y>4 & swl.y <=5, 4, swl.y))
  neu$c <- with(neu, ifelse(swl.y>5 & swl.y <=6, 5, swl.y))
  neu$c <- with(neu, ifelse(swl.y>6 & swl.y <=7, 6, swl.y))

I think I know where the problem is. Each line recomputes neu$c from the original swl.y, so when R runs the second line of code, the values recoded by the first line go back to their previous values.
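The overwriting can be reproduced with a minimal sketch. It assumes the four sample values shown above, rebuilt into a small data frame for illustration; `step1` and `step2` are names introduced here only to hold the intermediate results.

```r
# Sample values from the question, rebuilt here for illustration
neu <- data.frame(swl.y = c(2.2, 1.2, 3.4, 5.6))

# First recode: 1.2 falls in (1, 2] and becomes 1
neu$c <- with(neu, ifelse(swl.y > 1 & swl.y <= 2, 1, swl.y))
step1 <- neu$c   # 2.2 1.0 3.4 5.6

# Second recode starts again from swl.y, so the 1 from the first step is lost
neu$c <- with(neu, ifelse(swl.y > 2 & swl.y <= 3, 2, swl.y))
step2 <- neu$c   # 2.0 1.2 3.4 5.6
```

Only the last assignment survives, because every line takes its "else" values from swl.y rather than from the partly recoded neu$c.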

We don't need to loop over a single column. With lapply(neu$swl.y, ...), each element of the column is processed separately and returned as a list element, which is not what we need here. The function ifelse is vectorized and can be applied directly to the column 'swl.y' with the logical condition from the OP's post.

 neu$b <- with(neu, ifelse(swl.y>1 & swl.y <=2, 1, swl.y))

Alternatively, we create the 'b' column as a copy of 'swl.y' and change only the values of 'b' that meet the logical condition.

 neu$b <- neu$swl.y
 neu$b[with(neu, swl.y>1 & swl.y <=2)] <- 1
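As a quick check, and only as a sketch on the four sample values from the question (rebuilt here so the snippet runs on its own), both approaches give the same vector. The names `b_ifelse` and `b_index` are introduced for illustration.

```r
# Sample values from the question, rebuilt here for illustration
neu <- data.frame(swl.y = c(2.2, 1.2, 3.4, 5.6))

# Approach 1: vectorized ifelse on the whole column
b_ifelse <- with(neu, ifelse(swl.y > 1 & swl.y <= 2, 1, swl.y))

# Approach 2: copy the column, then overwrite the matching positions
b_index <- neu$swl.y
b_index[neu$swl.y > 1 & neu$swl.y <= 2] <- 1

identical(b_ifelse, b_index)   # both are 2.2 1.0 3.4 5.6
```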

To better understand the problem with the OP's code, we can check the output from lapply:

 lapply(neu$swl.y, function(x) x) #similar to `as.list(neu$swl.y)`
 #[[1]]
 #[1] 3

 #[[2]]
 #[1] 0

 #[[3]]
 #[1] 0

 #[[4]]
 #[1] 2

 #[[5]]
 #[1] 1

The output is a list in which each element of the column is a separate list element. Because ifelse is already vectorized (as mentioned above), calling it on one element at a time is unnecessary. Still, suppose we do use ifelse inside lapply:

lapply(neu$swl.y, function(x) ifelse(x>1 & x<=2, 1, x))
#[[1]]
#[1] 3

#[[2]]
#[1] 0

#[[3]]
#[1] 0

#[[4]]
#[1] 1

#[[5]]
#[1] 1

A data.frame can be thought of as a list whose elements all have the same length. Read that way, the above output would correspond to a data.frame with 5 columns and 1 row. By assigning it to a single column 'b', we instead create a list column with 5 list elements.

 neu$b <- lapply(neu$swl.y, function(x) ifelse(x>1 & x<=2, 1, x))
 str(neu)
 #'data.frame': 5 obs. of  2 variables:
 #$ swl.y: int  3 0 0 2 1
 #$ b    :List of 5
 # ..$ : int 3
 # ..$ : int 0
 # ..$ : int 0
 # ..$ : num 1
 # ..$ : int 1

But this is not what we wanted. What is the remedy? One way is to use sapply/vapply instead of lapply; these return a vector because every result has length 1. Alternatively, we can unlist the lapply output to create a vector.

 neu$b <- sapply(neu$swl.y, function(x) ifelse(x>1 & x<=2, 1, x))
 str(neu) 
 #'data.frame': 5 obs. of  2 variables:
 # $ swl.y: int  3 0 0 2 1
 # $ b    : num  3 0 0 1 1
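The vapply and unlist variants mentioned above are not shown, so here is a hedged sketch of both. It assumes the five integer values printed in the str() output, entered by hand so the snippet runs on its own; `recode_one` is a hypothetical helper name.

```r
# Values taken from the str() output above, entered by hand
neu <- data.frame(swl.y = c(3L, 0L, 0L, 2L, 1L))

# Hypothetical helper wrapping the OP's condition
recode_one <- function(x) ifelse(x > 1 & x <= 2, 1, x)

# vapply checks that every result is a single number
b_vapply <- vapply(neu$swl.y, recode_one, numeric(1))

# unlist flattens the list returned by lapply into a vector
b_unlist <- unlist(lapply(neu$swl.y, recode_one))

identical(b_vapply, b_unlist)   # both are 3 0 0 1 1
```

vapply is the stricter choice: it stops with an error if any result does not match the declared numeric(1), whereas sapply silently returns whatever shape it can simplify to.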

Update

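Based on the OP's edited post, if we need multiple recodes, use either cut or findInterval. In cut, we specify the breaks; the labels argument controls whether factor labels are returned or, with labels=FALSE, the integer interval codes. The intervals are closed on the right by default, i.e. (1,2], which matches the OP's conditions, and subtracting 1 maps (1,2] to 1, (2,3] to 2, and so on.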

 with(neu1, cut(swl.y, breaks=c(-Inf,1,2,3,4,5,6,Inf), labels=FALSE)-1)
 #[1] 2 1 3 5
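For comparison, a sketch of the findInterval route on the same four values. It assumes R 3.3.0 or later, where findInterval has the left.open argument; the data frame is rebuilt here so the snippet is self-contained, and `by_cut` and `by_interval` are names introduced for illustration.

```r
# Sample values from the OP's edited post, rebuilt here for illustration
neu1 <- data.frame(swl.y = c(2.2, 1.2, 3.4, 5.6))

# cut: right-closed intervals by default, integer codes shifted down by 1
by_cut <- with(neu1, cut(swl.y, breaks = c(-Inf, 1:6, Inf), labels = FALSE) - 1)

# findInterval: left.open = TRUE makes the intervals (1,2], (2,3], ...
by_interval <- findInterval(neu1$swl.y, 1:6, left.open = TRUE)

by_cut        # 2 1 3 5
by_interval   # 2 1 3 5
```

Without left.open = TRUE, findInterval uses intervals of the form [1,2), so a value of exactly 2 would be coded as 2 instead of 1.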

data

set.seed(48)
neu <- data.frame(swl.y=sample(0:5, 5, replace=TRUE))
# R >= 3.6.0 may draw other values; the outputs above use swl.y = c(3L, 0L, 0L, 2L, 1L)

# new data (from the OP's edited post)
neu1 <- data.frame(swl.y = c(2.2, 1.2, 3.4, 5.6))

