
Recode only certain values and keep the others as they are in R

I am trying to recode a set of columns, var1:var8, in the data frame "sampledf". I want to change the values "B" and "D" to "0" and keep all other values as they are.

sampledf <- data.frame(
  var1 = c(1,4,2,1,1,0,0,1,0,0,0),
  var2 = c(1,1,"D",1,0,0,1,"B",0,"D",0),
  var3 = c(1,5,2,1,"B",0,1,1,1,0,0),
  var4 = c(1,1,0,1,2,0,1,1,5,1,1),
  var5 = c(0,4,"D",1,0,0,0,1,1,1,1),
  var6 = c(1,"D",0,1,0,2,1,1,0,1,0),
  var7 = c(1,1,0,0,1,"E",1,0,"D",1,1),
  var8 = c(1,1,0,0,2,5,1,"D",0,3,1))

This is what I tried, but it did not work. Unlike this example, my real dataset has a very long list of other values, so I cannot supply them all manually. All I want is to change these two values and keep the others as they are.

sampledfnew <- sampledf %>% mutate(across(var1:var2, ~recode(
  TRUE ~ X,

Can anyone help me fix the error here? Thank you

There are many ways to do this. One option is ifelse:


library(dplyr)

# values to be replaced with 0; everything else is kept as-is
change_values <- c('B', 'D')
sampledf %>% mutate(across(var1:var2, ~ifelse(.x %in% change_values, 0, .x)))

#   var1 var2 var3 var4 var5 var6 var7 var8
#1     1    1    1    1    0    1    1    1
#2     4    1    5    1    4    D    1    1
#3     2    0    2    0    D    0    0    0
#4     1    1    1    1    1    1    0    0
#5     1    0    B    2    0    0    1    2
#6     0    0    0    0    0    2    E    5
#7     0    1    1    1    0    1    1    1
#8     1    0    1    1    1    1    0    D
#9     0    0    1    5    1    0    D    0
#10    0    0    0    1    1    1    1    3
#11    0    0    0    1    1    0    1    1

Here are some alternatives to ifelse, which is prone to at least two not-insignificant issues (class-dropping and class ambiguity, discussed below).

sampledf %>%
  mutate(across(var1:var8, ~ if_else(
    . %in% c("B", "D"),
    if (is.character(.)) "0" else 0, # could also be maybechar(0, .) from below
    .  # keep every other value unchanged
  )))
#    var1 var2 var3 var4 var5 var6 var7 var8
# 1     1    1    1    1    0    1    1    1
# 2     4    1    5    1    4    0    1    1
# 3     2    0    2    0    0    0    0    0
# 4     1    1    1    1    1    1    0    0
# 5     1    0    0    2    0    0    1    2
# 6     0    0    0    0    0    2    E    5
# 7     0    1    1    1    0    1    1    1
# 8     1    0    1    1    1    1    0    0
# 9     0    0    1    5    1    0    0    0
# 10    0    0    0    1    1    1    1    3
# 11    0    0    0    1    1    0    1    1

In case you don't always want B/D to be replaced with the same value:

# return val as a string when the source column is character, otherwise unchanged
maybechar <- function(val, src) if (is.character(src)) as.character(val) else val
sampledf %>%
  mutate(across(var1:var8, ~ case_when(
    . == "B" ~ maybechar(0, .),
    . == "D" ~ maybechar(0, .),
    TRUE ~ .
  )))


  • Most of the replacements done here actually insert the string "0" rather than the number 0, because most of your columns are character.

  • I often recommend against using ifelse by itself because of class ambiguity. With ifelse, the class of the return value can change without you realizing it. Compare ifelse(c(T,T), 1:2, c("A","B")) with ifelse(c(T,F), 1:2, c("A","B")) to see what I mean. This is "dangerous"/risky, and it is one thing that if_else explicitly guards against. (case_when in my second code block enforces this as well.)

  • It is because of the previous bullet that I suggested something like maybechar, which might look a little sloppy but is at least more declarative/intentional. I give two ways to do it: the first is explicit, without a helper function, as in the if_else example above; the second uses the helper function. The helper seems more prudent with case_when, since the operation is done multiple times there, so the code is a little easier to read (imo).
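
To make the two ifelse issues concrete, here is a minimal base R sketch. The variable names (a, b, d, e) and the example date are only for illustration and are not part of the answers above.

```r
# Class ambiguity: the result type depends on which branches are actually used.
a <- ifelse(c(TRUE, TRUE), 1:2, c("A", "B"))   # only the "yes" values are used
b <- ifelse(c(TRUE, FALSE), 1:2, c("A", "B"))  # both branches are used
class(a)  # "integer"
class(b)  # "character"

# Class-dropping: attributes such as the Date class are lost.
d <- as.Date("2024-01-01")
e <- ifelse(TRUE, d, d)
class(e)  # "numeric"
```

In both cases ifelse returns silently, which is why a stricter function such as if_else, or an explicit type check like maybechar, can be the safer choice.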

Another base R solution is:

sampledf[apply(sampledf, 2, \(x) x %in% c("B", "D"))] <- 0

> sampledf
   var1 var2 var3 var4 var5 var6 var7 var8
1     1    1    1    1    0    1    1    1
2     4    1    5    1    4    0    1    1
3     2    0    2    0    0    0    0    0
4     1    1    1    1    1    1    0    0
5     1    0    0    2    0    0    1    2
6     0    0    0    0    0    2    E    5
7     0    1    1    1    0    1    1    1
8     1    0    1    1    1    1    0    0
9     0    0    1    5    1    0    0    0
10    0    0    0    1    1    1    1    3
11    0    0    0    1    1    0    1    1
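
If you would rather work column by column in base R and get numeric columns back afterwards, something like the following might work. It is only a sketch under stated assumptions: df is a small stand-in for sampledf, and recode_bd is a hypothetical helper name, not something from the answers above.

```r
# Small stand-in for sampledf; mixing numbers and letters in c() gives character columns.
df <- data.frame(
  var1 = c(1, 4, 2),
  var2 = c(1, "D", "B"),
  var3 = c("E", 0, "D"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: replace "B"/"D", matching the replacement type to the column type.
recode_bd <- function(x) {
  replace(x, x %in% c("B", "D"), if (is.character(x)) "0" else 0)
}
df[] <- lapply(df, recode_bd)

# Optionally convert columns that now hold only digits back to numbers.
# var3 still contains "E", so it stays character.
df <- type.convert(df, as.is = TRUE)
str(df)
```

The type.convert step is optional; without it, the recoded columns stay character, as in the outputs shown above.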

