I'm a fan of the revalue function in plyr for substituting strings. It's simple and easy to remember.
However, I've migrated new code to dplyr, which doesn't appear to have a revalue function. What is the accepted idiom in dplyr for doing things previously done with revalue?
There is a recode function available starting with dplyr version 0.5.0, which looks very similar to revalue from plyr.
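For code that already stores its mapping as a named vector, as revalue expects, the following is a minimal sketch of how the two calls might line up. The vector x and the mapping key are made up for illustration, and both packages are assumed to be installed. The !!! splice is the approach shown in the recode documentation; .default = x is passed explicitly so unmatched values are kept regardless of the dplyr version.

```r
# Hypothetical data and mapping, for illustration only
x <- c("a", "b", "c", "a")
key <- c(a = "Apple", b = "Bear")

# plyr: the mapping is passed as one named vector
old_way <- plyr::revalue(x, key)

# dplyr: the same named vector is spliced into recode() with !!!
new_way <- dplyr::recode(x, !!!key, .default = x)

print(old_way)
print(new_way)
```

Calling both functions with their namespace prefix avoids attaching plyr and dplyr in the same session, where functions with the same name mask each other.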
Example built from the Examples section of the recode documentation:
set.seed(16)
x = sample(c("a", "b", "c"), 10, replace = TRUE)
x
[1] "a" "b" "a" "b" "b" "a" "c" "c" "c" "a"
recode(x, a = "Apple", b = "Bear", c = "Car")
[1] "Apple" "Bear" "Apple" "Bear" "Bear" "Apple" "Car" "Car" "Car" "Apple"
If you only define some of the values that you want to recode, by default the rest are filled with NA (the behavior as of dplyr 0.5.0).
recode(x, a = "Apple", c = "Car")
[1] "Apple" NA "Apple" NA NA "Apple" "Car" "Car" "Car" "Apple"
This behavior can be changed using the .default argument.
recode(x, a = "Apple", c = "Car", .default = x)
[1] "Apple" "b" "Apple" "b" "b" "Apple" "Car" "Car" "Car" "Apple"
There is also a .missing argument if you want to replace missing values with something else.
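As a rough illustration of .missing, here is a small sketch with a made-up vector y (not from the documentation) that contains an NA. It assumes dplyr is installed, and .default = y is set explicitly so the unmatched "b" is kept whatever the version's default is.

```r
# Hypothetical input with a missing value
y <- c("a", NA, "b")

# "a" is recoded, "b" is kept via .default, and NA is replaced via .missing
res <- dplyr::recode(y, a = "Apple", .default = y, .missing = "Unknown")
print(res)
```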
We can do this with chartr from base R, which translates single characters:
x <- c("a", "b", "c")
chartr("ac", "AC", x)
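One thing to keep in mind is that chartr translates character by character, so it only suits single-character replacements. A small sketch with a made-up vector y shows what happens when an element is longer than one character:

```r
# Hypothetical input; the last element contains several characters
y <- c("a", "b", "c", "abc")

# Every "a" becomes "A" and every "c" becomes "C", wherever they occur
res <- chartr("ac", "AC", y)
print(res)
```

Here "abc" becomes "AbC", which may or may not be what you want when recoding whole values.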
I wanted to comment on the answer by @aosmith, but lack reputation. It seems that nowadays the default of dplyr's recode function is to leave unspecified levels unaffected.
x = sample(c("a", "b", "c"), 10, replace = TRUE)
x
[1] "c" "c" "b" "b" "a" "b" "c" "c" "c" "b"
recode(x , a = "apple", b = "banana" )
[1] "c" "c" "banana" "banana" "apple" "banana" "c" "c" "c" "banana"
To change all nonspecified levels to NA, the argument .default = NA_character_ should be included.
recode(x, a = "apple", b = "banana", .default = NA_character_)
[1] NA NA "banana" "banana" "apple" "banana" NA NA NA "banana"
A handy alternative I've found is the mapvalues function (from plyr) used with data.tables:
df[, variable := mapvalues(variable, from = old_names_string_vector, to = new_names_string_vector)]
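To make that one-liner concrete, here is a self-contained sketch. The table dt, the column fruit, and the two name vectors are invented for illustration, and it assumes both data.table and plyr are installed. plyr::mapvalues takes the old values as from and the new values as to, and leaves anything not listed in from unchanged.

```r
library(data.table)

# Hypothetical example data; the column name `fruit` is illustrative
dt <- data.table(id = 1:4, fruit = c("a", "b", "c", "a"))
old_names <- c("a", "b")
new_names <- c("Apple", "Banana")

# Update the column by reference; "c" is not in `old_names`, so it is kept
dt[, fruit := plyr::mapvalues(fruit, from = old_names, to = new_names)]
print(dt$fruit)
```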
R base solution
You can use ifelse() from base R for this. The function's arguments are ifelse(test, yes, no). Here is an example:
(x <- sample(c("a", "b", "c"), 5, replace = TRUE))
[1] "c" "a" "b" "a" "a"
ifelse(x == "a", "Apple", x)
[1] "c" "Apple" "b" "Apple" "Apple"
If you want to recode multiple values you can use the function in a nested way like this:
ifelse(x == "a", "Apple", ifelse(x == "b", "Banana", x))
[1] "c" "Apple" "Banana" "Apple" "Apple"
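Since the nesting gets one level deeper for every extra value, another base R option is a named lookup vector. This is only a sketch of that alternative, not part of the ifelse() approach itself; the vector x is fixed here so the result is reproducible, and lookup is a made-up mapping.

```r
# Fixed input so the result is reproducible
x <- c("c", "a", "b", "a", "a")

# Hypothetical mapping: names are the old values, elements are the new ones
lookup <- c(a = "Apple", b = "Banana")

# Replace only the elements that have an entry in the lookup vector
hit <- x %in% names(lookup)
out <- x
out[hit] <- unname(lookup[x[hit]])
print(out)
```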
Own function
Having many values that must be recoded can make coding with ifelse() messy. Therefore, here is a custom function:
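my_revalue <- function(vec, ...){
  # Collect the old = "new" pairs passed through ...
  reval <- list(...)
  from <- names(reval)
  to <- unlist(reval, use.names = FALSE)
  # Position of each element of vec among the old values (NA if not listed)
  idx <- match(vec, from)
  # Replace listed values and leave everything else unchanged
  out <- ifelse(is.na(idx), vec, to[idx])
  return(out)
}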
Now we can change multiple values quite fast:
my_revalue(vec= x, "a" = "Apple", "b" = "Banana", "c" = "Cranberry")
[1] "Cranberry" "Apple" "Banana" "Apple" "Apple"