I know how to conditionally replace levels of a variable using dplyr/tidyr. Here's some toy data (the real dataset is much larger and more complex):
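library(dplyr)
dat <- data.frame(animal = c("cat", "cat", "dog", "cat"),
                  size = c("big", "big", "big", "small"),
                  stringsAsFactors = TRUE)  # factor columns, as in the real data (the default before R 4.0)
newdata <- dat %>% mutate(newanimal = replace(animal, animal == 'cat' & size == 'big', "fatcat"))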
But I keep getting "invalid factor level, NA generated". These are factor variables, and the specific combination of 'cat' and 'big' exists in the data frame. Why do I get this warning?
As @camille mentioned, once you have a factor, its set of levels is fixed. If you assign a value that is not one of the existing levels, the entry becomes NA.
For example:
x <- factor(letters[1:3])
x[3] = "d"
Warning message:
In `[<-.factor`(`*tmp*`, 3, value = "d") :
invalid factor level, NA generated
x
[1] a b <NA>
Levels: a b c
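To make the mechanism concrete: a factor only accepts values that are already in its levels, so the new label has to be registered before it can be assigned. A minimal base R sketch of that idea, reusing the toy vector x from above (the label "d" is just the example value):

```r
# Base R sketch: register the new level first, then assign it.
x <- factor(letters[1:3])
levels(x) <- c(levels(x), "d")  # append "d" to the existing levels
x[3] <- "d"                     # "d" is now a valid level, so no NA is produced
x
levels(x)
```

Note that "c" stays in the levels even though no element uses it any more; droplevels(x) removes unused levels if that matters.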
One way around this is to convert the column to character first and then replace:
newdata <- dat %>% mutate(newanimal=replace(as.character(animal), animal=='cat' & size=='big', "fatcat"))
newdata
animal size newanimal
1 cat big fatcat
2 cat big fatcat
3 dog big dog
4 cat small cat
Your new column is a character vector now, but you can always convert it back to a factor if you need that.
str(newdata)
'data.frame': 4 obs. of 3 variables:
$ animal : Factor w/ 2 levels "cat","dog": 1 1 2 1
$ size : Factor w/ 2 levels "big","small": 1 1 1 2
$ newanimal: chr "fatcat" "fatcat" "dog" "cat"
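If a factor is what you need in the end, the conversion back is a single call to factor(). A small base R sketch under the assumption that the default level order (sorted unique values) is acceptable; the toy data is rebuilt here only so the snippet runs on its own:

```r
# Rebuild the toy data with factor columns so the snippet is self-contained
newdata <- data.frame(
  animal = factor(c("cat", "cat", "dog", "cat")),
  size   = factor(c("big", "big", "big", "small"))
)

# Replace on the character version, as above
newdata$newanimal <- replace(as.character(newdata$animal),
                             newdata$animal == "cat" & newdata$size == "big",
                             "fatcat")

# Convert back: levels are taken from the values present, in sorted order
newdata$newanimal <- factor(newdata$newanimal)
levels(newdata$newanimal)
```

Pass levels = explicitly to factor() if you want a different level order.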
Another option in the tidyverse is to use forcats::fct_expand to add the new level and then pipe the resulting vector into the original replace call, which will now work as expected. The new variable is a factor, so no further conversion is necessary (assuming your desired output is a factor).
library(tidyverse)
dat <- dat %>%
  mutate(newanimal = fct_expand(animal, "fatcat") %>%
           replace(., animal == "cat" & size == "big", "fatcat")
  )
glimpse(dat)
Observations: 4
Variables: 3
$ animal <fct> cat, cat, dog, cat
$ size <fct> big, big, big, small
$ newanimal <fct> fatcat, fatcat, dog, cat
If you use this kind of factor replacement a lot, you could write your own helper function:
replace_fct <- function(x, list, values) {
  # add any new values as levels, then replace as usual
  .x <- forcats::fct_expand(x, unique(values))
  replace(.x, list, values)
}
And then do:
dat %>%
mutate(newanimal = replace_fct(animal, animal == "cat" & size == "big", "fatcat")
)
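If you would rather not depend on forcats for this, the same idea can be sketched in base R. The name replace_fct_base is hypothetical (it is not part of any package), and the sketch assumes x is already a factor:

```r
# Hypothetical base R variant of the helper above (not from any package)
replace_fct_base <- function(x, list, values) {
  stopifnot(is.factor(x))
  # Keep the existing levels in order and append any values that are new
  x <- factor(x, levels = union(levels(x), values))
  replace(x, list, values)
}

# Usage on vectors shaped like the toy data
animal <- factor(c("cat", "cat", "dog", "cat"))
size   <- factor(c("big", "big", "big", "small"))
out <- replace_fct_base(animal, animal == "cat" & size == "big", "fatcat")
out
```

Because it works on plain vectors, it can be used inside mutate() in the same way as replace_fct.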
You can try this
library(tidyverse)
dat <- tibble(animal = c("cat","dog","cat","dog","dog","dog"),
size = c("big", "small", "big", "big", "big","big"))
dat %>% mutate(new_animal = ifelse(animal=='cat' & size=='big','fatcat',animal) ) %>%
mutate_if(is.character, as.factor)
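One caveat: this works as written because the tibble columns are character vectors. If animal is a factor, as in the question, ifelse() drops the factor attributes and you end up with the underlying integer codes rather than the labels, so it is safer to convert explicitly. A minimal base R sketch, assuming factor inputs like the original toy data:

```r
# Factor inputs, as in the question
animal <- factor(c("cat", "cat", "dog", "cat"))
size   <- factor(c("big", "big", "big", "small"))

# as.character() makes ifelse() return the labels instead of integer codes
new_animal <- ifelse(animal == "cat" & size == "big",
                     "fatcat",
                     as.character(animal))

# Convert the result to a factor if that is the desired type
new_animal <- factor(new_animal)
new_animal
```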