Changing factor levels with dplyr mutate
This is probably simple and I feel silly asking. I want to change the levels of a factor in a data frame using mutate. A simple example:
library("dplyr")
dat <- data.frame(x = factor("A"), y = 1)
mutate(dat,levels(x) = "B")
I get:
Error: Unexpected '=' in "mutate(dat,levels(x) ="
Why doesn't this work? How can I change factor levels with mutate?
I'm not quite sure I understand your question correctly, but if you want to change the factor levels of cyl with mutate(), you can do:
df <- mtcars %>% mutate(cyl = factor(cyl, levels = c(4, 6, 8)))
You would get:
#> str(df$cyl)
# Factor w/ 3 levels "4","6","8": 2 2 1 2 3 2 3 1 1 2 ...
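Applied to the example from the question, the same idea can be sketched as follows. The call in the question fails at parse time because `levels(x)` is not a valid argument name; mutate() expects `column = value`. The names `dat_b` and `dat_c` are made up for this illustration, and this is only one possible approach.

```r
library(dplyr)

dat <- data.frame(x = factor("A"), y = 1)

# Assign to the column x and rebuild the factor, giving level "A" the label "B"
dat_b <- mutate(dat, x = factor(x, levels = "A", labels = "B"))
levels(dat_b$x)

# Alternative: call the replacement function `levels<-` directly inside mutate()
dat_c <- mutate(dat, x = `levels<-`(x, "B"))
levels(dat_c$x)
```

Both variants leave y untouched and only rename the single level of x.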
Maybe you are looking for the plyr::revalue function:
mutate(dat, x = plyr::revalue(x, c("A" = "B")))
See also plyr::mapvalues.
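For comparison, here is a small sketch of both plyr functions on a factor with two levels. It assumes the plyr package is installed; the data frame and the names `x_revalued` and `x_mapped` are invented for this example. The functions are called with the `plyr::` prefix so that plyr does not have to be attached alongside dplyr.

```r
# Example data: a factor with levels "A" and "C"
dat <- data.frame(x = factor(c("A", "C", "A")), y = 1:3)

# revalue() takes a named vector: names are old values, elements are new values
x_revalued <- plyr::revalue(dat$x, c("A" = "B"))

# mapvalues() takes the old and new values as two separate vectors
x_mapped <- plyr::mapvalues(dat$x, from = "A", to = "B")

levels(x_revalued)
levels(x_mapped)
```

Levels that are not mentioned ("C" here) are left as they are.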
You can use dplyr's recode function.
df <- iris %>%
  mutate(Species = recode(Species, setosa = "SETOSA",
                          versicolor = "VERSICOLOR",
                          virginica = "VIRGINICA"
                          )
  )
I can't comment because I don't have enough reputation, but recode only works on a vector, so the code above in @Stefano's answer should be
df <- iris %>%
  mutate(Species = recode(Species,
                          setosa = "SETOSA",
                          versicolor = "VERSICOLOR",
                          virginica = "VIRGINICA")
  )
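As a rough sketch of how recode applies to the data from the question, the example below renames one level and leaves the other alone. The two-level factor and the name `dat2` are assumptions made for illustration only.

```r
library(dplyr)

# Example data: a factor with levels "A" and "C"
dat <- data.frame(x = factor(c("A", "C", "A")), y = 1:3)

# recode() matches by name: old level on the left, new label on the right
dat2 <- dat %>% mutate(x = recode(x, A = "B"))

levels(dat2$x)
as.character(dat2$x)
```

Because the mapping is given by name, it does not depend on the order in which the levels are stored.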
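As I understand it, the currently accepted answer only changes the order of the factor levels, not the actual labels (i.e., what the levels of the factor are called). To illustrate the difference between levels and labels, consider the following example: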
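Convert cyl to a factor (specifying the levels is not strictly necessary here, since by default they are coded in alphanumeric order):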
mtcars2 <- mtcars %>% mutate(cyl = factor(cyl, levels = c(4, 6, 8)))
mtcars2$cyl[1:5]
#[1] 6 6 4 6 8
#Levels: 4 6 8
Change the order of the levels (but not the labels themselves: cyl is still the same column):
mtcars3 <- mtcars2 %>% mutate(cyl = factor(cyl, levels = c(8, 6, 4)))
mtcars3$cyl[1:5]
#[1] 6 6 4 6 8
#Levels: 8 6 4
all(mtcars3$cyl==mtcars2$cyl)
#[1] TRUE
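What does change when the levels are reordered is the integer code stored behind each value. A minimal base R sketch, using the first five cyl values shown above as a stand-alone vector (the names `f_asc` and `f_desc` are made up here):

```r
# First five cyl values from the output above
cyl <- c(6, 6, 4, 6, 8)

f_asc  <- factor(cyl, levels = c(4, 6, 8))    # like mtcars2$cyl
f_desc <- factor(f_asc, levels = c(8, 6, 4))  # like mtcars3$cyl

# The labels seen by the user are the same in both factors
all(as.character(f_asc) == as.character(f_desc))

# The underlying integer codes follow the position of each level
as.integer(f_asc)   # 2 2 1 2 3
as.integer(f_desc)  # 2 2 3 2 1
```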
Assign new labels to cyl. The order of the labels was: c(8, 6, 4), so we specify the new labels as follows:
mtcars4 <- mtcars3 %>% mutate(cyl = factor(cyl, labels = c("new_value_for_8",
                                                           "new_value_for_6",
                                                           "new_value_for_4")))
mtcars4$cyl[1:5]
#[1] new_value_for_6 new_value_for_6 new_value_for_4 new_value_for_6 new_value_for_8
#Levels: new_value_for_8 new_value_for_6 new_value_for_4
Note how this column differs from the earlier ones:
all(as.character(mtcars4$cyl)!=mtcars3$cyl)
#[1] TRUE
#Note: TRUE here indicates that all values are unequal because I used != instead of ==
#as.character() was required as the levels were numeric and thus not comparable to a character vector
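More details:
If we were to change the labels of cyl using mtcars2 instead of mtcars3, we would need to specify the labels differently to get the same result. The order of the labels for mtcars2 was: c(4, 6, 8), so we specify the new labels as follows: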
#change labels of mtcars2 (order used to be: c(4, 6, 8))
mtcars5 <- mtcars2 %>% mutate(cyl = factor(cyl, labels = c("new_value_for_4",
                                                           "new_value_for_6",
                                                           "new_value_for_8")))
Unlike mtcars3$cyl and mtcars4$cyl, the labels of mtcars4$cyl and mtcars5$cyl are therefore identical, even though their levels are in a different order.
mtcars4$cyl[1:5]
#[1] new_value_for_6 new_value_for_6 new_value_for_4 new_value_for_6 new_value_for_8
#Levels: new_value_for_8 new_value_for_6 new_value_for_4
mtcars5$cyl[1:5]
#[1] new_value_for_6 new_value_for_6 new_value_for_4 new_value_for_6 new_value_for_8
#Levels: new_value_for_4 new_value_for_6 new_value_for_8
all(mtcars4$cyl==mtcars5$cyl)
#[1] TRUE
levels(mtcars4$cyl) == levels(mtcars5$cyl)
#[1] FALSE  TRUE FALSE
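Since labels are matched to levels by position, the result depends on the current level order. One way to make the relabelling independent of that order is to pass levels and labels together. The sketch below uses base R only; the `lookup` vector and the `relabel()` helper are hypothetical names introduced for this example, not part of any package.

```r
# First five cyl values, stored with two different level orders
cyl <- c(6, 6, 4, 6, 8)
f_asc  <- factor(cyl, levels = c(4, 6, 8))
f_desc <- factor(cyl, levels = c(8, 6, 4))

# Hypothetical lookup: names are the old levels, elements are the new labels
lookup <- c("4" = "new_value_for_4",
            "6" = "new_value_for_6",
            "8" = "new_value_for_8")

# Hypothetical helper: pairs each old level with its label explicitly
relabel <- function(f, lookup) {
  factor(f, levels = names(lookup), labels = unname(lookup))
}

r_asc  <- relabel(f_asc, lookup)
r_desc <- relabel(f_desc, lookup)

# Same labels and same level order, whichever factor we started from
all(r_asc == r_desc)
levels(r_asc)
levels(r_desc)
```

Note that this also resets the level order to the order of `lookup`, which may or may not be what you want.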