Changing factor levels in R using variables for both the factor name and the level order in a data frame

Question

I have a large data frame (data frame 1) with many columns that are factors. I want to change the level order of each factor.

I have a lookup data frame (data frame 2) that holds the correct level order for each factor. This means I can look up a factor in it by name, using a variable, grab its level order, and store that order in another variable. So far so good.

Simplified example:

library(tibble)

d = tibble(
  size = c('small','small','big', NA)
)
d$size = as.factor(d$size)

levels(d$size) # "big" "small" (alphabetical): not what I want.

proper.order = c('small', 'big') # this comes from somewhere else

I can use proper.order to change one column in d.

d$size = factor(d$size, levels = proper.order)

levels(d$size) # What I want.

I want to refer to the column name (size) using a variable.

This doesn't work:

my.column = 'size'

d[names(d) == my.column] = factor(d[names(d) == my.column], levels = proper.order, exclude = NULL)


levels(d$size) # What I want.
d # Not what I want.

I expect to see the factor levels reordered, and they are. I also expect the factor to keep its values (obviously), but they are all set to NA.

I suspect this is because d[names(d) == my.column] is a tibble, not a factor. But then why do the factor levels change? And how can I reach into the tibble and grab the factor?

Answer 1

For multiple columns, we can specify them in mutate_at:

library(dplyr)
d %>% 
   mutate_at(vars(my.column), 
        list(~ factor(., levels = proper.order, exclude = NULL)))
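
Since dplyr 1.0.0, mutate_at is superseded by across(). The sketch below shows how the same idea might look with across() when each column has its own level order. The second column (speed) and the named list level.lookup are hypothetical stand-ins for the real data and lookup data frame, which the question does not show.

```r
library(dplyr)
library(tibble)

# Example data: two factor columns whose default levels are alphabetical.
d <- tibble(
  size  = factor(c('small', 'small', 'big', NA)),
  speed = factor(c('fast', 'slow', 'slow', 'fast'))  # hypothetical extra column
)

# Hypothetical lookup: one character vector of levels per column name.
level.lookup <- list(
  size  = c('small', 'big'),
  speed = c('slow', 'fast')
)

# all_of() selects columns named in a character vector;
# cur_column() gives the name of the column currently being processed.
d2 <- d %>%
  mutate(across(
    all_of(names(level.lookup)),
    ~ factor(.x, levels = level.lookup[[cur_column()]])
  ))

levels(d2$size)
levels(d2$speed)
```

If the lookup really is a data frame, it would first need to be reshaped into something indexable by column name, such as the list assumed here.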

Or with fct_relevel from forcats:

library(forcats)
d %>%
    mutate_at(vars(my.column), list(~ fct_relevel(., proper.order)))
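
As for why the original attempt produced NA values: single-bracket indexing such as d[names(d) == my.column] returns a one-column data frame, not the factor inside it. As far as I can tell, factor() then coerces that whole object with as.character(), which yields a single string that matches none of the requested levels, so the result is one NA that gets recycled down the column. The levels still change because they come straight from the levels argument. A minimal base R sketch of the double-bracket alternative follows. It uses a plain data.frame to stay dependency-free, on the assumption that `[[` extracts a column from a tibble in the same way.

```r
# Rebuild the example from the question as a base data.frame.
d <- data.frame(size = c('small', 'small', 'big', NA), stringsAsFactors = TRUE)
proper.order <- c('small', 'big')
my.column <- 'size'

# `[` keeps the data-frame wrapper; `[[` extracts the column itself.
class(d[my.column])    # "data.frame"
class(d[[my.column]])  # "factor"

# Relevel the extracted factor and assign it back by name.
d[[my.column]] <- factor(d[[my.column]], levels = proper.order)

levels(d$size)  # "small" "big"
d               # values are kept; only the level order changed
```

The same assignment can sit inside a loop over column names when several factors need different orders.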


Answer 1
2, accepted, 2019-04-11 16:38:27
