R: How to use tidyverse to extract factor levels as numbers from a column and assign them to a new column?

Question

Suppose I have a data frame, df:

df = data.frame(name = rep(c("A", "B", "C"), each = 4))

I want to get a new data frame with one additional column named Group, in which each element is the numeric code of the corresponding level of name, as shown in df2.

I know case_when could do it. My issue is that my real data frame is quite complicated: the name column has many levels, and I would rather not list them case by case.

Is there an easier and smarter way to do it?

Thanks.

df2
   name Group
1     A     1
2     A     1
3     A     1
4     A     1
5     B     2
6     B     2
7     B     2
8     B     2
9     C     3
10    C     3
11    C     3
12    C     3

Answer 1

There are a few ways to do it in tidyverse:

library(tidyverse)

df %>% group_by(name) %>% mutate(Group = cur_group_id())

or

df %>% mutate(Group = as.numeric(as.factor(name)))
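
One thing to keep in mind with both approaches: the codes follow the sorted group or level order, not the order in which values first appear. For the example data the two coincide, but on unsorted data they can differ. A minimal sketch, assuming a hypothetical unsorted data frame df_unsorted (not part of the question):

```r
library(dplyr)

# Hypothetical input where "B" appears before "A"
df_unsorted <- data.frame(name = c("B", "B", "A", "C"))

res <- df_unsorted %>%
  group_by(name) %>%
  # group ids follow the sorted group keys: A = 1, B = 2, C = 3
  mutate(by_group = cur_group_id()) %>%
  ungroup() %>%
  mutate(
    # factor levels of a character column are sorted: A = 1, B = 2, C = 3
    by_level = as.integer(as.factor(name)),
    # setting the levels explicitly numbers values by first appearance:
    # B = 1, A = 2, C = 3
    by_appearance = as.integer(factor(name, levels = unique(name)))
  )

res
```

If name is already a factor, as.factor() keeps its existing levels, so the codes then reflect whatever level order the factor was created with.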

Answer 2

A couple of other simple solutions:

library(dplyr)

df %>%
  mutate(Group = match(name, unique(name)))
#>    name Group
#> 1     A     1
#> 2     A     1
#> 3     A     1
#> 4     A     1
#> 5     B     2
#> 6     B     2
#> 7     B     2
#> 8     B     2
#> 9     C     3
#> 10    C     3
#> 11    C     3
#> 12    C     3

df %>%
  mutate(Group = cumsum(name != lag(name, default = "")))
#>    name Group
#> 1     A     1
#> 2     A     1
#> 3     A     1
#> 4     A     1
#> 5     B     2
#> 6     B     2
#> 7     B     2
#> 8     B     2
#> 9     C     3
#> 10    C     3
#> 11    C     3
#> 12    C     3
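
Note that these two solutions agree here only because equal names sit next to each other. match() numbers distinct values by first appearance, whereas the cumsum() version numbers runs of consecutive equal values, so a name that reappears later gets a new number. A small sketch, assuming a hypothetical interleaved data frame df_mixed (not part of the question):

```r
library(dplyr)

# Hypothetical input where "A" reappears after "B"
df_mixed <- data.frame(name = c("A", "B", "A"))

res2 <- df_mixed %>%
  mutate(
    # one id per distinct value: the second "A" reuses id 1
    by_match = match(name, unique(name)),
    # one id per run of equal neighbours: the second "A" starts a new run
    by_run = cumsum(name != lag(name, default = ""))
  )

res2
#>   name by_match by_run
#> 1    A        1      1
#> 2    B        2      2
#> 3    A        1      3
```

Which one is appropriate depends on whether Group should identify a level (match) or a contiguous block of rows (cumsum).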

Answer 3

Using data.table:

df = data.frame(name = rep(c("A", "B", "C"), each = 4))

library(data.table)
setDT(df)[, grp := .GRP, by = name][]
#>     name grp
#>  1:    A   1
#>  2:    A   1
#>  3:    A   1
#>  4:    A   1
#>  5:    B   2
#>  6:    B   2
#>  7:    B   2
#>  8:    B   2
#>  9:    C   3
#> 10:    C   3
#> 11:    C   3
#> 12:    C   3

Created on 2022-02-10 by the reprex package (v2.0.1)

3 answers

Answer 1 (3, accepted): 2022-02-10 11:13:41
Answer 2 (1): 2022-02-10 11:36:57
Answer 3 (1): 2022-02-10 11:46:28
