How to return values from group_by in R dplyr?

Question

Good morning,

I've got a two-column dataset that I'd like to spread into more columns based on a group_by in dplyr, but I'm not sure how.

My data looks like:

Person     Case
John       A
John       B
Bill       C
David      F

I'd like to be able to transform it to the following structure:

Person  Case_1  Case_2 ... Case_n
John    A       B
Bill    C       NA
David   F       NA

My original thought was along the lines of:

data %>%
  group_by(Person) %>%
  spread()

Error: Please supply column name

What's the easiest, or most R-like, way to achieve this?

Answer 1

You should first add a case id to the dataset, which can be done with a combination of group_by and mutate:

library(dplyr)  # group_by, mutate, row_number, select
library(tidyr)  # spread
dat = data.frame(Person = c('John', 'John', 'Bill', 'David'), Case = c('A', 'B', 'C', 'F'))
dat = dat %>% group_by(Person) %>% mutate(id = sprintf('Case_%d', row_number()))
dat %>% head()
# A tibble: 4 × 3
  Person   Case     id
  <fctr> <fctr>  <chr>
1   John      A Case_1
2   John      B Case_2
3   Bill      C Case_1
4  David      F Case_1

Now you can use spread to transform the data:

dat %>% spread(Person, Case)
# A tibble: 2 × 4
      id   Bill  David   John
*  <chr> <fctr> <fctr> <fctr>
1 Case_1      C      F      A
2 Case_2     NA     NA      B

You can get the structure you list above using:

res = dat %>% spread(Person, Case) %>% select(-id) %>% t() %>% as.data.frame()
names(res) = unique(dat$id)
res
      Case_1 Case_2
Bill       C   <NA>
David      F   <NA>
John       A      B
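
As a possible shortcut, spreading on id rather than on Person should give the requested layout directly, with Person kept as a column and no transpose needed. The sketch below is only an illustration under a few assumptions: the names dat_id, wide_spread and wide_pivot are made up for this example, stringsAsFactors = FALSE is set so the cases stay plain character strings, and the pivot_wider variant assumes tidyr 1.0.0 or later.

```r
library(dplyr)
library(tidyr)

dat <- data.frame(
  Person = c("John", "John", "Bill", "David"),
  Case   = c("A", "B", "C", "F"),
  stringsAsFactors = FALSE  # assumption: keep cases as character, not factor
)

# Number the cases within each person, then drop the grouping
dat_id <- dat %>%
  group_by(Person) %>%
  mutate(id = sprintf("Case_%d", row_number())) %>%
  ungroup()

# Variant 1: spread with id as the key, so each Case_n becomes a column
wide_spread <- dat_id %>% spread(id, Case)

# Variant 2 (assumes tidyr >= 1.0.0): the same reshape with pivot_wider
wide_pivot <- dat_id %>% pivot_wider(names_from = id, values_from = Case)

wide_spread
wide_pivot
```

Both results should have the columns Person, Case_1 and Case_2, with NA where a person has fewer cases than the maximum.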
