[英]How can I transpose data in each variable from long to wide using group_by? R
I have a dataframe with id variable name
.我有一个带有 id variable name
的 dataframe 。 I'm trying to figure out a way to transpose each variable in the dataframe by name.我试图找出一种方法来按名称转置 dataframe 中的每个变量。
My current df is below:我当前的df如下:
name jobtitle companyname datesemployed empduration joblocation jobdescrip
1 David… Project… EOS IT Man… Aug 2018 – P… 1 yr 9 mos San Franci… Coordinati…
2 David… Technic… Options Te… Sep 2017 – J… 5 mos Belfast, U… Working wi…
3 David… Data An… NA Jan 2018 – J… 6 mos Belfast, U… Working wi…
However, I'd like a dataframe in which there is only one row for name, and every observation for name becomes its own column, like below:但是,我想要一个 dataframe ,其中名称只有一行,名称的每个观察值都成为自己的列,如下所示:
name jobtitle_1 companyname_1 datesemployed_1 empduration_1 joblocation_1 jobdescrip_1 job_title2 companyname_2 datesemployed_2 empduration_2 joblocation_2 jobdescrip_2
1 David… Project… EOS IT Man… Aug 2018 – P… 1 yr 9 mos San Franci… Coordinati… Technic… Options Te… Sep 2017 – J… 5 mos Belfast, U… Working wi…
I have used commands like gather_by
and melt
in the past to reshape from long to wide, but in this case, I'm not sure how to apply it, since every observation for the id variable will need to become its own column.我过去使用过诸如gather_by
和melt
之类的命令来从长到宽重塑,但在这种情况下,我不确定如何应用它,因为对id 变量的每个观察都需要成为它自己的列。
It sounds like you are looking for gather and pivot_wider .听起来您正在寻找collect 和 pivot_wider 。
I used my own sample data with two names:我使用了我自己的样本数据,有两个名称:
df <- tibble(name = c('David', 'David', 'David', 'Bill', 'Bill'),
jobtitle = c('PM', 'TPM', 'Analyst', 'Dev', 'Eng'),
companyname = c('EOS', 'Options', NA, 'Microsoft', 'Nintendo'))
First add an index column to distinguish the different positions for each name.首先添加一个索引列,以区分每个名称的不同位置。
indexed <- df %>%
group_by(name) %>%
mutate(.index = row_number())
indexed
# name jobtitle companyname .index
# <chr> <chr> <chr> <int>
# 1 David PM EOS 1
# 2 David TPM Options 2
# 3 David Analyst NA 3
# 4 Bill Dev Microsoft 1
# 5 Bill Eng Nintendo 2
Then it is possible to use gather
to get a long form, with one value per row.然后可以使用gather
得到一个长格式,每行一个值。
gathered <- indexed %>% gather('var', 'val', -c(name, .index))
gathered
# name .index var val
# <chr> <int> <chr> <chr>
# 1 David 1 jobtitle PM
# 2 David 2 jobtitle TPM
# 3 David 3 jobtitle Analyst
# 4 Bill 1 jobtitle Dev
# 5 Bill 2 jobtitle Eng
# 6 David 1 companyname EOS
# 7 David 2 companyname Options
# 8 David 3 companyname NA
# 9 Bill 1 companyname Microsoft
# 10 Bill 2 companyname Nintendo
Now pivot_wider
can be used to create a column for each variable and index.现在可以使用pivot_wider
为每个变量和索引创建一个列。
gathered %>% pivot_wider(names_from = c(var, .index), values_from = val)
# name jobtitle_1 jobtitle_2 jobtitle_3 companyname_1 companyname_2 companyname_3
# <chr> <chr> <chr> <chr> <chr> <chr> <chr>
# 1 David PM TPM Analyst EOS Options NA
# 2 Bill Dev Eng NA Microsoft Nintendo NA
Get the data in long format, create a unique column identifier and get it back to wide format.获取长格式数据,创建唯一的列标识符并将其恢复为宽格式。
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols = -name, names_to = 'col') %>%
group_by(name, col) %>%
mutate(row = row_number()) %>%
pivot_wider(names_from = c(col, row), values_from = value)
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.