Reshape data frame with no idvar - R

Suppose I have this data frame:

name <- rep(LETTERS[seq(from=1, to =2)], each=3)
MeasA <- c(1:6)
MeasB <- c(7:12)

df <- data.frame(name, MeasA, MeasB)

And I want to reshape it into a format that has no idvar, like this:

MeasA_A MeasB_A MeasA_B MeasB_B
 1        7        4      10
 2        8        5      11
 3        9        6      12

I have been reading about reshape and melt:

Reshaping data frame with duplicates

http://seananderson.ca/2013/10/19/reshape.html

But with those functions I need to specify an idvar. I've tried:

tt <- reshape(df, timevar = "name", direction="wide")

and

tt <- dcast(df, ~name)

But they clearly don't work. Perhaps I need to use split (Split data.frame based on levels of a factor into new data.frames) and then a reshape?

We could split the data.frame into a list by the 'name' column and cbind the list elements. We can change the column names using sub or paste.

res <- do.call(cbind,split(df[-1], df$name))
colnames(res) <- sub('([^.]+)\\.([^.]+)', '\\2_\\1', colnames(res))
res
#  MeasA_A MeasB_A MeasA_B MeasB_B
#1       1       7       4      10
#2       2       8       5      11
#3       3       9       6      12
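
The paste route mentioned above is not shown in the answer, so here is a minimal sketch of how it might look. It assumes the same df as in the question; parts and res2 are names introduced here purely for illustration. Each piece is renamed before binding, so no regular expression is needed afterwards.

```r
# Rebuild the example data from the question
name  <- rep(LETTERS[1:2], each = 3)
MeasA <- 1:6
MeasB <- 7:12
df <- data.frame(name, MeasA, MeasB)

# Split the measurement columns by 'name' (one data.frame per group)
parts <- split(df[-1], df$name)

# Rename each piece as <column>_<group> with paste0 before binding
for (g in names(parts)) {
  names(parts[[g]]) <- paste0(names(parts[[g]]), "_", g)
}

# unname() keeps cbind from prefixing the list names onto the columns
res2 <- do.call(cbind, unname(parts))
rownames(res2) <- NULL
res2
```

This should give the same four columns as the sub version: MeasA_A, MeasB_A, MeasA_B, MeasB_B.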

If we want to use dcast, we may need to create a sequence column grouped by 'name'. Here, I am using dcast from the devel version of 'data.table', i.e. v1.9.5, as it can take multiple value.var columns. Instructions to install the devel version are here. We convert the 'data.frame' to a 'data.table' (setDT(df)), create the sequence column ('i1') grouped by 'name', then use dcast and specify the value.var columns.

library(data.table)#v1.9.5+
setDT(df)[, i1:= 1:.N, by = name]
dcast(df, i1~name, value.var=c('MeasA', 'MeasB'))[, i1:= NULL][]
#   MeasA_A MeasA_B MeasB_A MeasB_B
#1:       1       4       7      10
#2:       2       5       8      11
#3:       3       6       9      12

In a similar way, we can use reshape from base R. We create the sequence column using ave and use that as the 'idvar' in reshape.

df1 <- transform(df, i1= ave(seq_along(name), name, FUN=seq_along))
reshape(df1, idvar='i1', timevar='name', direction='wide')[-1]
#  MeasA.A MeasB.A MeasA.B MeasB.B
#1       1       7       4      10
#2       2       8       5      11
#3       3       9       6      12
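
To make the role of the helper column more concrete, the sketch below prints the within-group counter that ave produces and then passes sep = "_" to reshape, so the wide column names match the underscore format asked for in the question. This is an illustrative variant under the same example data, not the answer's exact code; wide is a name introduced here.

```r
# Rebuild the example data from the question
name <- rep(LETTERS[1:2], each = 3)
df <- data.frame(name, MeasA = 1:6, MeasB = 7:12)

# Within-group row counter: restarts at 1 for every value of 'name'
i1 <- ave(seq_along(df$name), df$name, FUN = seq_along)
i1
# 1 2 3 1 2 3

# Rows sharing the same i1 end up on the same row of the wide result
df1  <- cbind(df, i1)
wide <- reshape(df1, idvar = "i1", timevar = "name",
                direction = "wide", sep = "_")

# Drop the helper id by name rather than by position
wide$i1 <- NULL
rownames(wide) <- NULL
wide
```

Dropping i1 by name instead of with [-1] is a small design choice: it keeps working if the column order of the reshaped result ever differs.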
