Reshape DF from long to wide in R using reshape2 without an aggregation function

A common task in the data I work with is reshaping client data from long to wide. I have a process for this using base R's reshape(), outlined below, which creates new (but unmodified) columns with a numeric index appended. In my case I do not want to modify or aggregate the data. Because I often use reshape2 for other operations, my question is how this can be accomplished with dcast. It does not seem that the example data need to be melted by id, for example, but I'm not sure how to go about making it wide. Would anyone be able to provide reshape2 code that produces a data frame comparable to "wide" in the example below?

Thanks.

Example

date_up   <- as.numeric(as.Date("1990/01/01"))
date_down <- as.numeric(as.Date("1960/01/01"))
ids <- data.frame(id=rep(1:1000, 3),site=rep(c("NMA", "NMB","NMC"), 1000))
ids <- ids[order(ids$id), ]
dates <-  data.frame(datelast=runif(3000, date_down, date_up),
          datestart=runif(3000, date_down, date_up),
          dateend=runif(3000, date_down, date_up),
          datemiddle=runif(3000, date_down, date_up))
# Convert the three listed columns to Date in place (datelast stays numeric)
date_cols <- c("datestart", "dateend", "datemiddle")
dates[date_cols] <- lapply(dates[date_cols], as.Date, origin = "1970-01-01")
df <- cbind(ids, dates)

# Make a within group index and reshape df
df$gid <- with(df, ave(rep(1, nrow(df)), df[,"id"], FUN = seq_along))
wide <- reshape(df, idvar = "id", timevar = "gid", direction = "wide")

We can use dcast from data.table, which can take multiple value.var columns. Convert the data.frame to a data.table with setDT(df), then call dcast with the formula and value.var specified.

library(data.table)
dcast(setDT(df), id~gid, value.var=names(df)[2:6])
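To make the shape of the result easier to see, here is a minimal sketch of the same call on a tiny table. The table toy and its columns site and start are made up for illustration and are not part of the question's data.

```r
library(data.table)

# Hypothetical toy data: two ids, two rows each, mixed column types
toy <- data.table(
  id    = c(1L, 1L, 2L, 2L),
  site  = c("NMA", "NMB", "NMA", "NMB"),
  start = as.Date(c("1980-01-01", "1981-01-01", "1982-01-01", "1983-01-01"))
)

# Within-group index, playing the role of gid in the question
toy[, gid := seq_len(.N), by = id]

# One output column per (value column, gid) pair; each id/gid cell holds
# exactly one row, so no aggregation function is needed
wide_toy <- dcast(toy, id ~ gid, value.var = c("site", "start"))
print(wide_toy)
```

The new columns are named by joining each value.var name and the gid value with an underscore (site_1, site_2, start_1, start_2), whereas base reshape() joins them with a dot.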

NOTE: The data.table method should be faster than the reshape2 one.
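For comparison, a reshape2-only version might look like the sketch below. It is an assumption about how one could approach it, not part of the answer above, and it uses a made-up all-numeric toy frame. The caveat is that reshape2's melt stacks all measure columns into a single value column, so columns of different types (for example site and the Date columns in the question) would be coerced to a common type and Dates would lose their class.

```r
library(reshape2)

# Hypothetical toy data with numeric measure columns only
toy <- data.frame(
  id  = c(1, 1, 2, 2),
  gid = c(1, 2, 1, 2),   # within-group index
  x   = c(10, 11, 20, 21),
  y   = c(0.1, 0.2, 0.3, 0.4)
)

# Step 1: melt to fully long form, keeping id and gid as identifiers
long <- reshape2::melt(toy, id.vars = c("id", "gid"))

# Step 2: cast back to wide with one column per variable/gid pair;
# each cell maps to exactly one row, so nothing is aggregated
wide_toy <- reshape2::dcast(long, id ~ variable + gid, value.var = "value")
print(wide_toy)
```

The functions are called with the reshape2:: prefix so the sketch behaves the same if data.table, which also provides melt and dcast, is loaded in the same session.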
