How to “unmelt” data with reshape in R
I have a data frame that I melted using the reshape package and that I would now like to "unmelt".
Here is a toy example of the melted data (the real data frame is 500x100 or larger):
variable<-c(rep("X1",3),rep("X2",3),rep("X3",3))
value<-c(rep(rnorm(1,.5,.2),3),rep(rnorm(1,.5,.2),3),rep(rnorm(1,.5,.2),3))
dat <-data.frame(variable,value)
dat
variable value
1 X1 0.5285376
2 X1 0.5285376
3 X1 0.5285376
4 X2 0.1694908
5 X2 0.1694908
6 X2 0.1694908
7 X3 0.7446906
8 X3 0.7446906
9 X3 0.7446906
Each variable (X1, X2, X3) has values estimated at 3 different times (in this toy example the three values happen to be identical, but in the real data they never are).
I would like to get it (back) into this form:
X1 X2 X3
1 0.5285376 0.1694908 0.7446906
2 0.5285376 0.1694908 0.7446906
3 0.5285376 0.1694908 0.7446906
Basically, I would like the variable column to be sorted on ID (X1, X2, etc.) and become column headings. I have tried various permutations of cast, dcast, recast, etc., and can't seem to get the data into the format that I want. It was easy enough to "melt" the data from the wide form to the long form (e.g. the dat dataset), but getting it back is proving difficult. Any ideas? I know this is relatively simple, but I am having a hard time conceptualizing how to do this in reshape or reshape2.
Thanks, LP
I typically do this by creating an id column and then using dcast:
> dat
variable value
1 X1 0.4299397
2 X1 0.4299397
3 X1 0.4299397
4 X2 0.2531551
5 X2 0.2531551
6 X2 0.2531551
7 X3 0.3972119
8 X3 0.3972119
9 X3 0.3972119
> dat$id <- rep(1:3,times = 3)
> dcast(data = dat,formula = id~variable,fun.aggregate = sum,value.var = "value")
id X1 X2 X3
1 1 0.4299397 0.2531551 0.3972119
2 2 0.4299397 0.2531551 0.3972119
3 3 0.4299397 0.2531551 0.3972119
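The id column matters because the melted data no longer records which row of the wide table each value came from; dcast needs something to put on the left-hand side of the formula. As a minimal self-contained sketch of the same idea (assuming the reshape2 package is installed, and using made-up random values rather than the data above): when every id/variable pair holds exactly one value, the aggregation function can be left out.

```r
library(reshape2)  # assumption: reshape2 is installed

set.seed(1)  # illustrative data only
dat <- data.frame(
  variable = rep(c("X1", "X2", "X3"), each = 3),
  value    = rnorm(9, mean = 0.5, sd = 0.2)
)

# Index of the measurement occasion within each variable.
# This relies on the rows being grouped by variable, 3 rows each.
dat$id <- rep(1:3, times = 3)

# One value per (id, variable) cell, so no fun.aggregate is needed.
wide <- dcast(dat, id ~ variable, value.var = "value")
wide
```

Leaving out fun.aggregate has a side benefit: if the ids were built incorrectly and some cell held more than one value, dcast would report that it is falling back to counting instead of silently summing the duplicates.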
Depending on how robust you need this to be, the following will cast correctly when the variables occur a varying number of times (and in any order).
> variable<-c(rep("X1",5),rep("X2",4),rep("X3",3))
> value<-c(rep(rnorm(1,.5,.2),5),rep(rnorm(1,.5,.2),4),rep(rnorm(1,.5,.2),3))
> dat <-data.frame(variable,value)
> dat <- dat[order(rnorm(nrow(dat))),]
> dat
variable value
11 X3 1.0294454
8 X2 0.6147509
2 X1 0.3537012
7 X2 0.6147509
9 X2 0.6147509
5 X1 0.3537012
4 X1 0.3537012
12 X3 1.0294454
3 X1 0.3537012
1 X1 0.3537012
10 X3 1.0294454
6 X2 0.6147509
> dat$id = numeric(nrow(dat))
> for (i in 1:nrow(dat)){
+ dat_temp <- dat[1:i,]
+ dat[i,]$id <- nrow(dat_temp[dat_temp$variable == dat[i,]$variable,])
+ }
> cast(dat, id~variable, value = 'value')
id X1 X2 X3
1 1 0.3537012 0.6147509 1.029445
2 2 0.3537012 0.6147509 1.029445
3 3 0.3537012 0.6147509 1.029445
4 4 0.3537012 0.6147509 NA
5 5 0.3537012 NA NA
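The loop above assigns each row a running count of how many times its variable has appeared so far. As a hedged alternative sketch (not part of the original answer), the same ids can be computed without a loop using base R's ave(); the data here are made up for illustration.

```r
set.seed(42)  # illustrative data only
variable <- c(rep("X1", 5), rep("X2", 4), rep("X3", 3))
value    <- rnorm(length(variable), mean = 0.5, sd = 0.2)
dat      <- data.frame(variable, value)
dat      <- dat[sample(nrow(dat)), ]  # shuffle the rows

# Within each variable, number the rows 1, 2, 3, ... in their current order.
dat$id <- ave(seq_along(dat$variable), dat$variable, FUN = seq_along)
dat
```

The resulting id column can then be passed to the same cast (or dcast) call as above. For a 500x100 data set this avoids re-subsetting the data frame once per row, which is what makes the loop slow as the data grow.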