
Using dcast in R to transform data frame from long to wide format not working

I'm trying to transform a data frame from long to wide format using the dcast function.

Here is the starting data frame:

convID     var      value
aa         in       1
ab         in       1
aa         id       4/29/14
ab         id       4/20/14
aa         it       Impr
ab         it       Impr
aa         ic       Display
ab         ic       Display
aa         in       2
ab         in       2
aa         id       4/25/14
ab         id       4/24/14
aa         it       Impr
ab         it       Click
aa         ic       Display
ab         ic       SEM

The desired data frame is shown below. The first set of id, it, and ic values (the top half of the starting data) corresponds to in=1, and the second set (the bottom half) corresponds to in=2:

convID     in     id          it       ic
aa         1      4/29/14     Impr     Display
ab         1      4/20/14     Impr     Display
aa         2      4/25/14     Impr     Display
ab         2      4/24/14     Click    SEM

However, I'm not able to get the desired data frame using the dcast function. I tried many times, and the closest I got was the following:

dcast(df,convID~var, value.var="value", fun.aggregate=max)

convID     in     id          it       ic
aa         2      4/29/14     Impr     Display
ab         2      4/24/14     Impr     SEM

This is obviously not right: it returns the max values of in, id, it, and ic, and the proper assignments of in=1 and in=2 are disregarded. Additionally, I'm missing half my data. Any advice would be greatly appreciated!

#Here is code to produce the starting data frame:
convID<-c("aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab")  
var<-c("in", "in", "id", "id", "it", "it", "ic", "ic","in", "in", "id", "id", "it", "it", "ic", "ic")
value<-c("1", "1", "4/29/14", "4/20/14", "Impr", "Impr", "Display", "Display", "2", "2", "4/25/14", "4/24/14", "Impr", "Click", "Display", "SEM")
df<-data.frame(convID, var, value)
df$value<-as.character(df$value) 

Your problem is that in is not already a variable (a column) in your data frame; it only appears as rows of var. I changed the name to inval because in is a reserved word in R, which makes a variable called in awkward to use inside within().

I generated inval by using zoo::na.locf to set the value for each row to the most recent previously specified value:

library(zoo)
df <- within(df,{
    inval <- ifelse(var=="in",value,NA)
    inval <- na.locf(inval)
})

This results in:

str(df)
## 'data.frame':    16 obs. of  4 variables:
##  $ convID: Factor w/ 2 levels "aa","ab": 1 2 1 2 1 2 1 2 1 2 ...
##  $ var   : Factor w/ 4 levels "ic","id","in",..: 3 3 2 2 4 4 1 1 3 3 ...
##  $ value : chr  "1" "1" "4/29/14" "4/20/14" ...
##  $ inval : chr  "1" "1" "1" "1" ...
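
To see what the fill step does in isolation, here is a minimal base-R sketch of the same carry-forward idea on a toy vector. It is not the zoo implementation: the helper name carry_forward is hypothetical, and it assumes the first element is not NA.

```r
# Hypothetical helper: carry the last non-NA value forward (base R only).
# Assumes the first element of x is not NA.
carry_forward <- function(x) {
  filled <- !is.na(x)
  # cumsum(filled) gives, for each position, how many non-NA values have
  # been seen so far, i.e. the index of the most recent non-NA value
  # within x[filled].
  x[filled][cumsum(filled)]
}

marker <- c("1", "1", NA, NA, "2", NA)
carry_forward(marker)
## [1] "1" "1" "1" "1" "2" "2"
```

Each NA takes the value of the nearest non-NA entry above it, which is how every id, it, and ic row ends up tagged with the in value that precedes it.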

Then it's easy to dcast:

library(reshape2)
dcast(subset(df,var!="in"),convID+inval~...)
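
Putting the steps together, the following is a sketch of the whole pipeline on the question's data, assuming the zoo and reshape2 packages are installed. Spelling out var and value.var instead of relying on ..., and passing stringsAsFactors = FALSE, are my choices rather than part of the original answer.

```r
library(zoo)       # na.locf()
library(reshape2)  # dcast()

# Same data as in the question, built as character columns
df <- data.frame(
  convID = rep(c("aa", "ab"), times = 8),
  var    = rep(rep(c("in", "id", "it", "ic"), each = 2), times = 2),
  value  = c("1", "1", "4/29/14", "4/20/14", "Impr", "Impr", "Display", "Display",
             "2", "2", "4/25/14", "4/24/14", "Impr", "Click", "Display", "SEM"),
  stringsAsFactors = FALSE
)

# Keep the value only on the "in" marker rows, then carry it forward
df$inval <- na.locf(ifelse(df$var == "in", df$value, NA))

# Drop the marker rows and spread the remaining variables into columns
wide <- dcast(subset(df, var != "in"), convID + inval ~ var, value.var = "value")
wide
```

The result should have one row per convID and inval combination (four rows here), with id, it, and ic as columns, so no aggregation function is needed.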
