Using dcast in R to transform a data frame from long to wide format is not working
I'm trying to transform a data frame from long to wide format using the dcast function.
Here is the starting data frame:
convID var value
aa in 1
ab in 1
aa id 4/29/2014
ab id 4/20/2014
aa it Impr
ab it Impr
aa ic Display
ab ic Display
aa in 2
ab in 2
aa id 4/25/2014
ab id 4/24/2014
aa it Impr
ab it Click
aa ic Display
ab ic SEM
The data frame I want is shown below, where the top half of the id, it, and ic values correspond to in=1 and the bottom half correspond to in=2:
convID in id it ic
aa 1 4/29/2014 Impr Display
ab 1 4/20/2014 Impr Display
aa 2 4/25/2014 Impr Display
ab 2 4/24/2014 Click SEM
However, I'm not able to get the desired data frame using the dcast function. I tried many times and the closest I got was the following:
dcast(df,convID~var, value.var="value", fun.aggregate=max)
convID in id it ic
aa 2 4/29/2014 Impr Display
ab 2 4/24/2014 Impr SEM
This is obviously not right: it returns the max values of in, id, it, and ic, and the proper assignments to in=1 and in=2 are disregarded. Additionally, I'm missing half my data. Any advice would be greatly appreciated!
#Here is code to produce the starting data frame:
convID<-c("aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab")
var<-c("in", "in", "id", "id", "it", "it", "ic", "ic","in", "in", "id", "id", "it", "it", "ic", "ic")
value<-c("1", "1", "4/29/14", "4/20/14", "Impr", "Impr", "Display", "Display", "2", "2", "4/25/14", "4/24/14", "Impr", "Click", "Display", "SEM")
df<-data.frame(convID, var, value)
df$value<-as.character(df$value)
Your problem is that in is not already a variable in your data frame. I changed the name to inval because in is a reserved word in R, and there are a few oddities associated with trying to use a variable called in inside within.
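To see why casting on convID alone cannot work, it may help to count how many rows fall into each convID/var cell. The following is a minimal base R sketch that rebuilds the question's data frame (the name cell_counts is only illustrative). Any count above 1 means dcast has to aggregate, which is what collapsed the data in the attempt with fun.aggregate=max.

```r
# Rebuild the data frame from the question (same values, written more compactly).
df <- data.frame(
  convID = rep(c("aa", "ab"), times = 8),
  var    = rep(rep(c("in", "id", "it", "ic"), each = 2), times = 2),
  value  = c("1", "1", "4/29/14", "4/20/14", "Impr", "Impr", "Display", "Display",
             "2", "2", "4/25/14", "4/24/14", "Impr", "Click", "Display", "SEM"),
  stringsAsFactors = FALSE
)

# Count rows per (convID, var) cell. A count above 1 means the pair
# does not identify a single value, so a cast on convID ~ var must aggregate.
cell_counts <- table(df$convID, df$var)
print(cell_counts)
```

Every cell here holds two rows, so a second identifier (the in value) is needed on the left-hand side of the formula.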
I generated inval by using zoo::na.locf to set the value for each row to the last previously specified value:
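library(zoo)
df <- within(df,{
# keep the value only on the "in" rows; every other row becomes NA
inval <- ifelse(var=="in",value,NA)
# carry the last non-NA value forward to fill the gaps
inval <- na.locf(inval)
})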
This results in:
str(df)
## 'data.frame': 16 obs. of 4 variables:
## $ convID: Factor w/ 2 levels "aa","ab": 1 2 1 2 1 2 1 2 1 2 ...
## $ var : Factor w/ 4 levels "ic","id","in",..: 3 3 2 2 4 4 1 1 3 3 ...
## $ value : chr "1" "1" "4/29/14" "4/20/14" ...
## $ inval : chr "1" "1" "1" "1" ...
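If zoo is not available, the same fill-forward step can be sketched in base R. This is only an illustration: fill_forward is a hypothetical helper, not part of any package, and it assumes the first element is not NA (true here, because the data starts with an "in" row) and that the rows are still in their original order.

```r
# Rebuild the data frame from the question.
df <- data.frame(
  convID = rep(c("aa", "ab"), times = 8),
  var    = rep(rep(c("in", "id", "it", "ic"), each = 2), times = 2),
  value  = c("1", "1", "4/29/14", "4/20/14", "Impr", "Impr", "Display", "Display",
             "2", "2", "4/25/14", "4/24/14", "Impr", "Click", "Display", "SEM"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: carry the last non-NA value forward.
# Assumes x[1] is not NA; a leading NA would need extra handling.
fill_forward <- function(x) {
  idx <- cumsum(!is.na(x))  # position of the most recent non-NA value
  x[!is.na(x)][idx]
}

# Keep the value on "in" rows only, then fill the gaps from above.
df$inval <- fill_forward(ifelse(df$var == "in", df$value, NA))
```

Like na.locf, this depends entirely on row order, so it should be run before any sorting or subsetting of df.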
Then it's easy to dcast:
library(reshape2)
dcast(subset(df,var!="in"),convID+inval~...)
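For comparison, here is a rough base R sketch of the same final step using stats::reshape instead of dcast. It is a swapped-in alternative, not the method used above, and it assumes inval has already been filled in as described (it is written out by hand here to keep the example self-contained). The column names it produces (value.id, value.it, value.ic) differ from the dcast output.

```r
# Data frame from the question, with inval already filled forward.
df <- data.frame(
  convID = rep(c("aa", "ab"), times = 8),
  var    = rep(rep(c("in", "id", "it", "ic"), each = 2), times = 2),
  value  = c("1", "1", "4/29/14", "4/20/14", "Impr", "Impr", "Display", "Display",
             "2", "2", "4/25/14", "4/24/14", "Impr", "Click", "Display", "SEM"),
  inval  = rep(c("1", "2"), each = 8),  # assumed result of the fill-forward step
  stringsAsFactors = FALSE
)

# Drop the "in" rows (their information now lives in inval), then spread
# var into columns, one row per convID/inval combination.
long <- df[df$var != "in", ]
wide <- reshape(long,
                idvar     = c("convID", "inval"),
                timevar   = "var",
                direction = "wide")
print(wide)
```

The key point is the same in both versions: once inval is part of the identifier, each convID/inval/var combination holds exactly one value and no aggregation function is needed.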