I'm trying to transform a data frame from long to wide format using the dcast function.
Here is the starting data frame:
convID var value
aa in 1
ab in 1
aa id 4/29/2014
ab id 4/20/2014
aa it Impr
ab it Impr
aa ic Display
ab ic Display
aa in 2
ab in 2
aa id 4/25/2014
ab id 4/24/2014
aa it Impr
ab it Click
aa ic Display
ab ic SEM
The desired data frame is shown below, where the top half of the id, it, and ic values correspond to in=1 and the bottom half correspond to in=2:
convID in id it ic
aa 1 4/29/2014 Impr Display
ab 1 4/20/2014 Impr Display
aa 2 4/25/2014 Impr Display
ab 2 4/24/2014 Click SEM
However I'm not able to get the desired data frame using the dcast function. I tried many times and the closest I got was the following:
dcast(df,convID~var, value.var="value", fun.aggregate=max)
convID in id it ic
aa 2 4/29/2014 Impr Display
ab 2 4/24/2014 Impr SEM
This is obviously not right: it returns the max values of in, id, it, and ic, and the proper assignments of in=1 and in=2 are disregarded. Additionally, I'm missing half my data. Any advice would be greatly appreciated!
#Here is code to produce the starting data frame:
convID<-c("aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab", "aa", "ab")
var<-c("in", "in", "id", "id", "it", "it", "ic", "ic","in", "in", "id", "id", "it", "it", "ic", "ic")
value<-c("1", "1", "4/29/14", "4/20/14", "Impr", "Impr", "Display", "Display", "2", "2", "4/25/14", "4/24/14", "Impr", "Click", "Display", "SEM")
df<-data.frame(convID, var, value)
df$value<-as.character(df$value)
Your problem is that in is not already a variable in your data frame. (I changed the name to inval because in is a reserved word in R, which causes a few weirdnesses when trying to use a variable called in inside within.)
I generated inval by using zoo::na.locf to set the value for each row to the last previously specified value:
library(zoo)
df <- within(df,{
inval <- ifelse(var=="in",value,NA)
inval <- na.locf(inval)
})
This results in:
str(df)
## 'data.frame': 16 obs. of 4 variables:
## $ convID: Factor w/ 2 levels "aa","ab": 1 2 1 2 1 2 1 2 1 2 ...
## $ var : Factor w/ 4 levels "ic","id","in",..: 3 3 2 2 4 4 1 1 3 3 ...
## $ value : chr "1" "1" "4/29/14" "4/20/14" ...
## $ inval : chr "1" "1" "1" "1" ...
Then it's easy to dcast:
library(reshape2)
dcast(subset(df,var!="in"),convID+inval~...)
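If you would rather not depend on zoo, the fill-forward step can be sketched in base R. This is only a minimal illustration: fill_forward is a hypothetical helper name (it is not part of zoo or reshape2), and it assumes the first element is not NA, which holds for inval here because the data starts with an in row.

```r
# Hypothetical helper (not from zoo): carry the last non-NA value forward.
# Assumes x[1] is not NA; a leading NA would need separate handling.
fill_forward <- function(x) {
  filled <- !is.na(x)
  # cumsum(filled) gives, for each position, the index of the
  # most recent non-NA value among the non-NA values
  x[filled][cumsum(filled)]
}

# Small stand-in for the inval column before filling
x <- c("1", NA, NA, "2", NA)
result <- fill_forward(x)
print(result)
```

Replacing na.locf(inval) with fill_forward(inval) inside the within call should then give the same inval column for this data.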