I am working with a data frame that consists of participants' data across multiple timepoints. I am attempting to convert the data frame from a long format to a wide format. The data frame consists of variables belonging to different data types such as date and numerics.
library(data.table)
SN <- c("AAA", "BBB", "BBB", "CCC", "DDD", "EEE", "DDD")
Timepoint <- c(1, 1, 2, 1, 1, 1, 2)
date <- c("31-Mar-17", "08-Mar-17", "31-Mar-18", "28-Mar-18", "17-Mar-17", "26-Feb-18", "07-Apr-18")
score <- c(13, 16, 17, 9, 14, 15, 15)
age <- c(12, 15, 16, 9, 14, 14, 15)
df <- data.frame(SN, Timepoint, date, score, age)
df$date <- as.Date(df$date, format = "%d-%B-%y")
I used the following code to convert the data from a long format to a wide format:
df2 <- dcast(melt(df, id.vars = c("SN", "Timepoint")),
             SN ~ Timepoint + variable, value.var = "value")
Because melt() stacks every measured variable into a single value column, R coerces them all to a common type (numeric), so the date variable is incorrectly converted to a number.
The following is the incorrect output I have obtained:
The correct output which I am trying to achieve is as follows:
Thanks! Help much appreciated!
We may need to keep 'date' in id.vars as well. Otherwise melt() puts the dates and the numeric columns into the same 'value' column, and mixing two classes collapses them into a single one, i.e. numeric. If we instead keep 'date' as a separate column, we can use the value.var argument of data.table::dcast, which accepts more than one variable:
dcast(melt(setDT(df), id.vars = c("SN", "Timepoint", "date")),
      SN ~ Timepoint + variable, value.var = c("date", "value"))
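To see why the class is lost, here is a minimal sketch on a small hypothetical table (`dt`, `long_mixed` and `long_kept` are illustration names, not part of the original data). It compares melting 'date' together with 'score' against keeping 'date' as an id variable.

```r
library(data.table)

# Hypothetical two-row table, for illustration only
dt <- data.table(
  SN        = c("AAA", "BBB"),
  Timepoint = c(1, 1),
  date      = as.Date(c("2017-03-31", "2017-03-08")),
  score     = c(13, 16)
)

# Melting date together with score stacks both into one 'value' column,
# which can only hold a single class
long_mixed <- melt(dt, id.vars = c("SN", "Timepoint"))
class(long_mixed$value)

# Keeping date as an id variable leaves the column, and its class, untouched
long_kept <- melt(dt, id.vars = c("SN", "Timepoint", "date"))
class(long_kept$date)
```

The first `class()` call should no longer report Date, while the second one should.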
Based on the expected output, we may only need dcast, with no melt step at all:
dcast(setDT(df)[], SN ~ Timepoint, value.var = c('date', 'score', 'age'))
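A quick way to check that this keeps the types is to inspect the class of each resulting column. The sketch below rebuilds the question's data, with the dates written in ISO format so that parsing does not depend on the locale. The name `wide` is only an illustration.

```r
library(data.table)

# Same values as in the question; dates written as ISO strings
df <- data.frame(
  SN        = c("AAA", "BBB", "BBB", "CCC", "DDD", "EEE", "DDD"),
  Timepoint = c(1, 1, 2, 1, 1, 1, 2),
  date      = as.Date(c("2017-03-31", "2017-03-08", "2018-03-31", "2018-03-28",
                        "2017-03-17", "2018-02-26", "2018-04-07")),
  score     = c(13, 16, 17, 9, 14, 15, 15),
  age       = c(12, 15, 16, 9, 14, 14, 15)
)

# One row per SN, one column per value.var / Timepoint combination
wide <- dcast(setDT(df), SN ~ Timepoint, value.var = c("date", "score", "age"))

# Class of every column: the date_* columns should still be Date
sapply(wide, class)
```

Participants with a single timepoint get NA in the columns for timepoint 2.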
Another way to do this is to convert the date columns to Date after the data is restructured, instead of before. The dplyr package makes it easy to change all columns that match a name pattern, such as columns whose names end with "date".
library(data.table)
library(dplyr)
SN <- c("AAA", "BBB", "BBB", "CCC", "DDD", "EEE", "DDD")
Timepoint <- c(1, 1, 2, 1, 1, 1, 2)
date <- c("31-Mar-17", "08-Mar-17", "31-Mar-18", "28-Mar-18", "17-Mar-17", "26-Feb-18", "07-Apr-18")
score <- c(13, 16, 17, 9, 14, 15, 15)
age <- c(12, 15, 16, 9, 14, 14, 15)
df <- data.frame(SN, Timepoint, date, score, age)
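# melt() coerces date, score and age into one character "value" column
df2 <- dcast(melt(setDT(df), id.vars = c("SN", "Timepoint")),
             SN ~ Timepoint + variable, value.var = "value") %>%
  mutate_at(vars(ends_with("date")), as.Date, format = "%d-%b-%y") %>%
  # score and age also come back as character, so convert them to numeric
  mutate_at(vars(ends_with("score"), ends_with("age")), as.numeric)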
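In dplyr 1.0.0 and later, `mutate_at()` is superseded by `across()`. Below is a minimal sketch of the same conversion step written that way. It uses a small hypothetical table, `wide_chr`, as a stand-in for a reshaped result whose values are all stored as character, and it assumes an English locale for the month abbreviations.

```r
library(dplyr)

# Hypothetical stand-in for a reshaped table where everything is character
wide_chr <- data.frame(
  SN        = c("AAA", "BBB"),
  "1_date"  = c("31-Mar-17", "08-Mar-17"),
  "1_score" = c("13", "16"),
  check.names = FALSE  # keep column names that start with a digit
)

wide <- wide_chr %>%
  mutate(
    # parse every column whose name ends with "date"
    across(ends_with("date"), ~ as.Date(.x, format = "%d-%b-%y")),
    # turn the score columns back into numbers
    across(ends_with("score"), as.numeric)
  )

sapply(wide, class)
```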