Here I have an example of the ideal input data for time series analysis:
However, I receive the raw data like this:
raw_data <- data.frame(matrix(nrow=4, ncol=5))
colnames(raw_data) <- c("site","date","00:00","01:00","02:00")
raw_data$site <- c("A","B","A","B")
raw_data$date <- c("2015-01-01","2015-01-01","2015-01-02","2015-01-02")
raw_data$`00:00` <- c(1,4,1,4)
raw_data$`01:00` <- c(2,5,2,5)
raw_data$`02:00` <- c(3,6,3,6)
I have spent a lot of time trying to rearrange the raw data into the ideal structure. I would really appreciate any help. Thanks.
We can use pivot_longer to reshape the data to 'long' format, then unite to join the date and time columns, and finally ymd_hm from lubridate to convert the result to a date-time:
library(dplyr)
library(tidyr)
library(lubridate)
raw_data %>%
  pivot_longer(cols = matches('^[0-9]'), names_to = 'Time') %>%
  unite(DateTime, date, Time, sep = " ") %>%
  mutate(DateTime = ymd_hm(DateTime))
# A tibble: 12 x 3
# site DateTime value
# <chr> <dttm> <dbl>
# 1 A 2015-01-01 00:00:00 1
# 2 A 2015-01-01 01:00:00 2
# 3 A 2015-01-01 02:00:00 3
# 4 B 2015-01-01 00:00:00 4
# 5 B 2015-01-01 01:00:00 5
# 6 B 2015-01-01 02:00:00 6
# 7 A 2015-01-02 00:00:00 1
# 8 A 2015-01-02 01:00:00 2
# 9 A 2015-01-02 02:00:00 3
#10 B 2015-01-02 00:00:00 4
#11 B 2015-01-02 01:00:00 5
#12 B 2015-01-02 02:00:00 6
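For comparison, the same two steps (stack the hour columns, then combine date and time) can be sketched in base R without extra packages. This is only an illustrative sketch: the names time_cols and long are made up here, and tz = "UTC" is an assumption, since the question does not state a time zone.

```r
# Rebuild the example input from the question (check.names = FALSE keeps "00:00" etc.)
raw_data <- data.frame(
  site = c("A", "B", "A", "B"),
  date = c("2015-01-01", "2015-01-01", "2015-01-02", "2015-01-02"),
  `00:00` = c(1, 4, 1, 4),
  `01:00` = c(2, 5, 2, 5),
  `02:00` = c(3, 6, 3, 6),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Every column other than site and date is treated as an hour-of-day column
time_cols <- setdiff(names(raw_data), c("site", "date"))
n_times <- length(time_cols)

long <- data.frame(
  # Repeat each site once per hour column
  site = rep(raw_data$site, each = n_times),
  # Paste date and hour together, then parse; tz = "UTC" is an assumption
  DateTime = as.POSIXct(
    paste(rep(raw_data$date, each = n_times),
          rep(time_cols, times = nrow(raw_data))),
    format = "%Y-%m-%d %H:%M", tz = "UTC"
  ),
  # Transpose so the values are read row by row, matching the order above
  value = as.vector(t(as.matrix(raw_data[time_cols]))),
  stringsAsFactors = FALSE
)
```

Here long has one row per site, date and hour, in the same row order as the pivot_longer result above.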
You can do this using melt from the data.table package:
library(data.table)
# Mark the data as a data.table
setDT(raw_data)
# Melt it into long format
new_data <- melt(raw_data, id.vars=c('site', 'date'), variable.name='time')
# Put date and time together into a new column, and delete the old ones
new_data[, `:=`(DateTime = paste(date, time),
                date = NULL, time = NULL)]
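Note that paste() leaves DateTime as a character column. If a real date-time type is needed, a sketch along these lines may work. It rebuilds the example data so it runs on its own, and tz = "UTC" is an assumption because the question does not give a time zone.

```r
library(data.table)

# Rebuild the example input from the question as a data.table
raw_data <- data.table(
  site = c("A", "B", "A", "B"),
  date = c("2015-01-01", "2015-01-01", "2015-01-02", "2015-01-02"),
  `00:00` = c(1, 4, 1, 4),
  `01:00` = c(2, 5, 2, 5),
  `02:00` = c(3, 6, 3, 6)
)

# Melt into long format: one row per site, date and hour column
new_data <- melt(raw_data, id.vars = c("site", "date"), variable.name = "time")

# Parse the pasted "date time" string into POSIXct; tz = "UTC" is an assumption
new_data[, DateTime := as.POSIXct(paste(date, time),
                                  format = "%Y-%m-%d %H:%M", tz = "UTC")]

# Drop the columns that are now redundant
new_data[, c("date", "time") := NULL]

# Sort by site and time, which is usually convenient for time series work
setorder(new_data, site, DateTime)
```

Sorting is optional; melt returns the rows grouped by hour column, so without setorder the rows for one site are not contiguous.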