
Reshaping time series data from wide to long format

I have time series data for four clients: clients 1, 2, 3, and 4.

names(data)
"Date"
"X1.CLIENT"   "X1seasonal"  "X1trend"     "X1remainder"
"X2.CLIENT"   "X2seasonal"  "X2trend"     "X2remainder"
"X3.CLIENT"   "X3seasonal"  "X3trend"     "X3remainder"
"X4.CLIENT"   "X4seasonal"  "X4trend"     "X4remainder"

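Note that each client's data column is followed by the seasonal, trend, and remainder components for the same period.

I would like to reshape this into a long format that looks like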

"Date" "CLIENT_Number" "Type"  "Value"

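[CLIENT_Number: 1, 2, 3, 4; Type: CLIENT, seasonal, trend, remainder]

(To help with understanding: if the wide-format data has 30 rows/instances, then the long format should have 30*4*4 = 480 rows/instances.)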
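The expected row count can be sanity-checked from the column names alone. This is a minimal sketch, assuming the 17 columns listed above and a hypothetical 30-row table; the variable names (wide_names, n_rows, expected_rows) are illustrative only.

```r
# Column names as listed in the question
wide_names <- c("Date",
                "X1.CLIENT", "X1seasonal", "X1trend", "X1remainder",
                "X2.CLIENT", "X2seasonal", "X2trend", "X2remainder",
                "X3.CLIENT", "X3seasonal", "X3trend", "X3remainder",
                "X4.CLIENT", "X4seasonal", "X4trend", "X4remainder")

n_rows <- 30                                       # assumed number of dates
n_value_cols <- length(setdiff(wide_names, "Date"))  # 4 clients * 4 types

# Each value column contributes one long-format row per date
expected_rows <- n_rows * n_value_cols
print(expected_rows)
```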

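It looks like your data is in an awkward, repeated wide format. As ssdecontrol said, use melt from reshape2; I would combine it with something like lapply to split the data into one section per client. For example: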

# Set up (install.packages only needs to be run once)
install.packages("reshape2")
library(reshape2)

# Dummy data: 100 dates, three clients, each with a trend and a remainder column
d = data.frame(date=seq.Date(as.Date("2015-01-01"), length.out=100, by=1),
Client.1="A", trend.1=rnorm(100), remainder.1=rpois(100, 3), Client.2="B",
trend.2=rnorm(100), remainder.2=rpois(100, 3), Client.3="C", trend.3=rnorm(100),
remainder.3=rpois(100, 3))

# Split data by columns with client numbers
# (2, 5 and 8 are the positions of the Client.* columns)
d = lapply(c(2,5,8), function(i){
    # Take the date column plus the three columns relating to this client
    x = d[,c(1, seq(i, length.out=3, by=1))]
    # Rename so you have consistent factor labels
    names(x)[2:4] = c("client", "trend", "remainder")
    # Convert from wide to long
    melt(x, id.vars=c("date", "client"))
})

# Make list into a data frame
d = do.call("rbind", d)
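The same split-and-stack idea can be adapted to the column layout in the question. The following is only a sketch under assumptions: the data are simulated (30 dates of random numbers), the column names follow the pattern X<k>.CLIENT, X<k>seasonal, X<k>trend, X<k>remainder, and it uses base R (unlist and rep) in place of melt so that it runs without extra packages. The names types, cols, and long are illustrative.

```r
set.seed(42)
n <- 30

# Simulated wide data with the question's column naming pattern (assumed)
types <- c(".CLIENT", "seasonal", "trend", "remainder")
cols <- as.vector(outer(types, 1:4, function(t, k) paste0("X", k, t)))
data <- data.frame(
  Date = seq(as.Date("2015-01-01"), by = "day", length.out = n),
  matrix(rnorm(n * length(cols)), nrow = n, dimnames = list(NULL, cols))
)

# One long block per client, then bind the blocks together
long <- do.call(rbind, lapply(1:4, function(k) {
  # Columns belonging to client k, e.g. X1.CLIENT, X1seasonal, ...
  client_cols <- paste0("X", k, types)
  data.frame(
    Date = rep(data$Date, times = length(types)),
    CLIENT_Number = k,
    # Drop the leading dot so ".CLIENT" becomes "CLIENT"
    Type = rep(sub("^\\.", "", types), each = n),
    # Stack the client's columns one after another
    Value = unlist(data[client_cols], use.names = FALSE)
  )
}))

dim(long)
head(long)
```

With real data, only the simulated data frame would need to be replaced; the client numbers 1:4 and the suffixes in types would have to match the actual column names.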

I found this article very clear on wide versus long formats: http://seananderson.ca/2013/10/19/reshape.html

