
Reshaping time series data from wide to long format

I have time series data for four clients: client 1, 2, 3 and 4.

names(data)
"Date"
"X1.CLIENT" "X1seasonal" "X1trend" "X1remainder"
"X2.CLIENT" "X2seasonal" "X2trend" "X2remainder"
"X3.CLIENT" "X3seasonal" "X3trend" "X3remainder"
"X4.CLIENT" "X4seasonal" "X4trend" "X4remainder"

Note that each client's data column is followed by the seasonal, trend and remainder components for the same period.

I would like to reshape this into a long format that looks like

"Date" "CLIENT_Number" "Type"  "Value"

[CLIENT_Number: 1, 2, 3, 4; Type: CLIENT, seasonal, trend, remainder]

(Extra detail to help understanding: if the wide-format data has 30 rows/instances, then the long format should have 30*4*4 = 480 rows/instances.)

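It looks like your data is in an awkward repeated wide format. As ssdecontrol said, use melt from reshape2; I would combine it with something like lapply to split the data into one piece per client. For example: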

# Set up
install.packages("reshape2")
library(reshape2)

# Dummy data
d = data.frame(date=seq.Date(as.Date("2015-01-01"), length.out=100, by=1),
    Client.1="A", trend.1=rnorm(100), remainder.1=rpois(100, 3),
    Client.2="B", trend.2=rnorm(100), remainder.2=rpois(100, 3),
    Client.3="C", trend.3=rnorm(100), remainder.3=rpois(100, 3))

# Split data by columns with client numbers
d = lapply(c(2,5,8), function(i){
    # Take columns relating to client i
    x = d[,c(1, seq(i, length.out=3, by=1))]
    # Rename so you have consistent factor labels
    names(x)[2:4] = c("client", "trend", "remainder")
    # Convert from wide to long; the melted data frame is the return value
    melt(x, id.vars=c("date", "client"))
})

# Make list into a data frame
d = do.call("rbind", d)
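The same split-and-stack idea can also be sketched against the column names from the question. This is only a rough illustration, not the method from the answer above: it uses base R instead of reshape2, the wide data frame is filled with random placeholder values, and it assumes every value column is named "X", then the client number, then the type (with an optional dot before the type, as in "X1.CLIENT").

```r
# Hypothetical wide data with the column names from the question
# (30 rows, 4 clients; the values are random placeholders)
set.seed(1)
n <- 30
types <- c(".CLIENT", "seasonal", "trend", "remainder")
data <- data.frame(Date = seq.Date(as.Date("2015-01-01"), length.out = n, by = "day"))
for (client in 1:4) {
  for (type in types) {
    data[[paste0("X", client, type)]] <- rnorm(n)
  }
}

# Every column except Date holds values to be stacked
value_cols <- setdiff(names(data), "Date")

# Build one small long data frame per value column, then stack them
long <- do.call(rbind, lapply(value_cols, function(col) {
  data.frame(
    Date = data$Date,
    # The digits after the leading "X" give the client number
    CLIENT_Number = as.integer(sub("^X([0-9]+).*$", "\\1", col)),
    # Whatever follows the digits (minus an optional dot) gives the type
    Type = sub("^X[0-9]+\\.?", "", col),
    Value = data[[col]],
    stringsAsFactors = FALSE
  )
}))

nrow(long)  # 480, i.e. 30 * 4 * 4
```

Because the client number and type are parsed from the column names rather than from column positions, this sketch should not depend on the columns being in any particular order, as long as the naming assumption holds.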

I found this article very clear on wide and long formats: http://seananderson.ca/2013/10/19/reshape.html

