
How to fill in time series data into a data frame?

I am working with the following time series data:

Weeks <- c("1995-01", "1995-02", "1995-03", "1995-04", "1995-06", "1995-08", "1995-10", "1995-15", "1995-16", "1995-24", "1995-32")
Country <- c("United States")
Values <- sample(seq(1, 500, 1), length(Weeks), replace = TRUE)

df <- data.frame(Weeks,Country, Values)

     Weeks       Country Values
1  1995-01 United States    193
2  1995-02 United States    183
3  1995-03 United States    402
4  1995-04 United States     75
5  1995-06 United States    402
6  1995-08 United States    436
7  1995-10 United States     97
8  1995-15 United States    445
9  1995-16 United States    336
10 1995-24 United States     31
11 1995-32 United States    413

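It is built from the year and the week number within that year (column 1). Notice that several weeks are omitted (as a result of an aggregation function). For example, 1995-05 is not included. How can I add the omitted rows to the data, fill in the appropriate country name, and assign them a value of 0?

Thanks for your help!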

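Use separate to split the year and week values into different columns. For each Country and Years group, use complete to add the missing weeks and assign 0 to Values. Finally, unite the year and week columns to get the data back in the original format. Note that only the weeks between the first and last observed week of each group are filled in.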

library(dplyr)
library(tidyr)

df %>%
  separate(Weeks, c('Years', 'Weeks'), sep = '-', convert = TRUE) %>%
  group_by(Country, Years) %>%
  complete(Weeks = min(Weeks):max(Weeks), fill = list(Values = 0)) %>%
  ungroup() %>%
  mutate(Weeks = sprintf('%02d', Weeks)) %>%
  unite(Weeks, Years, Weeks, sep = '-')

#   Country       Weeks   Values
#   <chr>         <chr>    <dbl>
# 1 United States 1995-01    354
# 2 United States 1995-02    395
# 3 United States 1995-03    408
# 4 United States 1995-04    143
# 5 United States 1995-05      0
# 6 United States 1995-06    481
# 7 United States 1995-07      0
# 8 United States 1995-08     49
# 9 United States 1995-09      0
#10 United States 1995-10    229
# … with 22 more rows
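
If adding dplyr and tidyr is not an option, the same idea can be sketched in base R. This is only an illustration under assumptions, not part of the original answer: the helper names (Year, Week, grid, filled) are made up here, and Values is set to fixed numbers instead of sample() so the result is reproducible.

```r
# Same weeks as in the question; Values fixed here for reproducibility (assumption)
Weeks <- c("1995-01", "1995-02", "1995-03", "1995-04", "1995-06", "1995-08",
           "1995-10", "1995-15", "1995-16", "1995-24", "1995-32")
df <- data.frame(Weeks, Country = "United States",
                 Values = seq_along(Weeks) * 10,
                 stringsAsFactors = FALSE)

# Split "YYYY-WW" into integer year and week columns
df$Year <- as.integer(substr(df$Weeks, 1, 4))
df$Week <- as.integer(substr(df$Weeks, 6, 7))

# For each Country/Year, list every week between the first and last observed week
groups <- split(df, list(df$Country, df$Year), drop = TRUE)
grid <- do.call(rbind, lapply(groups, function(g) {
  data.frame(Country = g$Country[1], Year = g$Year[1],
             Week = seq(min(g$Week), max(g$Week)),
             stringsAsFactors = FALSE)
}))

# Left join the observed values onto the full grid; missing weeks come back as NA
filled <- merge(grid, df[, c("Country", "Year", "Week", "Values")],
                by = c("Country", "Year", "Week"), all.x = TRUE)
filled$Values[is.na(filled$Values)] <- 0

# Rebuild the original "YYYY-WW" label and restore the column layout
filled <- filled[order(filled$Country, filled$Year, filled$Week), ]
filled$Weeks <- sprintf("%d-%02d", filled$Year, filled$Week)
filled <- filled[, c("Weeks", "Country", "Values")]
rownames(filled) <- NULL
```

As with complete above, this fills only the gap between the minimum and maximum week of each group. To cover a whole year, the seq() call could be replaced with a fixed range, but how many weeks a year has depends on the week-numbering convention in use.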
