[英]How to fill in time series data into a data frame?
我正在使用以下時間序列數據:
Weeks <- c("1995-01", "1995-02", "1995-03", "1995-04", "1995-06", "1995-08", "1995-10", "1995-15", "1995-16", "1995-24", "1995-32")
Country <- c("United States")
Values <- sample(seq(1,500,1), length(Weeks), replace = T)
df <- data.frame(Weeks,Country, Values)
Weeks Country Values
1 1995-01 United States 193
2 1995-02 United States 183
3 1995-03 United States 402
4 1995-04 United States 75
5 1995-06 United States 402
6 1995-08 United States 436
7 1995-10 United States 97
8 1995-15 United States 445
9 1995-16 United States 336
10 1995-24 United States 31
11 1995-32 United States 413
它是根據年份和該年的周數(第 1 列)構建的。 請注意,如何省略幾個星期(作為聚合函數的結果)。 例如,不包括 1995-05。 如何將省略的行包含到數據中,添加適當的國家名稱,並為它們分配一個值 = 0?
謝謝您的幫助!
不同列中的separate
年份和周值。 對於每個Country
和Years
,我們complete
缺失的周並將Values
assign
為 0。最后unite
year 和 week 列以獲取與原始格式相同的數據。
library(dplyr)
library(tidyr)
df %>%
separate(Weeks, c('Years', 'Weeks'), sep = '-', convert = TRUE) %>%
group_by(Country, Years) %>%
complete(Weeks = min(Weeks):max(Weeks), fill = list(Values = 0)) %>%
ungroup() %>%
mutate(Weeks = sprintf('%02d', Weeks)) %>%
unite(Weeks, Years, Weeks, sep = '-')
# Country Weeks Values
# <chr> <chr> <dbl>
# 1 United States 1995-01 354
# 2 United States 1995-02 395
# 3 United States 1995-03 408
# 4 United States 1995-04 143
# 5 United States 1995-05 0
# 6 United States 1995-06 481
# 7 United States 1995-07 0
# 8 United States 1995-08 49
# 9 United States 1995-09 0
#10 United States 1995-10 229
# … with 22 more rows
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.