[英]make time series data frame in R
我獲得的數據缺少某些部分。
DAY<-c("2011-01-01","2011-01-02","2011-01-04","2011-01-06")
ITEM<-c("apple","apple","apple","banana")
sale<-c("yes","no","yes","yes")
value<-c(100,200,100,500)
df <- data.frame(day=DAY,item=ITEM,sale=sale,value=value)
day item sale value
1 2011-01-01 apple yes 100
2 2011-01-02 apple no 200
3 2011-01-04 apple yes 100
4 2011-01-06 banana yes 500
↑ 這是我的原始數據。 但我想跟隨展開數據框
bind_row=NULL
bind=NULL
for(h in 1:length(unique(df$day))){
bind_day=as.character(unique(df$day)[h])
for(i in 1:length(unique(df$item))){
bind_item=as.character(unique(df$item)[i])
for(j in 1:length(unique(df$sale))){
bind_sale=as.character(unique(df$sale)[j])
bind=c(bind_day,bind_item,bind_sale)
bind_row=rbind(bind_row,bind)
}
}
}
bind_row <- cbind(bind_row,c(100,0,0,0,0,200,0,0,100,0,0,0,0,0,500,0))
bind "2011-01-01" "apple" "yes" "100"
bind "2011-01-01" "apple" "no" "0"
bind "2011-01-01" "banana" "yes" "0"
bind "2011-01-01" "banana" "no" "0"
bind "2011-01-02" "apple" "yes" "0"
bind "2011-01-02" "apple" "no" "200"
bind "2011-01-02" "banana" "yes" "0"
bind "2011-01-02" "banana" "no" "0"
bind "2011-01-04" "apple" "yes" "100"
bind "2011-01-04" "apple" "no" "0"
bind "2011-01-04" "banana" "yes" "0"
bind "2011-01-04" "banana" "no" "0"
bind "2011-01-06" "apple" "yes" "0"
bind "2011-01-06" "apple" "no" "0"
bind "2011-01-06" "banana" "yes" "500"
bind "2011-01-06" "banana" "no" "0"
我怎樣才能在 R 中制作(從原始數據轉換到擴展數據)這個數據框? (或蟒蛇?)
這個腳本太慢了。 你可以幫幫我嗎 ?
謝謝你。
我們可以在 R 中使用complete
的tidyr
。
tidyr::complete(df, day, item, sale, fill = list(value = 0))
# day item sale value
# <fct> <fct> <fct> <dbl>
# 1 2011-01-01 apple no 0
# 2 2011-01-01 apple yes 100
# 3 2011-01-01 banana no 0
# 4 2011-01-01 banana yes 0
# 5 2011-01-02 apple no 200
# 6 2011-01-02 apple yes 0
# 7 2011-01-02 banana no 0
# 8 2011-01-02 banana yes 0
# 9 2011-01-04 apple no 0
#10 2011-01-04 apple yes 100
#11 2011-01-04 banana no 0
#12 2011-01-04 banana yes 0
#13 2011-01-06 apple no 0
#14 2011-01-06 apple yes 0
#15 2011-01-06 banana no 0
#16 2011-01-06 banana yes 500
建議的解決方案填充數據集中存在天數的行,但不填充沒有數據的天數。 為此,您需要tsibble
包。
library(tidyverse)
library(tsibble)
DAY <- c("2011-01-01", "2011-01-02", "2011-01-04", "2011-01-06")
ITEM <- c("apple", "apple", "apple", "banana")
sale <- c("yes", "no", "yes", "yes")
value <- c(100, 200, 100, 500)
df <- data.frame(day = as.Date(DAY), item = ITEM, sale = sale, value = value)
df %>%
complete(day, item, sale, fill=list(value=0)) %>%
as_tsibble(index=day, key=c(item,sale)) %>%
fill_gaps(value=0)
#> # A tsibble: 24 x 4 [1D]
#> # Key: item, sale [4]
#> day item sale value
#> <date> <fct> <fct> <dbl>
#> 1 2011-01-01 apple no 0
#> 2 2011-01-02 apple no 200
#> 3 2011-01-03 apple no 0
#> 4 2011-01-04 apple no 0
#> 5 2011-01-05 apple no 0
#> 6 2011-01-06 apple no 0
#> 7 2011-01-01 apple yes 100
#> 8 2011-01-02 apple yes 0
#> 9 2011-01-03 apple yes 0
#> 10 2011-01-04 apple yes 100
#> # … with 14 more rows
由reprex 包(v0.3.0) 於 2020 年 4 月 1 日創建
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.