Make a time series data frame in R

The data I have is missing some combinations.

DAY<-c("2011-01-01","2011-01-02","2011-01-04","2011-01-06")
ITEM<-c("apple","apple","apple","banana")
sale<-c("yes","no","yes","yes")
value<-c(100,200,100,500)

df <- data.frame(day=DAY,item=ITEM,sale=sale,value=value)


         day   item sale value
1 2011-01-01  apple  yes   100
2 2011-01-02  apple   no   200
3 2011-01-04  apple  yes   100
4 2011-01-06 banana  yes   500

The above is my original data. I want to expand it into the data frame shown below, which my current script produces:


bind_row=NULL
bind=NULL
for(h in 1:length(unique(df$day))){
  bind_day=as.character(unique(df$day)[h])

  for(i in 1:length(unique(df$item))){
    bind_item=as.character(unique(df$item)[i])
    for(j in 1:length(unique(df$sale))){
      bind_sale=as.character(unique(df$sale)[j])
      bind=c(bind_day,bind_item,bind_sale)
      bind_row=rbind(bind_row,bind)
    }
  }
}

bind_row <- cbind(bind_row,c(100,0,0,0,0,200,0,0,100,0,0,0,0,0,500,0))


bind "2011-01-01" "apple"  "yes" "100"
bind "2011-01-01" "apple"  "no"  "0"  
bind "2011-01-01" "banana" "yes" "0"  
bind "2011-01-01" "banana" "no"  "0"  
bind "2011-01-02" "apple"  "yes" "0"  
bind "2011-01-02" "apple"  "no"  "200"
bind "2011-01-02" "banana" "yes" "0"  
bind "2011-01-02" "banana" "no"  "0"  
bind "2011-01-04" "apple"  "yes" "100"
bind "2011-01-04" "apple"  "no"  "0"  
bind "2011-01-04" "banana" "yes" "0"  
bind "2011-01-04" "banana" "no"  "0"  
bind "2011-01-06" "apple"  "yes" "0"  
bind "2011-01-06" "apple"  "no"  "0"  
bind "2011-01-06" "banana" "yes" "500"
bind "2011-01-06" "banana" "no"  "0"  

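How can I build this data frame in R, that is, convert the original data into the expanded data? (Or in Python?)

The script above is too slow. Can you help me?

Thank you.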

We can use complete() from the tidyr package in R:

tidyr::complete(df, day, item, sale, fill = list(value = 0))

#    day        item   sale  value
#   <fct>      <fct>  <fct> <dbl>
# 1 2011-01-01 apple  no        0
# 2 2011-01-01 apple  yes     100
# 3 2011-01-01 banana no        0
# 4 2011-01-01 banana yes       0
# 5 2011-01-02 apple  no      200
# 6 2011-01-02 apple  yes       0
# 7 2011-01-02 banana no        0
# 8 2011-01-02 banana yes       0
# 9 2011-01-04 apple  no        0
#10 2011-01-04 apple  yes     100
#11 2011-01-04 banana no        0
#12 2011-01-04 banana yes       0
#13 2011-01-06 apple  no        0
#14 2011-01-06 apple  yes       0
#15 2011-01-06 banana no        0
#16 2011-01-06 banana yes     500
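
Since the question also asks about Python, the same idea (build every day/item/sale combination, then fill the missing values with 0) can be sketched with the standard library alone. This is only a minimal illustration under stated assumptions, not a port of tidyr: the helper names unique_in_order and complete are made up for this example, and the rows are copied from the question.

```python
from itertools import product

# Original rows as (day, item, sale, value), copied from the question.
rows = [
    ("2011-01-01", "apple", "yes", 100),
    ("2011-01-02", "apple", "no", 200),
    ("2011-01-04", "apple", "yes", 100),
    ("2011-01-06", "banana", "yes", 500),
]


def unique_in_order(values):
    """Return the distinct values, keeping first-appearance order."""
    return list(dict.fromkeys(values))


def complete(rows, fill=0):
    """Expand rows to every (day, item, sale) combination (hypothetical helper)."""
    days = unique_in_order(r[0] for r in rows)
    items = unique_in_order(r[1] for r in rows)
    sales = unique_in_order(r[2] for r in rows)
    # Index the known values by key so each combination is one dict lookup.
    known = {(d, i, s): v for d, i, s, v in rows}
    return [
        (d, i, s, known.get((d, i, s), fill))
        for d, i, s in product(days, items, sales)
    ]


expanded = complete(rows)
for row in expanded:
    print(*row)
```

Unlike the tidyr output above, which sorts by factor level, this sketch keeps values in first-appearance order, so the rows come out in the same order as the expected result in the question. It also avoids growing a matrix with rbind inside nested loops, which is likely what makes the original script slow.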

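The suggested solution fills in rows for the days that already appear in the dataset, but it does not add rows for days with no data at all (here 2011-01-03 and 2011-01-05). For that, you need the tsibble package.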

library(tidyverse)
library(tsibble)

DAY <- c("2011-01-01", "2011-01-02", "2011-01-04", "2011-01-06")
ITEM <- c("apple", "apple", "apple", "banana")
sale <- c("yes", "no", "yes", "yes")
value <- c(100, 200, 100, 500)

df <- data.frame(day = as.Date(DAY), item = ITEM, sale = sale, value = value)

df %>%
  complete(day, item, sale, fill=list(value=0)) %>%
  as_tsibble(index=day, key=c(item,sale)) %>%
  fill_gaps(value=0)
#> # A tsibble: 24 x 4 [1D]
#> # Key:       item, sale [4]
#>    day        item  sale  value
#>    <date>     <fct> <fct> <dbl>
#>  1 2011-01-01 apple no        0
#>  2 2011-01-02 apple no      200
#>  3 2011-01-03 apple no        0
#>  4 2011-01-04 apple no        0
#>  5 2011-01-05 apple no        0
#>  6 2011-01-06 apple no        0
#>  7 2011-01-01 apple yes     100
#>  8 2011-01-02 apple yes       0
#>  9 2011-01-03 apple yes       0
#> 10 2011-01-04 apple yes     100
#> # … with 14 more rows

Created on 2020-04-01 by the reprex package (v0.3.0)
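
For readers who want the Python route mentioned in the question, the gap-filling step can be sketched the same way: generate every calendar day between the first and last observed date, then cross it with the items and sale flags. This is a rough standard-library sketch, not an equivalent of tsibble; the function name complete_with_all_days is hypothetical, and sorting by item, then sale, then day is an assumption chosen to mirror the layout printed above.

```python
from datetime import date, timedelta
from itertools import product

# Original rows as (day, item, sale, value), copied from the question.
rows = [
    ("2011-01-01", "apple", "yes", 100),
    ("2011-01-02", "apple", "no", 200),
    ("2011-01-04", "apple", "yes", 100),
    ("2011-01-06", "banana", "yes", 500),
]


def complete_with_all_days(rows, fill=0):
    """Fill missing combinations and missing calendar days (hypothetical helper)."""
    parsed = [(date.fromisoformat(d), i, s, v) for d, i, s, v in rows]
    first = min(r[0] for r in parsed)
    last = max(r[0] for r in parsed)
    # Every day from first to last inclusive, including days with no data.
    days = [first + timedelta(days=n) for n in range((last - first).days + 1)]
    items = sorted({r[1] for r in parsed})
    sales = sorted({r[2] for r in parsed})
    known = {(d, i, s): v for d, i, s, v in parsed}
    # Iterate item, then sale, then day, so each series is contiguous.
    return [
        (d.isoformat(), i, s, known.get((d, i, s), fill))
        for i, s, d in product(items, sales, days)
    ]


filled = complete_with_all_days(rows)
for row in filled:
    print(*row)
```

With the sample data this yields 6 days × 2 items × 2 sale flags = 24 rows, the same row count as the tsibble result.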
