Make a time series data frame in R (using dplyr?)
I have store invoice data. Some dates are missing because nothing was sold on those days.
Data with missing dates:
         day   item sale value
1 2011-01-01  apple  yes   100
2 2011-01-02  apple   no   200
4 2011-01-06 banana  yes   500
The true calendar:
day
1 2011-01-01
2 2011-01-02
3 2011-01-04
4 2011-01-05
5 2011-01-06
I need the full data, similar to what tidyr::complete() would produce.
I want to add rows for the Jan 4 and Jan 5 dates, so that the result looks like this:
bind "2011-01-01" "apple" "yes" "100"
bind "2011-01-01" "apple" "no" "0"
bind "2011-01-01" "banana" "yes" "0"
bind "2011-01-01" "banana" "no" "0"
bind "2011-01-02" "apple" "yes" "0"
bind "2011-01-02" "apple" "no" "200"
bind "2011-01-02" "banana" "yes" "0"
bind "2011-01-02" "banana" "no" "0"
bind "2011-01-04" "apple" "yes" "0"
bind "2011-01-04" "apple" "no" "0"
bind "2011-01-04" "banana" "yes" "0"
bind "2011-01-04" "banana" "no" "0"
bind "2011-01-05" "apple" "yes" "0"
bind "2011-01-05" "apple" "no" "0"
bind "2011-01-05" "banana" "yes" "0"
bind "2011-01-05" "banana" "no" "0"
bind "2011-01-06" "apple" "yes" "0"
bind "2011-01-06" "apple" "no" "0"
bind "2011-01-06" "banana" "yes" "500"
bind "2011-01-06" "banana" "no" "0"
How can I do that in R?
We can use complete to generate all dates from the minimum to the maximum value in day for every combination of item and sale, and then right_join the result with calendar to keep only the dates present in calendar.
library(dplyr)
df %>%
  mutate(day = as.Date(day)) %>%
  tidyr::complete(item, sale, day = seq(min(day), max(day), by = 'day'),
                  fill = list(value = 0)) %>%
  right_join(calendar %>% mutate(day = as.Date(day)), by = 'day')
# A tibble: 20 x 4
# item sale day value
# <fct> <fct> <date> <dbl>
# 1 apple no 2011-01-01 0
# 2 apple yes 2011-01-01 100
# 3 banana no 2011-01-01 0
# 4 banana yes 2011-01-01 0
# 5 apple no 2011-01-02 200
# 6 apple yes 2011-01-02 0
# 7 banana no 2011-01-02 0
# 8 banana yes 2011-01-02 0
# 9 apple no 2011-01-04 0
#10 apple yes 2011-01-04 0
#11 banana no 2011-01-04 0
#12 banana yes 2011-01-04 0
#13 apple no 2011-01-05 0
#14 apple yes 2011-01-05 0
#15 banana no 2011-01-05 0
#16 banana yes 2011-01-05 0
#17 apple no 2011-01-06 0
#18 apple yes 2011-01-06 0
#19 banana no 2011-01-06 0
#20 banana yes 2011-01-06 500
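As a possible variation on the approach above, the calendar dates can be passed to complete directly, which skips the intermediate full date sequence and the join. This is only a sketch: the sample data is rebuilt here with plain character columns so the block runs on its own, and cal_days and full are names introduced for illustration. It assumes every day in df also appears in calendar, because complete keeps all original rows even when their day is not among the supplied values.

```r
library(dplyr)
library(tidyr)

# Sample data from the question, rebuilt with character columns (assumption:
# factors are not required for this approach)
df <- data.frame(
  day   = c("2011-01-01", "2011-01-02", "2011-01-06"),
  item  = c("apple", "apple", "banana"),
  sale  = c("yes", "no", "yes"),
  value = c(100, 200, 500),
  stringsAsFactors = FALSE
)
calendar <- data.frame(
  day = c("2011-01-01", "2011-01-02", "2011-01-04", "2011-01-05", "2011-01-06"),
  stringsAsFactors = FALSE
)

# Hypothetical helper vector: the calendar days as Date values
cal_days <- as.Date(calendar$day)

full <- df %>%
  mutate(day = as.Date(day)) %>%
  # Expand to every calendar day x item x sale combination; missing values become 0
  complete(day = cal_days, item, sale, fill = list(value = 0)) %>%
  # Sort explicitly so the row order does not depend on package versions
  arrange(day, item, sale)

print(full)
```

The explicit arrange is there because the row order returned by joins and by complete can differ between package versions, so sorting makes the result easier to compare with the output shown above.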
Data
df <- structure(list(day = structure(1:3, .Label = c("2011-01-01",
"2011-01-02", "2011-01-06"), class = "factor"), item = structure(c(1L,
1L, 2L), .Label = c("apple", "banana"), class = "factor"), sale =
structure(c(2L, 1L, 2L), .Label = c("no", "yes"), class = "factor"),
value = c(100L, 200L, 500L)), class = "data.frame", row.names = c("1", "2", "4"))
calendar <- structure(list(day = structure(1:5, .Label = c("2011-01-01",
"2011-01-02", "2011-01-04", "2011-01-05", "2011-01-06"), class =
"factor")), class = "data.frame", row.names = c("1", "2", "3", "4", "5"))