
Expand Time-Series Data in R

I have a dataset of NHL players that includes the number of goals each player scored in every season he played. I calculate the running total of goals over each player's career in order to identify a "running" Top 10 of players.

toy_data <- data.frame(
  player = rep("gretzky", 10),
  goal_total = c(5, 10, 15, 20, 25, 30, 35, 40, 45, 50),
  goals = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
  year = c(1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999)
)

    player goal_total goals year
1  gretzky          5     5 1990
2  gretzky         10     5 1991
3  gretzky         15     5 1992
4  gretzky         20     5 1993
5  gretzky         25     5 1994
6  gretzky         30     5 1995
7  gretzky         35     5 1996
8  gretzky         40     5 1997
9  gretzky         45     5 1998
10 gretzky         50     5 1999
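
For context, a running total like goal_total can be derived from the per-season goals with a grouped cumsum(). This is only a minimal sketch: season_goals and its two players are made-up illustration data, not part of my real dataset, and it assumes one row per player per season.

```r
library(dplyr)

# Hypothetical per-season data for two made-up players
season_goals <- data.frame(
  player = c("player_a", "player_a", "player_a", "player_b", "player_b"),
  year   = c(1990, 1991, 1992, 1991, 1992),
  goals  = c(5, 5, 5, 3, 4)
)

running <- season_goals %>%
  arrange(player, year) %>%            # cumsum() depends on row order
  group_by(player) %>%                 # restart the total for each player
  mutate(goal_total = cumsum(goals)) %>%
  ungroup()
```

Sorting by year before the cumsum() matters, because the total is accumulated in row order within each player.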

I want to expand the dataset so that players remain in it after their careers end. For example, Wayne Gretzky retired in 1999, but I want an entry for Gretzky in every subsequent year, carrying his final goal total. The end product would look something like this:

    player goal_total goals year
1  gretzky          5     5 1990
2  gretzky         10     5 1991
3  gretzky         15     5 1992
4  gretzky         20     5 1993
5  gretzky         25     5 1994
6  gretzky         30     5 1995
7  gretzky         35     5 1996
8  gretzky         40     5 1997
9  gretzky         45     5 1998
10 gretzky         50     5 1999
11 gretzky         50     0 2000
12 gretzky         50     0 2001
13 gretzky         50     0 2002
...

and so on until 2019. Is there a simple way to do this?

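We can achieve this with complete() and fill() from tidyr. For each player, complete() adds a row for every missing year through 2019 and sets goals to 0 in those rows; fill() then carries the last known goal_total forward.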

library(dplyr)
library(tidyr)

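toy_data %>%
  group_by(player) %>%
  # add a row for every missing year up to 2019, with goals set to 0
  complete(year = min(year):2019, fill = list(goals = 0)) %>%
  # carry each player's last goal_total forward into the new rows
  fill(goal_total)
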

#    player year goal_total goals
#1  gretzky 1990          5     5
#2  gretzky 1991         10     5
#3  gretzky 1992         15     5
#4  gretzky 1993         20     5
#5  gretzky 1994         25     5
#6  gretzky 1995         30     5
#7  gretzky 1996         35     5
#8  gretzky 1997         40     5
#9  gretzky 1998         45     5
#10 gretzky 1999         50     5
#11 gretzky 2000         50     0
#12 gretzky 2001         50     0
#....
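
Because the expansion is grouped by player, the same pattern should extend to several players whose careers start and end in different years. The sketch below is an illustration under assumptions, not something from the question: careers, its two players, and the end year last_year are made up, and slice_max() (dplyr 1.0.0 or later) is just one way to pick the leaders per year (n = 1 here to keep the example small; n = 10 would give a running Top 10).

```r
library(dplyr)
library(tidyr)

# Hypothetical data: two made-up players with different career spans
careers <- data.frame(
  player     = c("player_a", "player_a", "player_a",
                 "player_b", "player_b", "player_b"),
  year       = c(1990, 1991, 1992, 1992, 1993, 1994),
  goals      = c(5, 5, 5, 10, 10, 10),
  goal_total = c(5, 10, 15, 10, 20, 30)
)

last_year <- 1995  # assumed end of the expanded range

expanded <- careers %>%
  group_by(player) %>%
  # each player is extended from his own first season to last_year
  complete(year = min(year):last_year, fill = list(goals = 0)) %>%
  # retired players keep their final goal_total
  fill(goal_total) %>%
  ungroup()

# Leader in career goals for each year (ties are kept by default)
leaders <- expanded %>%
  group_by(year) %>%
  slice_max(goal_total, n = 1) %>%
  ungroup()
```

Note that each player only gets rows from his own first season onward, so a player does not appear in years before his career began.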

