Expand Time-Series Data in R
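I have a dataset of NHL players that includes the number of goals each player scored in each season. I compute a running total of goals over each player's career in order to determine the "running" top 10 players.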
toy_data <- data.frame(player=c("gretzky","gretzky","gretzky","gretzky","gretzky","gretzky","gretzky","gretzky","gretzky","gretzky"),
                       goal_total=c(5,10,15,20,25,30,35,40,45,50),
                       goals=c(5,5,5,5,5,5,5,5,5,5),
                       year=c(1990,1991,1992,1993,1994,1995,1996,1997,1998,1999))
    player goal_total goals year
1  gretzky          5     5 1990
2  gretzky         10     5 1991
3  gretzky         15     5 1992
4  gretzky         20     5 1993
5  gretzky         25     5 1994
6  gretzky         30     5 1995
7  gretzky         35     5 1996
8  gretzky         40     5 1997
9  gretzky         45     5 1998
10 gretzky         50     5 1999
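For context, a running total like goal_total could be derived from the per-season goals with a per-player cumulative sum. The following is only a minimal sketch in base R; the name seasons is hypothetical, and it assumes one row per player per season.

```r
# Hypothetical per-season data: same player, goals and year values as toy_data,
# but without the running total.
seasons <- data.frame(
  player = rep("gretzky", 10),
  goals  = rep(5, 10),
  year   = 1990:1999
)

# Sort by player and year so the cumulative sum runs in season order.
seasons <- seasons[order(seasons$player, seasons$year), ]

# ave() applies cumsum() separately within each player.
seasons$goal_total <- ave(seasons$goals, seasons$player, FUN = cumsum)
```

With the values above, goal_total should reproduce the column shown in toy_data.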
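I want to expand the dataset so that players remain in it after their careers end. For example, Wayne Gretzky retired in 1999, but I would like his final goal total to be entered for every subsequent year. The final product would look like this: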
    player goal_total goals year
1  gretzky          5     5 1990
2  gretzky         10     5 1991
3  gretzky         15     5 1992
4  gretzky         20     5 1993
5  gretzky         25     5 1994
6  gretzky         30     5 1995
7  gretzky         35     5 1996
8  gretzky         40     5 1997
9  gretzky         45     5 1998
10 gretzky         50     5 1999
11 gretzky         50     0 2000
12 gretzky         50     0 2001
13 gretzky         50     0 2002
...
And so on, up to 2019. Is there a simple way to do this?
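We can do this with complete() and fill() from tidyr: complete() adds the missing years for each player (setting goals to 0 for the new rows), and fill() carries the last goal_total forward into them.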
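library(dplyr)
library(tidyr)

toy_data %>%
  group_by(player) %>%
  # add a row for every year from the player's first season through 2019;
  # goals is set to 0 in the newly created rows
  complete(year = min(year):2019, fill = list(goals = 0)) %>%
  # carry the last known goal_total down into the added rows
  fill(goal_total)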
#    player   year goal_total goals
#1   gretzky  1990          5     5
#2   gretzky  1991         10     5
#3   gretzky  1992         15     5
#4   gretzky  1993         20     5
#5   gretzky  1994         25     5
#6   gretzky  1995         30     5
#7   gretzky  1996         35     5
#8   gretzky  1997         40     5
#9   gretzky  1998         45     5
#10  gretzky  1999         50     5
#11  gretzky  2000         50     0
#12  gretzky  2001         50     0
#....
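Because the pipeline groups by player, the same approach should carry over to several players whose careers start and end in different years. The sketch below illustrates this with made-up data: player_b and all of its numbers are purely hypothetical, the end year is shortened to 1998 to keep the result small, and an ungroup() is added at the end so later steps do not stay grouped.

```r
library(dplyr)
library(tidyr)

# Made-up two-player data; "player_b" and its values are illustrative only.
two_players <- data.frame(
  player     = c(rep("gretzky", 3), rep("player_b", 2)),
  goal_total = c(5, 10, 15, 4, 8),
  goals      = c(5, 5, 5, 4, 4),
  year       = c(1990, 1991, 1992, 1995, 1996)
)

expanded <- two_players %>%
  group_by(player) %>%
  # min(year) is evaluated within each player's group, so each player
  # starts at their own first season
  complete(year = min(year):1998, fill = list(goals = 0)) %>%
  # carry each player's last goal_total into the added years
  fill(goal_total) %>%
  ungroup()
```

Here each player should get rows from their own first season through 1998, with goal_total held at its final value and goals set to 0 in the added years.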