
Lag/lead groups of observations with R

I have a data frame like this one:

id year var
1  2000 0
1  2000 0
1  2001 1
1  2001 0
1  2001 0
1  2002 1
1  2002 0
1  2003 1
2  2004 0
2  2004 1
2  2005 0
2  2006 0
2  2006 0
2  2007 1

I want to group my data frame by id and year and create a lead variable for each group. In other words, I want to obtain this output:

id year var leadvar
1  2000 0   1
1  2000 0   1
1  2001 1   0
1  2001 1   0
1  2001 1   0
1  2002 0   1
1  2002 0   1
1  2003 1   NA
2  2004 0   0
2  2004 0   0
2  2005 0   0
2  2006 0   1
2  2006 0   1
2  2007 1   NA

where leadvar is the value that var takes for the same id in the subsequent year. Can anyone help me with this?

Many thanks.

One dplyr option could be:

library(dplyr)

df %>%
 group_by(id) %>%
 mutate(lead_var = lead(var)) %>%   # next row's var within each id
 group_by(id, year) %>%
 mutate(lead_var = last(lead_var))  # a year's last row looks into the next year

      id  year   var lead_var
   <int> <int> <int>    <int>
 1     1  2000     0        1
 2     1  2000     0        1
 3     1  2001     1        0
 4     1  2001     1        0
 5     1  2001     1        0
 6     1  2002     0        1
 7     1  2002     0        1
 8     1  2003     1       NA
 9     2  2004     0        0
10     2  2004     0        0
11     2  2005     0        0
12     2  2006     0        1
13     2  2006     0        1
14     2  2007     1       NA
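
Another way to get the same result, as a rough sketch: collapse the data to one row per id and year, take the lead on that smaller table, and join it back. This assumes var is constant within each id-year pair, as it is in the output printed above; the data frame below is rebuilt from the id, year and var columns of that output, and the names by_year and result are only for illustration. Note that lead() picks the next observed year for an id, which is the same as "year + 1" only when there are no gaps in the years.

```r
library(dplyr)

# Example data rebuilt from the id/year/var columns of the printed output
# (assumption: var does not vary within an id-year pair).
df <- data.frame(
  id   = c(1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2),
  year = c(2000, 2000, 2001, 2001, 2001, 2002, 2002, 2003,
           2004, 2004, 2005, 2006, 2006, 2007),
  var  = c(0, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 1)
)

# One row per id-year, then take the next year's value within each id.
by_year <- df %>%
  distinct(id, year, var) %>%
  arrange(id, year) %>%
  group_by(id) %>%
  mutate(lead_var = lead(var)) %>%
  ungroup() %>%
  select(id, year, lead_var)

# Attach the per-year lead back onto every original row.
result <- df %>%
  left_join(by_year, by = c("id", "year"))
```

Unlike the lead()/last() version, this does not depend on the rows of df already being sorted by id and year, because the ordering is done explicitly on the collapsed table. If var did vary within an id-year pair, distinct() would keep several rows for that pair and the join would duplicate rows, so that case would need a summarise() step instead.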
