Create a variable based on specific condition from another column based on Year
I would like to create a new column that shows the first year in V2 in which a value of V3 appears. After a run of null values (zeros in the data below), the column should start again from the first year in which V3 reappears.
That is, given the following data:
I would like to get a new V4 column as follows:
I appreciate any help.
Below are the data:
structure(list(V1 = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3), V2 = c(2005, 2006, 2007, 2008, 2009, 2005, 2006, 2007,
2008, 2009, 2005, 2006, 2007, 2008, 2009), V3 = c(0, 0, 10, 25,
35, 12, 15, 0, 15, 17, 13, 0, 0, 15, 12)), row.names = c(NA,
15L), class = "data.frame")
Using tidyverse and rleid from data.table, you can try the following. You can group_by both V1 and a second grouping variable that marks consecutive runs of zero or nonzero values in V3. This assumes the years are in chronological order within each V1 (if not, you may need to arrange by V2 first).
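library(tidyverse)
library(data.table)

df %>%
  # grp gets a new id each time V3 switches between zero and nonzero
  group_by(V1, grp = rleid(V3 != 0)) %>%
  # within each run: 0 for zero rows, otherwise the first year of the run
  mutate(V4 = ifelse(V3 == 0, 0, first(V2))) %>%
  ungroup() %>%
  select(-grp)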
Output
      V1    V2    V3    V4
   <dbl> <dbl> <dbl> <dbl>
 1     1  2005     0     0
 2     1  2006     0     0
 3     1  2007    10  2007
 4     1  2008    25  2007
 5     1  2009    35  2007
 6     2  2005    12  2005
 7     2  2006    15  2005
 8     2  2007     0     0
 9     2  2008    15  2008
10     2  2009    17  2008
11     3  2005    13  2005
12     3  2006     0     0
13     3  2007     0     0
14     3  2008    15  2008
15     3  2009    12  2008
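For comparison, here is a minimal base R sketch of the same idea, not part of the original answer. It assumes the data are already sorted by V1 and then V2, and that "null" means V3 == 0. The names nonzero, run_id, grp and first_year are only illustrative. The run id is built with cumsum instead of rleid, so no extra packages are needed.

```r
# Same data as in the question, rebuilt here so the block is self-contained
df <- data.frame(
  V1 = rep(c(1, 2, 3), each = 5),
  V2 = rep(c(2005, 2006, 2007, 2008, 2009), times = 3),
  V3 = c(0, 0, 10, 25, 35, 12, 15, 0, 15, 17, 13, 0, 0, 15, 12)
)

# TRUE where V3 has a value, FALSE where it is zero
nonzero <- df$V3 != 0

# Run id: increases by one each time V3 switches between zero and nonzero
run_id <- cumsum(c(TRUE, diff(as.integer(nonzero)) != 0))

# Combine V1 with the run id so that a run never spans two V1 groups
grp <- interaction(df$V1, run_id, drop = TRUE)

# First year of each (V1, run) block
first_year <- ave(df$V2, grp, FUN = function(y) y[1])

# Keep the first year for nonzero rows, 0 for zero rows
df$V4 <- ifelse(nonzero, first_year, 0)

print(df)
# V4: 0 0 2007 2007 2007 2005 2005 0 2008 2008 2005 0 0 2008 2008
```

Grouping on both V1 and the run id matters here: rows 5 and 6 are both nonzero and fall in the same run, but they belong to different V1 groups and therefore get different first years (2007 and 2005).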