

Create a variable based on specific condition from another column based on Year

I would like to create a new column that shows the first year in V2 in which a value appears in V3. After null values (the zeros in V3), the column should start again from the first year in which V3 reappears.

That is, given the following data:

[image: input data table]

I would like to get a new V4 column as follows:

[image: desired output table with the new V4 column]

I appreciate any help.

Below are the data:

df <- structure(list(V1 = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3), V2 = c(2005, 2006, 2007, 2008, 2009, 2005, 2006, 2007,
2008, 2009, 2005, 2006, 2007, 2008, 2009), V3 = c(0, 0, 10, 25,
35, 12, 15, 0, 15, 17, 13, 0, 0, 15, 12)), row.names = c(NA,
15L), class = "data.frame")

Using tidyverse and rleid from data.table, you can try the following. You can group_by both V1 and a second grouping variable based on whether the value in V3 is zero. This assumes the years are in chronological order (if not, you may need to arrange by V2 first).

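library(tidyverse)
library(data.table)  # provides rleid()

df %>%
  # grp changes each time V3 switches between zero and non-zero
  group_by(V1, grp = rleid(V3 != 0)) %>%
  # within a non-zero run, first(V2) is the year the run starts
  mutate(V4 = ifelse(V3 == 0, 0, first(V2))) %>%
  ungroup() %>%
  select(-grp)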

Output

      V1    V2    V3    V4
   <dbl> <dbl> <dbl> <dbl>
 1     1  2005     0     0
 2     1  2006     0     0
 3     1  2007    10  2007
 4     1  2008    25  2007
 5     1  2009    35  2007
 6     2  2005    12  2005
 7     2  2006    15  2005
 8     2  2007     0     0
 9     2  2008    15  2008
10     2  2009    17  2008
11     3  2005    13  2005
12     3  2006     0     0
13     3  2007     0     0
14     3  2008    15  2008
15     3  2009    12  2008
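
To see why the second grouping variable works, and what the "arrange by V2 first" caveat looks like in practice, here is a minimal sketch. It rebuilds the question's data with data.frame() rather than structure(), and the reversed copy named shuffled is only a hypothetical stand-in for data that arrives out of order. It also calls data.table::rleid() without attaching data.table, which is an assumption about preference rather than a requirement: attaching data.table after dplyr masks dplyr's first(), last() and between().

```r
library(dplyr)

df <- data.frame(
  V1 = rep(c(1, 2, 3), each = 5),
  V2 = rep(c(2005, 2006, 2007, 2008, 2009), times = 3),
  V3 = c(0, 0, 10, 25, 35, 12, 15, 0, 15, 17, 13, 0, 0, 15, 12)
)

# rleid() numbers consecutive runs of equal values, so the id changes
# each time V3 switches between zero and non-zero (here: V3 for V1 == 2).
run_id <- data.table::rleid(c(12, 15, 0, 15, 17) != 0)
run_id
# 1 1 2 3 3

# Hypothetical out-of-order input: the same rows in reverse order.
shuffled <- df[rev(seq_len(nrow(df))), ]

result <- shuffled %>%
  arrange(V1, V2) %>%                       # restore chronological order per V1
  group_by(V1, grp = data.table::rleid(V3 != 0)) %>%
  mutate(V4 = ifelse(V3 == 0, 0, first(V2))) %>%
  ungroup() %>%
  select(-grp)
```

With the rows sorted first, result should contain the same V4 values as the output above, regardless of the order in which the rows were supplied.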
