Identifying breaks in consecutive values in R

Question

I have a data frame in R similar to the one below, where the columns are year and week numbers and every row is a specific person. To get the relevant input data on the specific IDs, I have indicators, IND15 and IND16, of whether the person was unemployed in 2015 or 2016. If the observation is '1' the person is unemployed, and if the observation is '0' the person is employed:

ID  y12_01  y12_02  y12_03  y12_04... y12_51  y12_52 y13_01 IND12 IND13  
01    1       1       1       0         0       1        1    1    1   
02    1       1       1       1         1       1        1    1    1   
03    0       0       1       1         0       0        1    1    1

As you can see in the examples above, some of the rows show unemployment in both 2012 and 2013. If the person has a sequence of only unemployment (only 1s) beginning in 2015, I would like to create an indicator of this, and if the person has a 'break' in the sequence (i.e. ID01 or ID03), I would like to create an indicator of this as well.

I suspect part of the solution could include rowSums or a while-loop, but I have not had any luck so far. In SAS I think one could perhaps use an array, but once again I am not quite sure how this would be done in R.

Answer 1

For the first part of the question, try df[df$IND15 == 1 & df$IND16 == 1, "Indicator1"] <- 1.

For the second part, you should be able to do it with a for loop:

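# Assumption: the weekly columns are the ones whose names start with "y"
week_cols <- grep("^y", names(df))
for (i in seq_len(nrow(df))) {
  # Compare inside any(): a 0 in any week means the sequence has a break
  if (any(df[i, week_cols] == 0)) {
    df[i, "Indicator2"] <- 1
  }
}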
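
The row-wise check in the loop above can also be written without a loop, using rowSums as the question suspected. This is only a minimal sketch: the toy data, the week_cols name and the "columns starting with y" rule are assumptions for illustration, not part of the original data.

```r
# Toy data mirroring the question's layout (values are illustrative)
df <- data.frame(
  ID     = c("01", "02", "03"),
  y12_01 = c(1, 1, 0),
  y12_02 = c(1, 1, 0),
  y12_03 = c(1, 1, 1),
  y12_04 = c(0, 1, 1)
)

# Assumption: the weekly columns are exactly those whose names start with "y"
week_cols <- grep("^y", names(df), value = TRUE)

# Count the zeros in each row; one or more zeros means the sequence has a break
df$Indicator2 <- as.integer(rowSums(df[, week_cols] == 0) > 0)
df$Indicator2
```

Unlike the loop, which leaves Indicator2 as NA for rows without a break, this version fills those rows with 0.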

Answer 2

If you wish to retain the wide format, one way to create the indicator would be to multiply the columns. Using the following example data,

d <- read.table(text = "ID  y12_01  y12_02  y12_03  y12_04  y12_51  y12_52 y13_01 IND15 IND16  
01    1       1       1       0         0       1        1    1    1   
02    1       1       1       1         1       1        1    1    1   
03    0       0       1       1         0       0        1    1    1", 
  header = TRUE, stringsAsFactors = FALSE)

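where the relevant columns are assumed to be columns 2 to 7, and the values are assumed to be numeric, we can create an indic column. Because the values are 0 or 1, the product is 1 only when every week in the range is 1; a single 0 makes it 0: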

d$indic <- Reduce(`*`, d[, 2:7])
d
#   ID y12_01 y12_02 y12_03 y12_04 y12_51 y12_52 y13_01 IND15 IND16 indic
# 1  1      1      1      1      0      0      1      1     1     1     0
# 2  2      1      1      1      1      1      1      1     1     1     1
# 3  3      0      0      1      1      0      0      1     1     1     0
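
The product only says whether a row is unbroken. If the position of the first break is also of interest, something along these lines might work. It is a sketch under assumptions: first_break is a hypothetical helper name, and the data frame simply repeats columns 2 to 7 of the example data.

```r
# Same values as columns 1 to 7 of the example data
d <- data.frame(
  ID     = 1:3,
  y12_01 = c(1, 1, 0),
  y12_02 = c(1, 1, 0),
  y12_03 = c(1, 1, 1),
  y12_04 = c(0, 1, 1),
  y12_51 = c(0, 1, 0),
  y12_52 = c(1, 1, 0)
)
week_cols <- 2:7  # assumed positions of the weekly columns

# Hypothetical helper: index of the first 0 in a row, or NA if there is none
first_break <- function(x) {
  zeros <- which(x == 0)
  if (length(zeros) == 0) NA_integer_ else zeros[1]
}

# apply() with MARGIN = 1 passes each row of the weekly columns to the helper
d$first_break <- apply(d[, week_cols], 1, first_break)
d$first_break
# 4 NA 1
```

The index counts positions within week_cols, so 4 refers to the fourth weekly column (y12_04), not to week 4 of any particular year.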
