Merge Rows by ID and Date

Question

I am new to R and have been searching for a way to solve the following problem.

I have a df that looks like this:

id------------Date ------------OB1------ OB2----- OB3
1 ------- 2017-01-01 --------- 1 --------- 0--------- 0
2 ------- 2006-01-05 --------- 1 --------- 0--------- 0
2 ------- 2007-04-19 --------- 0 --------- 1--------- 0
3 ------- 2015-02-23 --------- 0 --------- 0--------- 1
3 ------- 2015-02-23 --------- 1 --------- 0--------- 0

What I have to achieve is shown here:

id------------Date ------------OB1------ OB2----- OB3
1 ------- 2017-01-01 --------- 1 --------- 0--------- 0
2 ------- 2006-01-05 --------- 1 --------- 0--------- 0
2 ------- 2007-04-19 --------- 0 --------- 1--------- 0
3 ------- 2015-02-23 --------- 1 --------- 0--------- 1

That is, I need to combine rows by id and date.

If OB3 is '1' on a given date and OB1 is '1' on the same date (for the same ID), the result must be a single row for that date with '1' for OB1 and '1' for OB3.

I have been trying to apply some of the solutions explained here: Merge rows having same values in multiple columns

But they didn't work.

EDIT: OB1, OB2 and OB3 are boolean values. Thanks for your help!

EDIT 2: aggregate(. ~ ID + Date, df, any) works!

Sample data

Input Data

structure(list(ID = c(-1L, 1L, 1L), Date = c("2008-01-15", "2011-01-21", "2011-01-21"), `OBS1` = c(0, 0, 0), `OBS2` = c(0, 0, 0), `OBS3` = c(0, 0, 0), `OBS4` = c(0, 0, 0), `OBS5` = c(0, 0, 0), `OBS6` = c(0, 1, 0)), .Names = c("ID", "Date", "OBS1", "OBS2", "OBS3", "OBS4", "OBS5", "OBS6"), row.names = c(NA, 3L), class = "data.frame")

Output Data

structure(list(ID = c(-1L, 1L), Date = c("2008-01-15", "2011-01-21"), `OBS1` = c(FALSE, FALSE), `OBS2` = c(FALSE, FALSE), `OBS3` = c(FALSE, FALSE), `OBS4` = c(FALSE, FALSE), `OBS5` = c(FALSE, FALSE), `OBS6` = c(FALSE, TRUE)), .Names = c("ID", "Date", "OBS1", "OBS2", "OBS3", "OBS4", "OBS5", "OBS6"), row.names = c(NA, -2L), class = "data.frame")

Answer 1

The question has already been answered using base R's aggregate() function.
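For reference, the aggregate() call from the question's second edit can be sketched as follows. This is only an illustration: the data frame is re-typed from the table printed in the question, the OB columns are assumed to be logical, and the name res is made up for this example.

```r
# Sample data re-typed from the question (OB columns assumed to be logical)
df <- data.frame(
  ID   = c(1L, 2L, 2L, 3L, 3L),
  Date = c("2017-01-01", "2006-01-05", "2007-04-19", "2015-02-23", "2015-02-23"),
  OB1  = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  OB2  = c(FALSE, FALSE, TRUE, FALSE, FALSE),
  OB3  = c(FALSE, FALSE, FALSE, TRUE, FALSE)
)

# For every ID/Date combination, a column becomes TRUE if any of its rows is TRUE
res <- aggregate(. ~ ID + Date, data = df, FUN = any)

# aggregate() returns the groups in its own order; sort by ID and Date for readability
res <- res[order(res$ID, res$Date), ]
res
```

The two rows for ID 3 on 2015-02-23 collapse into a single row in which both OB1 and OB3 are TRUE.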

However, I felt challenged to turn the sample dataset as printed in the question into a reproducible example (before the OP edited the question to include the results of dput()).

In addition, the OP has mentioned having a "very large df", for which a data.table approach might be worth trying.

Convert sample data into a dataframe

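library(magrittr)    # provides the %>% pipe
library(data.table)  # provides fread()
# Take the table as printed in the question, collapse each run of two or more
# dashes into a single space, and let fread() parse the resulting text.
# logical01 = TRUE asks recent data.table versions to read 0/1 columns as logical.
df <- readr::read_file(
"id------------Date ------------OB1------ OB2----- OB3
1 ------- 2017-01-01 --------- 1 --------- 0--------- 0
2 ------- 2006-01-05 --------- 1 --------- 0--------- 0
2 ------- 2007-04-19 --------- 0 --------- 1--------- 0
3 ------- 2015-02-23 --------- 0 --------- 0--------- 1
3 ------- 2015-02-23 --------- 1 --------- 0--------- 0"
) %>% stringr::str_replace_all("[-]{2,}", " ") %>%
  fread(logical01 = TRUE)
df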

   id       Date   OB1   OB2   OB3
1:  1 2017-01-01  TRUE FALSE FALSE
2:  2 2006-01-05  TRUE FALSE FALSE
3:  2 2007-04-19 FALSE  TRUE FALSE
4:  3 2015-02-23 FALSE FALSE  TRUE
5:  3 2015-02-23  TRUE FALSE FALSE

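Note that fread() has read the 0/1 columns as logical. Recent data.table versions read such columns as integer by default; passing logical01 = TRUE to fread() gives logical columns.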

Aggregate

library(data.table)
setDT(df)[, lapply(.SD, any), by = .(id, Date)]

   id       Date   OB1   OB2   OB3
1:  1 2017-01-01  TRUE FALSE FALSE
2:  2 2006-01-05  TRUE FALSE FALSE
3:  2 2007-04-19 FALSE  TRUE FALSE
4:  3 2015-02-23  TRUE FALSE  TRUE

In case the OP expects integer values 0 and 1 instead of logical values, these can be created in one go:

setDT(df)[, lapply(.SD, function(x) as.integer(any(x))), by = .(id, Date)]

   id       Date OB1 OB2 OB3
1:  1 2017-01-01   1   0   0
2:  2 2006-01-05   1   0   0
3:  2 2007-04-19   0   1   0
4:  3 2015-02-23   1   0   1
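The same pattern might also be applied to the dput() sample the OP added later, where the OBS columns are numeric 0/1 rather than logical. The following is a minimal sketch under that assumption; the names input and res are illustrative only, and comparing with == 1 is just one way to avoid the coercion warning that any() gives on numeric input.

```r
library(data.table)

# Input data as posted in the question via dput(), written out as a plain data.frame
input <- data.frame(
  ID   = c(-1L, 1L, 1L),
  Date = c("2008-01-15", "2011-01-21", "2011-01-21"),
  OBS1 = c(0, 0, 0), OBS2 = c(0, 0, 0), OBS3 = c(0, 0, 0),
  OBS4 = c(0, 0, 0), OBS5 = c(0, 0, 0), OBS6 = c(0, 1, 0)
)

# Per ID/Date group, flag a column as TRUE if any of its values equals 1
res <- as.data.table(input)[, lapply(.SD, function(x) any(x == 1)), by = .(ID, Date)]
res
```

This yields one row per ID/Date pair, with OBS6 being TRUE only for ID 1 on 2011-01-21, in line with the output data shown in the question.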

solution1
3 ACCEPTED 2018-02-12 13:16:20