Collapsing dummy columns in R

Question

I have a tibble with multiple rows per person. Every row for a given person contains exactly the same data, EXCEPT for the final several columns (below, "won" and "lost"), which contain 1/0 dummy variables. The values of the dummies vary across a person's rows.

Example dataframe:

df <- data.frame(name = c("Anne", "Anne", "Anne", "Joe", "Joe", "Joe", "Kyle", "Kyle", "Kyle", "Tom", "Tom", "Tom"), age = c("13", "13", "13", "15", "15", "15", "12", "12", "12", "14", "14", "14"), won = c(1,0,0,0,0,1,0,1,0,0,0,0), lost = c(0,1,0,0,1,0,1,0,0,0,1,0))

I would like to collapse the rows such that there is only one row for each person. In my collapsed dataframe, I would like the values of "won" and "lost" (the dummy columns) to be "1" for a person if that person had ANY "1"s in that column in the original dataset. Otherwise, I would like the value to be "0."

Collapsed dataframe:

df_collapsed <- data.frame(name = c("Anne", "Joe", "Kyle", "Tom"), age = c("13","15","12","14"), won = c(1,1,1,0), lost = c(1,1,1,1))

Please let me know if you have any ideas. I can't do this manually (as in the example) because my actual dataset is much larger. I have been thinking through this problem for some time but am unable to figure out how to collapse the dataframe accordingly.

Answer 1

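We may use max after grouping. Because the dummies are 0/1, the maximum within a group is 1 if any row in that group has a 1, and 0 otherwise: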

library(dplyr)
df %>%
   group_by(name, age) %>% 
   summarise(across(everything(), max), .groups = 'drop')
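If the real data has further columns besides the grouping keys and the dummies, everything() would apply max to those as well. A minimal sketch of a variant that names the dummy columns explicitly and spells out the "any 1" rule with any(); the name `res_dplyr` is only illustrative, and the column selection c(won, lost) is an assumption that would need to be adapted to the actual dummy columns:

```r
library(dplyr)

# Example data from the question
df <- data.frame(
  name = rep(c("Anne", "Joe", "Kyle", "Tom"), each = 3),
  age  = rep(c("13", "15", "12", "14"), each = 3),
  won  = c(1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0),
  lost = c(0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0)
)

res_dplyr <- df %>%
  group_by(name, age) %>%
  # 1 if any row in the group has a 1 in that column, otherwise 0
  summarise(across(c(won, lost), ~ as.integer(any(.x == 1))), .groups = "drop")

res_dplyr
```

For 0/1 columns this gives the same values as max; the only difference is that the result is stored as integer rather than double.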

Or in base R:

aggregate(. ~ name + age, df, max)
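As a quick sanity check, the base R call can be run on the example data from the question. This is only a sketch: the name `res` and the sorting step are illustrative additions, not part of the original answer. The sort is there because aggregate() orders its output by the grouping values rather than keeping the original row order.

```r
# Example data from the question
df <- data.frame(
  name = rep(c("Anne", "Joe", "Kyle", "Tom"), each = 3),
  age  = rep(c("13", "15", "12", "14"), each = 3),
  won  = c(1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0),
  lost = c(0, 1, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0)
)

# Take the max of every non-grouping column within each name/age group
res <- aggregate(. ~ name + age, df, max)

# Sort by name so the rows are easy to compare with the input
res <- res[order(as.character(res$name)), ]
rownames(res) <- NULL
res
```

Each person ends up with a single row, and a dummy is 1 exactly when at least one of that person's original rows had a 1 in that column.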

Answer 1: accepted, 2022-04-01 15:50:25
