Merge data frame by count in R

Question

I have the two data frames below.

set.seed(12345)

df1 <- data.frame(
  y1 = sample(rep(c(0:1),length.out = 50)),
  y2 = sample(rep(c(0:1),length.out = 50)),
  y3 = sample(rep(c(0:1),length.out = 50)),
  y4 = sample(rep(c(0:1),length.out = 50)),
  y5 = sample(rep(c(0:1),length.out = 50)),
  y6 = sample(rep(c(0:1),length.out = 50))
)

df2 <- data.frame(x = c("y1","y2","y1:y2","y2:y3:y4","y5","y6"))

I want to merge these two data frames so that the result shows the count of 1s for each element of x. My other problem is that some entries in the second data frame name more than one column, separated by ":". This makes it hard for me to do automatically. Below is the table I want to achieve:

         x count
1       y1    25
2       y2    25
3    y1:y2    11
4 y2:y3:y4     8
5       y5    25
6       y6    25
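
To make the target concrete, here is a small sketch on a made-up data frame. It assumes the count for a combined entry such as "y1:y2" means the number of rows in which all of the named columns are 1; `toy` and its values are hypothetical and not part of the data above.

```r
# Made-up 4-row example; `toy` is hypothetical, not the question's df1.
toy <- data.frame(
  y1 = c(1, 0, 1, 1),
  y2 = c(1, 1, 0, 1),
  y3 = c(1, 1, 1, 0)
)

# A single name ("y1"): count the 1s in that column.
n_y1 <- sum(toy$y1 == 1)

# "y1:y2": count the rows where y1 and y2 are both 1 (rows 1 and 4).
n_y1_y2 <- sum(toy$y1 == 1 & toy$y2 == 1)

# "y1:y2:y3": count the rows where all three are 1 (row 1 only).
n_y1_y2_y3 <- sum(toy$y1 == 1 & toy$y2 == 1 & toy$y3 == 1)

print(c(n_y1, n_y1_y2, n_y1_y2_y3))
# [1] 3 2 1
```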

Answer 1

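We can get the column-wise sums of 'df1' with colSums. Identify the elements of 'x' that contain ":" using grep. Then split those elements (index 'i1') on ":", subset the 'df1' columns named in each list element, and use Reduce with & so that we get TRUE only for rows in which all of the selected columns are 1. Take the sum of that logical vector, write it into the matching positions of 'v1', and create the 'count' column from 'v1'. Note that the assignment is by position: it works here because 'df2' has six rows and its single-name entries (y1, y2, y5, y6) sit in the same positions as the corresponding columns of 'df1'.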

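# number of 1s in each column of df1
v1 <- colSums(df1)
# positions in df2$x that name more than one column
i1 <- grep(':', df2$x)
# for each combined entry, count the rows where all named columns are 1
v1[i1] <- sapply(strsplit(as.character(df2$x[i1]), ':'),
           function(x) sum(Reduce(`&`,df1[x])))
df2$count <- v1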
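
The assignment `v1[i1] <- ...` above is positional, so it depends on the rows of 'df2' lining up with the columns of 'df1'. As a hedged sketch of a variant that resolves every entry by name instead, the block below splits each element of 'x' on ":" and counts the rows where all named columns equal 1. `count_all_ones` is a hypothetical helper introduced for illustration, not part of the original answer.

```r
set.seed(12345)

# Same data as in the question.
df1 <- data.frame(
  y1 = sample(rep(0:1, length.out = 50)),
  y2 = sample(rep(0:1, length.out = 50)),
  y3 = sample(rep(0:1, length.out = 50)),
  y4 = sample(rep(0:1, length.out = 50)),
  y5 = sample(rep(0:1, length.out = 50)),
  y6 = sample(rep(0:1, length.out = 50))
)
df2 <- data.frame(x = c("y1", "y2", "y1:y2", "y2:y3:y4", "y5", "y6"))

# Hypothetical helper: number of rows of `data` in which every column
# named in `key` (names separated by ":") equals 1.
count_all_ones <- function(key, data) {
  cols <- strsplit(key, ":", fixed = TRUE)[[1]]
  sum(rowSums(data[cols] == 1) == length(cols))
}

# Apply the helper to every entry of x, single or combined.
df2$count <- vapply(as.character(df2$x), count_all_ones, integer(1),
                    data = df1, USE.NAMES = FALSE)
print(df2)
```

Because each entry is looked up by name, the rows of 'df2' can be in any order and their number does not have to match the number of columns in 'df1'.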

solution1
6 ACCEPTED 2015-12-14 12:33:10