
Merge data frame by count in R

I have the two data frames below.

set.seed(12345)

df1 <- data.frame(
  y1 = sample(rep(c(0:1),length.out = 50)),
  y2 = sample(rep(c(0:1),length.out = 50)),
  y3 = sample(rep(c(0:1),length.out = 50)),
  y4 = sample(rep(c(0:1),length.out = 50)),
  y5 = sample(rep(c(0:1),length.out = 50)),
  y6 = sample(rep(c(0:1),length.out = 50))
)

df2 <- data.frame(x = c("y1","y2","y1:y2","y2:y3:y4","y5","y6"))

I want to merge these two data frames, but the result of the merge should show the count of 1s for each element. My other problem is that in the second data frame, some rows contain more than one column name, separated by ":". This makes it hard for me to do this automatically. Below is the table I want to achieve:

         x count
1       y1    25
2       y2    25
3    y1:y2    11
4 y2:y3:y4     8
5       y5    25
6       y6    25

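We can get the column-wise sums of 'df1' with colSums. Identify the elements of 'x' that contain ":" using grep. Then split those elements of 'x' (selected by the index 'i1') on ":", subset the 'df1' columns named in each list element, and use Reduce with & so that a row is TRUE only when all the selected columns are 1 in that row. Take the sum of that logical vector, overwrite the corresponding positions of 'v1' with it, and assign 'v1' as the 'count' column. Note that the positions of 'v1' that are not overwritten keep the sums of the 'df1' columns in 'df1' order, so this relies on the single-column rows of 'df2' sitting at the same positions as the matching columns of 'df1', as they do in the example.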

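# Column-wise count of 1s; assumes the single-column rows of df2$x
# sit at the same positions as the matching columns of df1
v1 <- colSums(df1)
# Rows of df2$x that combine several columns with ':'
i1 <- grep(':', df2$x)
# For those rows, count the rows of df1 where every listed column is 1
v1[i1] <- sapply(strsplit(as.character(df2$x[i1]), ':'), 
           function(x) sum(Reduce(`&`, df1[x])))
df2$count <- v1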
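If the rows of 'df2' are not in the same order as the columns of 'df1', or the two have different lengths, a variant that treats every row the same way may be safer. The following is only a sketch on a small made-up data set, not the code from the answer above: the toy 'df1' and 'df2' and the helper name count_ones are assumptions introduced for illustration. It splits every element of 'x' on ":" and counts the rows in which all of the named columns equal 1.

```r
# Toy data (hypothetical, much smaller than the data in the question)
df1 <- data.frame(
  y1 = c(1, 1, 0, 1),
  y2 = c(1, 0, 0, 1),
  y3 = c(1, 1, 1, 0)
)
# Note that the rows are deliberately not in the column order of df1
df2 <- data.frame(x = c("y2", "y1:y2", "y1:y2:y3", "y1"),
                  stringsAsFactors = FALSE)

# count_ones is a hypothetical helper name, not part of the original answer
count_ones <- function(spec, data) {
  # "y1:y2" becomes c("y1", "y2"); a single name stays as it is
  cols <- strsplit(spec, ":", fixed = TRUE)[[1]]
  # A row counts only when every selected column equals 1 in that row
  sum(rowSums(data[cols] == 1) == length(cols))
}

# Apply the helper to every row of df2, whether or not it contains ":"
df2$count <- vapply(as.character(df2$x), count_ones, integer(1),
                    data = df1, USE.NAMES = FALSE)
print(df2)
```

Because each count is looked up by column name rather than by position, the result does not depend on how the rows of 'df2' are ordered.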


 