Merge data frames by count in R

I have the two data frames below.

set.seed(12345)

df1 <- data.frame(
  y1 = sample(rep(c(0:1),length.out = 50)),
  y2 = sample(rep(c(0:1),length.out = 50)),
  y3 = sample(rep(c(0:1),length.out = 50)),
  y4 = sample(rep(c(0:1),length.out = 50)),
  y5 = sample(rep(c(0:1),length.out = 50)),
  y6 = sample(rep(c(0:1),length.out = 50))
)

df2 <- data.frame(x = c("y1","y2","y1:y2","y2:y3:y4","y5","y6"))

I want to merge these two data frames so that the result shows the count of 1s for each element of x. My other problem is that in the second data frame, some entries contain more than one column name, separated by ":". This makes it hard for me to do this automatically. Below is the table I want to achieve:

        x count
1       y1    25
2       y2    25
3    y1:y2    11
4 y2:y3:y4     8
5       y5    25
6       y6    25

We can get the column-wise sums of 'df1' with colSums. Identify the elements of 'x' that contain ':' using grep. Then split those elements of 'x' (selected by the index 'i1') on ':', subset the 'df1' columns named in each list element, and use Reduce with & so that we get TRUE only when all the elements in the same row are 1. Take the sum of that logical vector, assign the results into 'v1' at the positions in 'i1', and create the 'count' column from 'v1'.

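# Number of 1s in each column of df1
v1 <- colSums(df1)
# Positions in df2$x that name more than one column
i1 <- grep(':', df2$x)
# For those entries, count the rows where every listed column is 1
v1[i1] <- sapply(strsplit(as.character(df2$x[i1]), ':'), 
           function(x) sum(Reduce(`&`, df1[x])))
df2$count <- v1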
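
The snippet above fills 'v1' by position, which lines up here because the single-column entries of df2$x sit at the same positions as their columns in df1. As a possible alternative, here is a minimal sketch that looks up every entry by column name instead, so the order and number of rows in the second data frame should not matter. The helper count_all_ones and the small toy and spec data frames are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the original answer): count the rows of
# `data` in which every column named in `spec` (names separated by ":") is 1.
count_all_ones <- function(spec, data) {
  cols <- strsplit(spec, ":", fixed = TRUE)[[1]]
  # A row qualifies when the number of 1s across `cols` equals length(cols)
  sum(rowSums(data[, cols, drop = FALSE] == 1) == length(cols))
}

# Small deterministic toy data standing in for df1 and df2
toy <- data.frame(
  y1 = c(1, 1, 0, 1),
  y2 = c(1, 0, 0, 1),
  y3 = c(0, 1, 1, 1)
)
spec <- data.frame(x = c("y3", "y1:y2", "y1:y2:y3"), stringsAsFactors = FALSE)

# Apply the helper to each entry of x, matching columns by name
spec$count <- vapply(as.character(spec$x), count_all_ones, integer(1),
                     data = toy, USE.NAMES = FALSE)
print(spec)
```

With the data from the question, the same call would be df2$count <- vapply(as.character(df2$x), count_all_ones, integer(1), data = df1, USE.NAMES = FALSE).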
