Multiply column values in one data.frame by a column in another data.frame, subject to a condition, in R

Question

I have two data frames in R that I am trying to combine based on the values in a column of each.

df1=data.frame(comp=c("comp1", "comp2", "comp3","comp1"),
 state1=c(1,0,0,1),
 state2=c(1,1,0,1),
 state3=c(0,1,1,0),
 state4=c(0,0,1,0),year=c(1,1,1,2))

   comp state1 state2 state3 state4 year
1 comp1      1      1      0      0    1
2 comp2      0      1      1      0    1
3 comp3      0      0      1      1    1
4 comp1      1      1      0      0    2

df2=data.frame(state=c("state1","state2", "state3", "state4", 
                       "state1","state2", "state3", "state4"), 
 var1=c(1,0,0,1,0,0,1,1), 
 var2=c(0,1,0,0,0,1,1,0), 
 year=c(1,1,1,1,2,2,2,2))

   state var1 var2 year
1 state1    1    0    1
2 state2    0    1    1
3 state3    0    0    1
4 state4    1    0    1
5 state1    0    1    2
6 state2    0    1    2
7 state3    1    1    2
8 state4    1    0    2

I'd like to append two columns, var1 and var2, to df1, each holding the mean of that variable over all states for that comp.

So, var1 for comp1 in year 1 should be (1*1 + 1*0 + 0*0 + 0*1)/(1+1), that is, sum(state*var)/sum(state for comp), computed by year.
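As a quick check of that arithmetic, here is a minimal sketch for comp1 in year 1. The vector names `state` and `var1` are only illustrative; the values are row 1 of df1 and the year-1 var1 entries of df2.

```r
# 0/1 indicators for comp1 (row 1 of df1), in the order state1..state4
state <- c(1, 1, 0, 0)
# var1 for state1..state4 in year 1 (first four rows of df2)
var1 <- c(1, 0, 0, 1)

# sum(state * var) / sum(state): the mean of var1 over the states comp1 is in
result <- sum(state * var1) / sum(state)
result
# 0.5
```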

df3 would look like:

       comp state1 state2 state3 state4 year var1 var2
    1 comp1      1      1      0      0    1  0.5  0.5
    2 comp2      0      1      1      0    1  0.0  0.5
    3 comp3      0      0      1      1    1  0.5  0.0
    4 comp1      1      1      0      0    2  0.5  1.0

Is this possible? I tried using ddply to take the mean of var1, summarizing by comp and year, but that doesn't work: I end up with more than one row per comp per year.

Thanks in advance. The question most similar to my problem is "Multiply various subsets of a data frame by different vectors", but it doesn't show a condition on the second data set.

Please advise.

Answer 1

My hope is that by breaking this into segments you can find out why my results differ from your prediction:

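 # One row per row of df1; the two columns hold the var1 and var2 sums
 df3 <- matrix(NA, ncol=2, nrow=nrow(df1))
 for (i in seq_len(nrow(df1))) {
     # Sum of var * state over the df2 rows for the same year as row i
     # (relies on those rows being in the same order as columns 2:5 of df1)
     df3[i, 1] <- sum(df2[ df2$year==df1$year[i], "var1"] * df1[i, 2:5])
     df3[i, 2] <- sum(df2[ df2$year==df1$year[i], "var2"] * df1[i, 2:5])
 }
 # Divide each row by the number of states that comp is in to get the mean
 m4 <- df3/rowSums(df1[2:5])
 cbind(df1, m4)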
#---------------
   comp state1 state2 state3 state4 year   1         2
1 comp1      1      1      0      0    1 0.5 0.5000000
2 comp2      0      1      1      0    1 0.0 0.3333333
3 comp3      0      0      1      1    1 0.5 0.0000000
4 comp1      1      1      0      0    2 0.0 0.3333333

Seems to match up OK on the "var1" entries, and I'm hoping you just threw in some guesses for "var2".
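The loop above multiplies by position, so it depends on the rows of df2 within each year being in the same order as the state columns of df1. A variant that aligns by state name with `match()` and labels the result columns might look like the sketch below. The helper `row_means` and the names `state_cols` and `var_cols` are hypothetical, introduced here for illustration; the data are retyped from the `data.frame()` calls in the question.

```r
# Data as constructed by the data.frame() calls in the question
df1 <- data.frame(comp = c("comp1", "comp2", "comp3", "comp1"),
                  state1 = c(1, 0, 0, 1),
                  state2 = c(1, 1, 0, 1),
                  state3 = c(0, 1, 1, 0),
                  state4 = c(0, 0, 1, 0),
                  year = c(1, 1, 1, 2))
df2 <- data.frame(state = c("state1", "state2", "state3", "state4",
                            "state1", "state2", "state3", "state4"),
                  var1 = c(1, 0, 0, 1, 0, 0, 1, 1),
                  var2 = c(0, 1, 0, 0, 0, 1, 1, 0),
                  year = c(1, 1, 1, 1, 2, 2, 2, 2))

state_cols <- paste0("state", 1:4)  # indicator columns in df1
var_cols <- c("var1", "var2")       # columns of df2 to average

# Hypothetical helper: mean of each var over the states flagged in row i of df1
row_means <- function(i) {
  w <- unlist(df1[i, state_cols], use.names = FALSE)  # 0/1 weights for this comp
  yr <- df2[df2$year == df1$year[i], ]                # df2 rows for the same year
  # Reorder the df2 rows to follow state_cols, whatever their original order
  v <- as.matrix(yr[match(state_cols, yr$state), var_cols])
  colSums(v * w) / sum(w)  # w recycles down each column of v
}

# sapply returns one column per row of df1, so transpose before binding
df3 <- cbind(df1, t(sapply(seq_len(nrow(df1)), row_means)))
df3
```

Because `colSums()` keeps the column names of `v`, the appended columns come out named var1 and var2 rather than 1 and 2. A comp with no states flagged would give NaN here, since `sum(w)` is then zero.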

Answer 1: accepted, 2012-03-09 20:43:54
