How do I find the difference between two values in different dataframes across multiple rows and columns?

I have two dataframes:

df1:

WAV    UV     VIOLET    BLUE
sD1    10.8   10.1      23.5
sA4    6.2    8.2       19.9
sA1    8.3    11.7      28.6
sC2    7.9    8.2       31.0
sC3    10.7   9.5       18.1

df2:

ID    UV     VIOLET    BLUE
D1    7.9    10.1      19.3
D2    7.0    9.2       15.9
D3    21.4   20.7      27.4
D4    10.3   8.9       20.9
D5    21.7   16.5      21.3

I want to take the sum of the columns of row D1 in df2, subtract from it the sum of the columns of each row in df1, and store the results in a new dataframe. This then needs to be repeated for D2 of df2 against every row of df1, and so on. Each difference between two row sums should be a separate entry in the new dataframe, and the differences for each row of df2 should form one row of the output. So the output should look like this:

D1    sum(D1)-sum(sD1)  sum(D1)-sum(sA4)  sum(D1)-sum(sA1)  sum(D1)-sum(sC2)  sum(D1)-sum(sC3)
D2    sum(D2)-sum(sD1)  sum(D2)-sum(sA4)  sum(D2)-sum(sA1)  sum(D2)-sum(sC2)  sum(D2)-sum(sC3)
D3    sum(D3)-sum(sD1)  sum(D3)-sum(sA4)  sum(D3)-sum(sA1)  sum(D3)-sum(sC2)  sum(D3)-sum(sC3)
D4    sum(D4)-sum(sD1)  sum(D4)-sum(sA4)  sum(D4)-sum(sA1)  sum(D4)-sum(sC2)  sum(D4)-sum(sC3)
D5    sum(D5)-sum(sD1)  sum(D5)-sum(sA4)  sum(D5)-sum(sA1)  sum(D5)-sum(sC2)  sum(D5)-sum(sC3)

I'm open to any suggestions.

Here are three ways, two of which were already mentioned by @Onyambu in a comment. Of these, the outer option seems to be the fastest.

outer(rowSums(df2[,-1]), rowSums(df1[,-1]), "-")

or

t(sapply(rowSums(df2[-1]), "-", rowSums(df1[-1])))

or

sapply(rowSums(df1[,-1]), function(x) rowSums(df2[,-1]) - x)
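
The speed claim is easy to check on larger inputs. The following is only a rough sketch with made-up random sums: the vector length and the names s1 and s2 are assumptions, not taken from the question. The df2 sums go first so that rows follow df2. Timings vary by machine, so none are quoted here.

```r
set.seed(1)
s1 <- runif(1000)  # stand-in for rowSums(df1[, -1])
s2 <- runif(1000)  # stand-in for rowSums(df2[, -1])

# Both build a matrix with one row per element of s2 and one column per element of s1
t_outer  <- system.time(m_outer  <- outer(s2, s1, "-"))
t_sapply <- system.time(m_sapply <- sapply(s1, function(x) s2 - x))

# The two approaches should produce the same numbers
stopifnot(isTRUE(all.equal(m_outer, m_sapply)))

# Elapsed seconds for each approach
print(c(outer = t_outer[["elapsed"]], sapply = t_sapply[["elapsed"]]))
```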

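So for instance, to get the layout asked for in the question (one row per row of df2, one column per row of df1, each entry being the df2 row sum minus the df1 row sum), you can do:

df <- data.frame(outer(rowSums(df2[,-1]), rowSums(df1[,-1]), "-"))
colnames(df) <- df1$WAV
rownames(df) <- df2$ID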
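
As a self-contained check, the sketch below rebuilds the two dataframes from the question and applies outer with the df2 sums as the first argument, so that rows follow df2 and each entry is sum(df2 row) - sum(df1 row). The names s1, s2 and res are only illustrative.

```r
# Data from the question
df1 <- data.frame(
  WAV    = c("sD1", "sA4", "sA1", "sC2", "sC3"),
  UV     = c(10.8, 6.2, 8.3, 7.9, 10.7),
  VIOLET = c(10.1, 8.2, 11.7, 8.2, 9.5),
  BLUE   = c(23.5, 19.9, 28.6, 31.0, 18.1)
)
df2 <- data.frame(
  ID     = c("D1", "D2", "D3", "D4", "D5"),
  UV     = c(7.9, 7.0, 21.4, 10.3, 21.7),
  VIOLET = c(10.1, 9.2, 20.7, 8.9, 16.5),
  BLUE   = c(19.3, 15.9, 27.4, 20.9, 21.3)
)

# Row sums over the numeric columns, named by the identifier column
s1 <- setNames(rowSums(df1[, -1]), df1$WAV)
s2 <- setNames(rowSums(df2[, -1]), df2$ID)

# res[i, j] = s2[i] - s1[j]; outer() takes its dimnames from the vector names
res <- as.data.frame(outer(s2, s1, "-"))
print(round(res, 1))
```

For D1 and sD1, for example, the row sums are 37.3 and 44.4, so the first entry is 37.3 - 44.4 = -7.1.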
