Match row and column, then subtract a value
I am reconciling two data sets. A has a list of transactions and a value. B contains several values from after a process. I want to subtract each value in A from the identified field in B, where idB identifies the row and column names the column.
library(tidyverse)
A<-tribble(
~idA, ~group, ~column, ~value, ~idB,
1, "x", "t1", 11, 1,
2, "x", "t1", 22, 3,
3, "x", "t3", 33, 4,
4, "x", "t1", 25, 5)
B<-tribble(
~idB, ~group, ~t1, ~t2, ~t3,
1, "x", 11, 0, 0,
2, "x", 0, 11, 0,
3, "x", 22, 0, 0 ,
4, "x", 0, 0, 33,
5, "x", 50, 50, 50)
Desired output:
Boutput<-tribble(
~idB, ~group, ~t1, ~t2, ~t3,
1, "x", 0, 0, 0,
2, "x", 0, 11, 0,
3, "x", 0, 0, 0,
4, "x", 0, 0, 0,
5, "x", 25, 50, 50)
I've tried an inner_join followed by mutating based on rules.
How can I mathematically subtract the matched values?
I was hesitating about posting this, but thought it might be helpful in looking at some alternative solutions.
I might consider converting A from long to wide first:
Awide <- A %>%
  pivot_wider(names_from = column, values_from = value)
R> Awide
# A tibble: 4 x 5
    idA group   idB    t1    t3
  <dbl> <chr> <dbl> <dbl> <dbl>
1     1 x         1    11    NA
2     2 x         3    22    NA
3     3 x         4    NA    33
4     4 x         5    25    NA
In this case, there are no values for t2. Before joining A and B, I would make sure there are columns for all three (t1, t2, t3):
cols <- c("idA", "group", "idB", "t1", "t2", "t3")
missing <- setdiff(cols, names(Awide))
Awide[missing] <- NA
Awide <- Awide[cols]
R> Awide
# A tibble: 4 x 6
    idA group   idB    t1 t2       t3
  <dbl> <chr> <dbl> <dbl> <lgl> <dbl>
1     1 x         1    11 NA       NA
2     2 x         3    22 NA       NA
3     3 x         4    NA NA       33
4     4 x         5    25 NA       NA
Then I could do a left_join and replace any NAs with zero so that the subtraction later works:
AB <- left_join(B, Awide, by=c("idB", "group")) %>%
mutate_at(c("t1.y", "t2.y", "t3.y"), ~replace(., is.na(.), 0))
R> AB
# A tibble: 5 x 9
    idB group  t1.x  t2.x  t3.x   idA  t1.y  t2.y  t3.y
  <dbl> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1 x        11     0     0     1    11     0     0
2     2 x         0    11     0    NA     0     0     0
3     3 x        22     0     0     2    22     0     0
4     4 x         0     0    33     3     0     0    33
5     5 x        50    50    50     4    25     0     0
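As a side note, mutate_at is superseded in dplyr 1.0.0 and later in favour of across. A minimal sketch of the same join and NA replacement, assuming a recent dplyr and that every suffixed column from Awide ends in ".y" (the default join suffix); Awide is typed in by hand here so the block runs on its own:

```r
library(tidyverse)

B <- tribble(
  ~idB, ~group, ~t1, ~t2, ~t3,
  1, "x", 11,  0,  0,
  2, "x",  0, 11,  0,
  3, "x", 22,  0,  0,
  4, "x",  0,  0, 33,
  5, "x", 50, 50, 50
)

# Awide as it stands after the missing t2 column has been added.
Awide <- tribble(
  ~idA, ~group, ~idB, ~t1, ~t2, ~t3,
  1, "x", 1, 11, NA, NA,
  2, "x", 3, 22, NA, NA,
  3, "x", 4, NA, NA, 33,
  4, "x", 5, 25, NA, NA
)

# Join, then zero-fill every column that came from Awide (suffix ".y").
# replace() coerces the all-NA logical t2.y column to numeric.
AB <- left_join(B, Awide, by = c("idB", "group")) %>%
  mutate(across(ends_with(".y"), ~ replace(.x, is.na(.x), 0)))
```

Selecting by suffix avoids listing the columns by name, which may help if the number of t columns changes.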
Then I would do the subtraction on the columns that match the patterns t*.x and t*.y (alternatives could be used depending on what you need):
tdiff <- AB[,grepl("^t.*\\.x$", names(AB))] - AB[,grepl("^t.*\\.y$", names(AB))]
R> tdiff
  t1.x t2.x t3.x
1    0    0    0
2    0   11    0
3    0    0    0
4    0    0    0
5   25   50   50
Then bind these differences to the id columns of AB to get the final result:
cbind(AB[,1:2,drop=FALSE], tdiff)
  idB group t1.x t2.x t3.x
1   1     x    0    0    0
2   2     x    0   11    0
3   3     x    0    0    0
4   4     x    0    0    0
5   5     x   25   50   50
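One possible alternative to the wide join above is to go the other way and reshape B to long format, so that each (idB, column) cell becomes a row that can be joined to A directly. This is only a sketch: the names A_totals, B_out, before and after are made up for illustration, and summing A per cell is an assumption to cover several transactions hitting the same cell (the sample data has at most one).

```r
library(tidyverse)

A <- tribble(
  ~idA, ~group, ~column, ~value, ~idB,
  1, "x", "t1", 11, 1,
  2, "x", "t1", 22, 3,
  3, "x", "t3", 33, 4,
  4, "x", "t1", 25, 5
)
B <- tribble(
  ~idB, ~group, ~t1, ~t2, ~t3,
  1, "x", 11,  0,  0,
  2, "x",  0, 11,  0,
  3, "x", 22,  0,  0,
  4, "x",  0,  0, 33,
  5, "x", 50, 50, 50
)

# Total to subtract per cell (assumption: repeated cells should be summed).
A_totals <- A %>%
  group_by(idB, group, column) %>%
  summarise(value = sum(value), .groups = "drop")

B_out <- B %>%
  # One row per (idB, group, column) cell of B.
  pivot_longer(c(t1, t2, t3), names_to = "column", values_to = "before") %>%
  left_join(A_totals, by = c("idB", "group", "column")) %>%
  # Cells with no matching transaction get 0 subtracted.
  mutate(after = before - replace(value, is.na(value), 0)) %>%
  select(idB, group, column, after) %>%
  # Back to the original wide layout.
  pivot_wider(names_from = column, values_from = after)
```

With this layout there is no need to add a placeholder t2 column, because unmatched cells simply get no value from the join.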
This is the loop I've come up with:
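Bout <- B
for (i in seq_len(nrow(A))) {         # iterate over rows of A, not over idA values
  row <- match(A$idB[i], Bout$idB)    # find the B row by id rather than by position
  col <- A$column[i]                  # column of B named by this transaction
  Bout[[col]][row] <- Bout[[col]][row] - A$value[i]
}
Bout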
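The loop above could probably also be written without iteration, using base R matrix indexing with a two-column (row, column) index. A rough sketch, assuming every row of A matches a row and a column of B and that no (idB, column) pair occurs more than once in A (with repeated pairs only one subtraction would take effect); vals, idx and Bout_vec are illustrative names:

```r
library(tidyverse)

A <- tribble(
  ~idA, ~group, ~column, ~value, ~idB,
  1, "x", "t1", 11, 1,
  2, "x", "t1", 22, 3,
  3, "x", "t3", 33, 4,
  4, "x", "t1", 25, 5
)
B <- tribble(
  ~idB, ~group, ~t1, ~t2, ~t3,
  1, "x", 11,  0,  0,
  2, "x",  0, 11,  0,
  3, "x", 22,  0,  0,
  4, "x",  0,  0, 33,
  5, "x", 50, 50, 50
)

# Numeric part of B as a matrix so cells can be addressed by (row, col) pairs.
vals <- as.matrix(B[, c("t1", "t2", "t3")])

# One (row, col) pair per transaction in A.
idx <- cbind(match(A$idB, B$idB), match(A$column, colnames(vals)))

# Subtract all transaction values in one step.
vals[idx] <- vals[idx] - A$value

# Reattach the id columns.
Bout_vec <- bind_cols(B[c("idB", "group")], as_tibble(vals))
```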