
Conditional sum with output for all rows in R data.table

I have a coding issue that I think should be very easy. I have created a simplified dataset:

library(data.table)

DT <- data.table(Bank=rep(c("a","b","c"),4),
                 Type=rep(c("Ass","Liab"),6),
                 Amount=c(100,200,300,400,200,300,400,500,200,100,300,100))
# Bank Type Amount
# 1:    a  Ass    100
# 2:    b Liab    200
# 3:    c  Ass    300
# 4:    a Liab    400
# 5:    b  Ass    200
# 6:    c Liab    300
# 7:    a  Ass    400
# 8:    b Liab    500
# 9:    c  Ass    200
# 10:    a Liab    100
# 11:    b  Ass    300
# 12:    c Liab    100

I want to create a variable that is the sum of Amount per bank for the rows where Type == "Liab". This part is no problem:

DT[Type=='Liab',SumLiab:=sum(Amount),by=Bank]
# Bank Type Amount SumLiab
# 1:    a  Ass    100      NA
# 2:    b Liab    200     700
# 3:    c  Ass    300      NA
# 4:    a Liab    400     500
# 5:    b  Ass    200      NA
# 6:    c Liab    300     400
# 7:    a  Ass    400      NA
# 8:    b Liab    500     700
# 9:    c  Ass    200      NA
# 10:    a Liab    100     500
# 11:    b  Ass    300      NA
# 12:    c Liab    100     400

But I want this value on all rows, even where Type == 'Ass'. I understand that I currently get NA because of the DT[Type=='Liab',..] restriction. Is there a clever way to get the value of SumLiab on all rows? (So row 1, which is currently NA for SumLiab, would get the value 500.)

Thanks! Tim

When we use Type=='Liab' in 'i', the values are assigned only to the rows selected by 'i'. Instead, we can subset 'Amount' by Type=='Liab' inside 'j' and assign ( := ) the result to a new variable, so every row in each Bank group receives it.

 DT[, SumLiab:= sum(Amount[Type=='Liab']), by =Bank]
 DT
 #   Bank Type Amount SumLiab
 #1:    a  Ass    100     500
 #2:    b Liab    200     700
 #3:    c  Ass    300     400
 #4:    a Liab    400     500
 #5:    b  Ass    200     700
 #6:    c Liab    300     400
 #7:    a  Ass    400     500
 #8:    b Liab    500     700
 #9:    c  Ass    200     400
 #10:   a Liab    100     500
 #11:   b  Ass    300     700
 #12:   c Liab    100     400
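
As an alternative to subsetting inside 'j', the same result can probably be reached with an update join: compute the per-bank totals once, then copy them onto every row by Bank. This is only a sketch on the sample data from the question; the intermediate table name `liab` is made up for illustration.

```r
library(data.table)

DT <- data.table(Bank = rep(c("a", "b", "c"), 4),
                 Type = rep(c("Ass", "Liab"), 6),
                 Amount = c(100, 200, 300, 400, 200, 300, 400, 500, 200, 100, 300, 100))

# One row per bank holding the sum of its Liab amounts
# (`liab` is a hypothetical helper table, not part of the original answer)
liab <- DT[Type == "Liab", .(SumLiab = sum(Amount)), by = Bank]

# Update join: match on Bank and write liab's SumLiab onto every row of DT
DT[liab, on = "Bank", SumLiab := i.SumLiab]
DT
```

The join form may be handy when the totals are needed elsewhere as a separate table. One difference worth noting: a bank with no Liab rows would end up with NA here, whereas sum(Amount[Type=='Liab']) returns 0 for it.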

No, I don't think that is correct.

You can try this:

DT[, SumLiab:=sum(Amount), by = list(Bank, Type)][]

Output of the code: (image not preserved)
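
To compare the two answers, here is a small sketch that computes both variants side by side on the sample data and lists the rows where they disagree. The column name `SumByType` is invented here so that the two results can coexist; it does not appear in either answer.

```r
library(data.table)

DT <- data.table(Bank = rep(c("a", "b", "c"), 4),
                 Type = rep(c("Ass", "Liab"), 6),
                 Amount = c(100, 200, 300, 400, 200, 300, 400, 500, 200, 100, 300, 100))

# First answer: sum of the Liab amounts per bank, written to every row
DT[, SumLiab := sum(Amount[Type == "Liab"]), by = Bank]

# Second answer: sum per Bank *and* Type, so each row gets the total
# for its own Type (stored in a hypothetical column, SumByType)
DT[, SumByType := sum(Amount), by = .(Bank, Type)]

# Rows where the two approaches give different values
DT[SumLiab != SumByType]
```

On this sample the Ass rows of banks b and c differ, because grouping by Type gives them the Ass total rather than the Liab total. Bank a happens to match only because its Ass and Liab amounts both sum to 500.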
