Get sum of two columns based on conditions of other columns in a Pandas DataFrame
I have the following DataFrame:
import pandas as pd

data = {"Subject": ["1", "2", "3", "3", "4", "5", "5"],
        "date": ["2020-05-01 16:54:25", "2020-05-03 10:31:18", "2020-05-08 10:10:40", "2020-05-08 10:10:42",
                 "2020-05-06 09:30:40", "2020-05-07 12:46:30", "2020-05-07 12:55:10"],
        "Accept": ["True", "False", "True", "True", "False", "True", "True"],
        "Amount": [150, 30, 32, 32, 300, 100, 50],
        "accept_1": ["True", "False", "True", "True", "False", "True", "True"],
        "amount_1": [20, 30, 32, 32, 150, 100, 30]}
data = pd.DataFrame(data)
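I want to group the data by Subject and date and then, for each subject, compute the sum of Amount and the sum of amount_1, counting only the rows where both Accept and accept_1 are true.
Note that the True/False values here are strings, not booleans.
I tried the following code: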
def PPP(tx_amount_1, tx_accepted_1, tx_amount, tx_accepted):
    if tx_accepted_1 and tx_accepted == "True":
        return tx_amount + tx_amount_1

example = data.groupby(["Subject", "date"])[["Accept", "Amount", "accept_1", "amount_1"]].apply(
    lambda x: PPP(x["amount_1"], x["accept_1"], x["Amount"], x["Accept"]))
I get the following error:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
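For context on the error: inside apply, each argument passed to PPP is a whole Series for the group, and Python's `and` asks each operand for a single truth value, which a Series cannot give. Also, `x and y == "True"` only compares `y` against the string; `x` is tested for truthiness on its own. A minimal sketch of the difference between `and` and the element-wise `&`, using two made-up Series (`a` and `b` are illustrative names, not part of the question's data):

```python
import pandas as pd

# Illustrative string flags, similar in shape to Accept / accept_1
a = pd.Series(["True", "False", "True"])
b = pd.Series(["True", "True", "False"])

# `and` needs one bool per operand, so a Series operand raises ValueError
try:
    result = a.eq("True") and b.eq("True")
    raised = False
except ValueError:
    raised = True

# `&` combines the two comparisons element by element
both = a.eq("True") & b.eq("True")

print(raised)          # True
print(both.tolist())   # [True, False, False]
```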
IIUC, first select the rows that match the condition with boolean indexing, then perform a GroupBy.sum:
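mask = data['Accept'].eq('True') & data['accept_1'].eq('True')
# numeric_only=True keeps only Amount and amount_1; recent pandas versions
# would otherwise also try to "sum" the remaining string columns
data[mask].groupby(['Subject', pd.to_datetime(data['date']).dt.normalize()]).sum(numeric_only=True)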
Output:
                    Amount  amount_1
Subject date
1       2020-05-01     150        20
3       2020-05-08      64        64
5       2020-05-07     150       130
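Since the flags and dates are stored as strings, one option is to convert them once up front and then group on ordinary columns. This is only a sketch of that variant; the intermediate name `clean` is an assumption for illustration:

```python
import pandas as pd

data = pd.DataFrame({
    "Subject": ["1", "2", "3", "3", "4", "5", "5"],
    "date": ["2020-05-01 16:54:25", "2020-05-03 10:31:18", "2020-05-08 10:10:40",
             "2020-05-08 10:10:42", "2020-05-06 09:30:40", "2020-05-07 12:46:30",
             "2020-05-07 12:55:10"],
    "Accept": ["True", "False", "True", "True", "False", "True", "True"],
    "Amount": [150, 30, 32, 32, 300, 100, 50],
    "accept_1": ["True", "False", "True", "True", "False", "True", "True"],
    "amount_1": [20, 30, 32, 32, 150, 100, 30],
})

# Convert the string flags to real booleans and truncate timestamps to the day
clean = data.assign(
    Accept=data["Accept"].eq("True"),
    accept_1=data["accept_1"].eq("True"),
    date=pd.to_datetime(data["date"]).dt.normalize(),
)

# Keep rows where both flags are True, then sum the two amount columns per group
result = (
    clean[clean["Accept"] & clean["accept_1"]]
    .groupby(["Subject", "date"])[["Amount", "amount_1"]]
    .sum()
)
print(result)
```

Selecting the two amount columns explicitly before `.sum()` avoids depending on how a given pandas version treats non-numeric columns.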
If you want a grand total of both columns:
mask = data['Accept'].eq('True') & data['accept_1'].eq('True')
(data[mask]
 .groupby(['Subject', pd.to_datetime(data['date']).dt.normalize()])
 .sum(numeric_only=True)  # sum each numeric column per group
 .sum(axis=1)             # then add Amount and amount_1 together
 .reset_index(name='Total')
)
Output:
  Subject       date  Total
0       1 2020-05-01    170
1       3 2020-05-08    128
2       5 2020-05-07    280
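The same grand total can also be reached by adding the two amounts row by row first and then grouping. A minimal sketch, assuming the question's data; the names `day` and `totals` are illustrative:

```python
import pandas as pd

data = pd.DataFrame({
    "Subject": ["1", "2", "3", "3", "4", "5", "5"],
    "date": ["2020-05-01 16:54:25", "2020-05-03 10:31:18", "2020-05-08 10:10:40",
             "2020-05-08 10:10:42", "2020-05-06 09:30:40", "2020-05-07 12:46:30",
             "2020-05-07 12:55:10"],
    "Accept": ["True", "False", "True", "True", "False", "True", "True"],
    "Amount": [150, 30, 32, 32, 300, 100, 50],
    "accept_1": ["True", "False", "True", "True", "False", "True", "True"],
    "amount_1": [20, 30, 32, 32, 150, 100, 30],
})

mask = data["Accept"].eq("True") & data["accept_1"].eq("True")
day = pd.to_datetime(data["date"]).dt.normalize()

# Add a per-row Total, keep only the accepted rows, then sum per (Subject, day)
totals = (
    data.assign(date=day, Total=data["Amount"] + data["amount_1"])[mask]
    .groupby(["Subject", "date"], as_index=False)["Total"]
    .sum()
)
print(totals)
```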
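To keep every group in the output, with 0 for groups that do not satisfy the condition, mask the values instead of filtering the rows:
mask = data['Accept'].eq('True') & data['accept_1'].eq('True')
cols = ['Amount', 'amount_1']
(data
 .assign(**{c: data[c].where(mask, 0) for c in cols})  # zero out rows that fail the condition
 .groupby(['Subject', pd.to_datetime(data['date']).dt.normalize()])
 .sum(numeric_only=True)
 #.sum(axis=1).reset_index(name='Total') # uncomment for grand-total
)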
Output:
                    Amount  amount_1
Subject date
1       2020-05-01     150        20
2       2020-05-03       0         0
3       2020-05-08      64        64
4       2020-05-06       0         0
5       2020-05-07     150       130
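The two variants differ only in whether non-matching rows are dropped or zeroed, so they could be wrapped in one small helper. This is a sketch only: `conditional_sums` and its `keep_all_groups` flag are hypothetical names, not something from the answer or from pandas.

```python
import pandas as pd

def conditional_sums(df, keep_all_groups=False):
    """Sum Amount and amount_1 per (Subject, day) for rows where both flags are 'True'."""
    cols = ["Amount", "amount_1"]
    mask = df["Accept"].eq("True") & df["accept_1"].eq("True")
    # Work on a copy whose date column is truncated to the day
    tmp = df.assign(date=pd.to_datetime(df["date"]).dt.normalize())
    if keep_all_groups:
        # Zero out non-matching rows so their groups still appear
        tmp[cols] = tmp[cols].where(mask, 0)
    else:
        # Drop non-matching rows; groups with no match disappear
        tmp = tmp[mask]
    return tmp.groupby(["Subject", "date"])[cols].sum()

data = pd.DataFrame({
    "Subject": ["1", "2", "3", "3", "4", "5", "5"],
    "date": ["2020-05-01 16:54:25", "2020-05-03 10:31:18", "2020-05-08 10:10:40",
             "2020-05-08 10:10:42", "2020-05-06 09:30:40", "2020-05-07 12:46:30",
             "2020-05-07 12:55:10"],
    "Accept": ["True", "False", "True", "True", "False", "True", "True"],
    "Amount": [150, 30, 32, 32, 300, 100, 50],
    "accept_1": ["True", "False", "True", "True", "False", "True", "True"],
    "amount_1": [20, 30, 32, 32, 150, 100, 30],
})

filtered = conditional_sums(data)
full = conditional_sums(data, keep_all_groups=True)
print(filtered)
print(full)
```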