Sum two columns only if the values of one column are greater than 0

I've got the following dataframe:

import pandas as pd

lst=[['01012021','',100],['01012021','','50'],['01022021',140,5],['01022021',160,12],['01032021','',20],['01032021',200,25]]
df1=pd.DataFrame(lst,columns=['Date','AuM','NNA'])

I am looking for code that sums the columns AuM and NNA only if the AuM column contains a value. The expected result is shown below:

lst=[['01012021','',100,''],['01012021','','50',''],['01022021',140,5,145],['01022021',160,12,172],['01032021','',20,'']]
df2=pd.DataFrame(lst,columns=['Date','AuM','NNA','Sum'])

I assume you mean to include the last row too:

df2 = (df1.assign(Sum=df1.loc[df1.AuM.ne(""), ["AuM", "NNA"]].sum(axis=1))
          .fillna(""))
print(df2)

Result:

       Date  AuM  NNA    Sum
0  01012021       100       
1  01012021        50       
2  01022021  140    5  145.0
3  01022021  160   12  172.0
4  01032021        20       
5  01032021  200   25  225.0

It is not good practice to use '' in place of NaN when you have numeric data.

That said, a generic solution to your issue would be to use sum with the skipna=False option:

df1['Sum'] = (df1[['AuM', 'NNA']] # you can use as many columns as you want
        .apply(pd.to_numeric, errors='coerce')  # convert to numeric ('' becomes NaN)
        .sum(axis=1, skipna=False)              # row sum is NaN if any value is NaN
        .fillna('')               # fill NaN with empty string (bad practice)
       )

Output:

       Date  AuM  NNA    Sum
0  01012021       100       
1  01012021        50       
2  01022021  140    5  145.0
3  01022021  160   12  172.0
4  01032021        20       
5  01032021  200   25  225.0
