
Pandas Data Frame - Sum all the values in a previous column which match a specific condition and add it to a new column

I'm probably missing something, but I was not able to find a solution for this. Is there a way in Python to sum the values that satisfy a certain condition into a new column? In Excel I would enter the following formula in the new column and fill it down:

=SUMIF(A1:C1, ">0")

val1   val2   val3   output
 0.5    0.7   -0.9   1.2
 0.3   -0.7          0.3
-0.5   -0.7   -0.9   0

My extracts also contain a few blank values. Can you please help me understand what code should be written for this?

df['total'] = df[['A','B']].sum(axis=1).where(df['A'] > 0, 0)

I came across the code above, but it checks only one condition. What I need is the sum of all the values in those columns that match the given condition.

Thanks!

pandas handles this out of the box:

import pandas as pd
df = pd.DataFrame([[0.5,.7,-.9],[0.3,-.7,None],[-0.5,-.7,-.9]], columns=['val1','val2','val3'])

df['output'] = df[df>0].sum(axis=1)
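A minimal sketch of what that one-liner does, step by step. It assumes a hypothetical extra text column (`name`) to show why it can help to restrict the comparison to the value columns: `df > 0` is applied to every column of the frame, and a text column cannot be meaningfully compared with 0.

```python
import pandas as pd

# Hypothetical frame: the question's values plus a text column ("name")
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "val1": [0.5, 0.3, -0.5],
    "val2": [0.7, -0.7, -0.7],
    "val3": [-0.9, None, -0.9],  # None becomes NaN (the "blank" value)
})

# Compare only the numeric value columns
value_cols = ["val1", "val2", "val3"]
values = df[value_cols]

# Boolean indexing keeps values where the condition is True and
# puts NaN everywhere else; blanks compare as False, so they stay NaN
positive_only = values[values > 0]

# sum() skips NaN by default, so only the positive values are added
df["output"] = positive_only.sum(axis=1)
print(df)
```

Selecting the columns by name also keeps the result stable if the statement is run again after `output` has been added to the frame.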

Use DataFrame.clip before sum:

df['total'] = df[['val1','val2','val3']].clip(lower=0).sum(axis=1)

#solution by Nk03 from comments
cols = ['val1','val2','val3']
df['total'] = df[cols].mask(df[cols]<0).sum(axis=1)
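Both variants give the same totals here, but for slightly different reasons: `clip` replaces negative values with 0, while `mask` replaces them with NaN, which `sum` then skips. The sketch below checks that, and also illustrates a caveat: clipping only behaves like SUMIF when the threshold is 0. The threshold of 0.4 is an assumption chosen purely for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.5, 0.7, -0.9], [0.3, -0.7, None], [-0.5, -0.7, -0.9]],
    columns=["val1", "val2", "val3"],
)
cols = ["val1", "val2", "val3"]

# Threshold 0: clip and mask agree
by_clip = df[cols].clip(lower=0).sum(axis=1)
by_mask = df[cols].mask(df[cols] < 0).sum(axis=1)

# Hypothetical non-zero threshold, like SUMIF(A1:C1, ">0.4")
threshold = 0.4

# where() keeps matching values and drops the rest before summing
by_where = df[cols].where(df[cols] > threshold).sum(axis=1)

# clip() would instead add the threshold itself for every smaller value
by_clip_threshold = df[cols].clip(lower=threshold).sum(axis=1)

print(pd.DataFrame({
    "clip_0": by_clip,
    "mask_0": by_mask,
    "where_0.4": by_where,
    "clip_0.4": by_clip_threshold,
}))
```

For any condition other than "greater than 0", the `where`/`mask` form is the safer choice.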

EDIT: To apply a mask built from another set of columns, convert those columns to a NumPy array:

df['total'] = df.loc[:, "D":"F"].mask(df.loc[:, "A":"C"].to_numpy() == 'Y', 0).sum(axis=1)
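A small sketch of that line on made-up data, assuming flag columns `A` to `C` holding 'Y'/'N' and value columns `D` to `F` (the data is hypothetical; only the column names come from the answer). Converting the flags with `to_numpy()` makes the mask positional, so the flag in `A` applies to the value in `D`, `B` to `E`, and `C` to `F`, regardless of the differing column labels.

```python
import pandas as pd

# Hypothetical data: flags in A..C, values in D..F
df = pd.DataFrame({
    "A": ["Y", "N"],
    "B": ["N", "N"],
    "C": ["Y", "Y"],
    "D": [1.0, 2.0],
    "E": [3.0, 4.0],
    "F": [5.0, 6.0],
})

# Plain boolean array, matched to D..F by position rather than by label
is_flagged = df.loc[:, "A":"C"].to_numpy() == "Y"

# Values whose flag is 'Y' are replaced by 0, the rest are summed per row
df["total"] = df.loc[:, "D":"F"].mask(is_flagged, 0).sum(axis=1)
print(df)
```

Note that `mask` zeroes out the values whose flag is 'Y'; to sum only the flagged values instead, `where(is_flagged, 0)` would be the counterpart.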

Another way, somewhat similar to SUMIF:

# this is the "IF"
is_positive = df.loc[:, "val1": "val3"] > 0

# this is selecting the parts where condition holds & sums
df["output"] = df.loc[:, "val1": "val3"][is_positive].sum(axis=1)

where axis=1 in the last line sums along rows, to get

>>> df

   val1  val2  val3  output
0   0.5   0.7  -0.9     1.2
1   0.3  -0.7   NaN     0.3
2  -0.5  -0.7  -0.9     0.0
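The 0.0 in the last row appears because no value in that row passes the condition, and pandas sums an all-NaN row to 0 by default, which happens to match what Excel's SUMIF returns. As a sketch, assuming you would rather see a missing value when nothing matched, the `min_count` parameter of `sum` can be used:

```python
import pandas as pd

df = pd.DataFrame(
    [[0.5, 0.7, -0.9], [0.3, -0.7, None], [-0.5, -0.7, -0.9]],
    columns=["val1", "val2", "val3"],
)

# Keep positive values, everything else becomes NaN
positives = df.where(df > 0)

# Default: NaN is skipped and an all-NaN row sums to 0.0
default_sum = positives.sum(axis=1)

# min_count=1: a row needs at least one non-NaN value, otherwise NaN
strict_sum = positives.sum(axis=1, min_count=1)

print(pd.DataFrame({"default": default_sum, "min_count_1": strict_sum}))
```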

You can do it in the following way:

df["total"] = df.apply(lambda x: sum(x), axis=1).where((df['A'] > 0) & (df['B'] > 0) & (another_condition) & (another_condition), 0)

Note that this code sums across all columns at once.
To sum specific columns only, you can do the following:

df['total'] = df[['A','B','C','D','E']].sum(axis=1).where((df['A'] > 0) & (df['B'] > 0) & (another_condition) & (another_condition), 0)
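A runnable sketch of this pattern on hypothetical data, with the `another_condition` placeholders left out. It also shows how the result differs from a per-value SUMIF: here the conditions decide whether a whole row is summed, and a row that passes contributes all of its values, negative ones included.

```python
import pandas as pd

# Hypothetical data; column names follow the answer above
df = pd.DataFrame({
    "A": [1.0, -1.0, 2.0],
    "B": [2.0, 3.0, -4.0],
    "C": [-5.0, 1.0, 1.0],
})
cols = ["A", "B", "C"]

# Row-level condition: both A and B must be positive
row_ok = (df["A"] > 0) & (df["B"] > 0)

# Sum the whole row when the condition holds, otherwise use 0
df["total"] = df[cols].sum(axis=1).where(row_ok, 0)

# For comparison: per-value condition, as SUMIF(A1:C1, ">0") would do
df["sumif"] = df[cols].where(df[cols] > 0).sum(axis=1)
print(df)
```

Only the first row satisfies both conditions, so `total` is -2.0 there and 0 elsewhere, while `sumif` adds just the positive values of every row.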
