Pandas DataFrame - sum all values in the preceding columns that match a condition and add the result to a new column

Question

I'm probably missing something, but I was not able to find a solution for this. Is there a way in Python to add a new column that holds the sum of the values satisfying a certain condition? In Excel I would enter the following formula in the new column and fill it down:

=SUMIF(A1:C1, ">0")

val1   val2   val3   output
 0.5    0.7   -0.9   1.2
 0.3   -0.7          0.3
-0.5   -0.7   -0.9   0

My extracts also contain a few blank values. Can you please help me understand what code I should write for this?

df['total'] = df[['A','B']].sum(axis=1).where(df['A'] > 0, 0)

I came across the above code, but it only checks a single condition on one column. What I need is the sum of all values in those columns that match the given condition.

Thanks!

Answer 1

pandas can handle that out of the box, like this:

import pandas as pd
df = pd.DataFrame([[0.5,.7,-.9],[0.3,-.7,None],[-0.5,-.7,-.9]], columns=['val1','val2','val3'])

df['output'] = df[df>0].sum(axis=1)
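
As a minimal sketch, the same idea can be restricted to the three value columns, so that re-running the line does not also pick up the `output` column. The `cols` list is an assumption added here for illustration; it is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.5, 0.7, -0.9], [0.3, -0.7, None], [-0.5, -0.7, -0.9]],
    columns=["val1", "val2", "val3"],
)

cols = ["val1", "val2", "val3"]  # assumed: the columns to sum over

# Boolean indexing turns every value that fails the condition into NaN.
# sum(axis=1) skips NaN by default, so negatives and blanks contribute nothing.
df["output"] = df[cols][df[cols] > 0].sum(axis=1)
```

A row in which no value passes the condition sums to 0.0 rather than NaN, which matches what SUMIF returns in Excel.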

Answer 2

Use DataFrame.clip before sum:

df['total'] = df[['val1','val2','val3']].clip(lower=0).sum(axis=1)

#solution by Nk03 from comments
cols = ['val1','val2','val3']
df['total'] = df[cols].mask(df[cols]<0).sum(axis=1)
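
The two variants above can be compared on the sample data from the question. This is only a sketch: the names `total_clip` and `total_mask` are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.5, 0.7, -0.9], [0.3, -0.7, None], [-0.5, -0.7, -0.9]],
    columns=["val1", "val2", "val3"],
)
cols = ["val1", "val2", "val3"]

# clip(lower=0) raises every negative value to 0, so it adds nothing to the sum.
total_clip = df[cols].clip(lower=0).sum(axis=1)

# mask(...) replaces every value where the condition is True with NaN,
# and sum(axis=1) skips NaN by default.
total_mask = df[cols].mask(df[cols] < 0).sum(axis=1)
```

Both give the same row sums here. The clip form only works because the threshold is zero; for a condition such as "greater than 0.5", the mask form is the one that generalizes.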

EDIT: To build the mask from a different set of columns, convert that condition to a NumPy array first:

df['total'] = df.loc[:, "D":"F"].mask(df.loc[:, "A":"C"].to_numpy() == 'Y', 0).sum(axis=1)
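
A small sketch of that line on made-up data: the flag columns `A` to `C`, the value columns `D` to `F`, and all the values below are hypothetical and only serve to show the mechanics. The conversion to NumPy matters because pandas would otherwise align the condition on column labels, and `A`..`C` do not match `D`..`F`; an array is applied by position instead.

```python
import pandas as pd

# Hypothetical data: A-C hold 'Y'/'N' flags, D-F hold the numbers to sum.
df = pd.DataFrame({
    "A": ["Y", "N", "N"],
    "B": ["N", "N", "Y"],
    "C": ["N", "Y", "Y"],
    "D": [1.0, 2.0, 3.0],
    "E": [10.0, 20.0, 30.0],
    "F": [100.0, 200.0, 300.0],
})

# Positional boolean array: flags[i, j] says whether to zero out
# the j-th value column in row i.
flags = df.loc[:, "A":"C"].to_numpy() == "Y"

# Values whose flag is 'Y' are replaced by 0 before summing each row.
df["total"] = df.loc[:, "D":"F"].mask(flags, 0).sum(axis=1)
```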

Answer 3

Another way, somewhat similar to SUMIF:

# this is the "IF"
is_positive = df.loc[:, "val1": "val3"] > 0

# this is selecting the parts where condition holds & sums
df["output"] = df.loc[:, "val1": "val3"][is_positive].sum(axis=1)

where axis=1 in the last line sums along the rows, which gives:

>>> df

   val1  val2  val3  output
0   0.5   0.7  -0.9     1.2
1   0.3  -0.7   NaN     0.3
2  -0.5  -0.7  -0.9     0.0
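
If the same pattern is needed for several conditions, it could be wrapped in a small helper. The function `sum_if` below is a hypothetical name and not part of pandas; this is a sketch assuming the condition can be written as a function of the selected columns.

```python
import pandas as pd


def sum_if(frame, cols, condition):
    """Row-wise sum of the values in `cols` for which `condition` holds.

    `condition` receives the selected sub-frame and must return a boolean
    frame of the same shape (hypothetical helper, for illustration only).
    """
    values = frame[cols]
    # where(...) keeps values where the condition is True and puts NaN
    # elsewhere; sum(axis=1) then skips those NaN entries.
    return values.where(condition(values)).sum(axis=1)


df = pd.DataFrame(
    [[0.5, 0.7, -0.9], [0.3, -0.7, None], [-0.5, -0.7, -0.9]],
    columns=["val1", "val2", "val3"],
)
cols = ["val1", "val2", "val3"]

df["output"] = sum_if(df, cols, lambda v: v > 0)        # like SUMIF(..., ">0")
df["below"] = sum_if(df, cols, lambda v: v < -0.6)      # like SUMIF(..., "<-0.6")
```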

Answer 4

You can do it in the following way:

df["total"] = df.apply(lambda x: sum(x), axis=1).where((df['A'] > 0) & (df['B'] > 0) & (another_condition) & (another_condition), 0)

Note that this code sums across all columns at once.
To sum only specific columns, you can do the following:

df['total'] = df[['A','B','C','D','E']].sum(axis=1).where((df['A'] > 0) & (df['B'] > 0) & (another_condition) & (another_condition), 0)
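
It may help to see how this pattern behaves on the sample data from the question. In the sketch below the placeholder conditions are filled in with `val1 > 0` and `val2 > 0`, which is an assumption made for illustration. The row sum is kept whole when every condition holds and replaced by 0 otherwise, so it is an all-or-nothing sum per row rather than a per-value SUMIF.

```python
import pandas as pd

df = pd.DataFrame(
    [[0.5, 0.7, -0.9], [0.3, -0.7, None], [-0.5, -0.7, -0.9]],
    columns=["val1", "val2", "val3"],
)
cols = ["val1", "val2", "val3"]

# Assumed conditions for illustration: both val1 and val2 must be positive.
row_ok = (df["val1"] > 0) & (df["val2"] > 0)

# All-or-nothing: the full row sum (negatives included) where row_ok is True,
# 0 everywhere else.
df["total"] = df[cols].sum(axis=1).where(row_ok, 0)

# Per-value condition, for comparison with the SUMIF result in the question.
df["sumif"] = df[cols].where(df[cols] > 0).sum(axis=1)
```

On this data `total` keeps the -0.9 in the first row and zeroes the second row entirely, while `sumif` reproduces the `output` column from the question.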

4 answers

Answer 1: score 3, 2021-06-23 12:08:55
Answer 2: score 2, 2021-06-23 11:59:41
Answer 3: score 2, accepted, 2021-06-23 12:05:56
Answer 4: score 1, 2021-06-23 12:15:29
