How to create a new column in a Pandas DataFrame based on a combination of one and many columns

Question

I have a data set that looks like this:

   Cond  Column_A  Column_B  Column_C  Cumulative_Count
0     1     -0.60     -0.12     -0.17                 1
1     0      0.30      0.70      0.98                 0
2     1     -0.45     -0.71     -0.99                 2
3     1      0.60      0.12      0.17                 1
4     0      0.20      0.80      0.60                 0
5     1      0.70      0.14      0.20                 1

I would like to create a column Cumulative_Count that counts occurrences of an event based on multiple conditions such as:

1) If Cond=1 and (Column_A<-0.5 or Column_A>0.5) then Cumulative_Count=Cumulative_Count+1

2) If Cond=1 and (Column_B<-0.5 or Column_B>0.5) then Cumulative_Count=Cumulative_Count+1

3) If Cond=1 and (Column_C<-0.5 or Column_C>0.5) then Cumulative_Count=Cumulative_Count+1

I would like to use NumPy arrays for this because my dataset is very large. I tried the code below; it does not throw an error, but the result is not correct. If possible, it should work across all columns, because I have 50+ columns.

df['Cum_Count']=0
df['Cum_Count']=np.where((df['Cond']>0 & ((df['Column_A']<-0.5) | (df['Column_A']>0.5))), df['Cum_Count']+1, df['Cum_Count'])

Answer 1

Do it with filter and a boolean mask:

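cond1 = df.filter(like='Column')   # every column whose name contains 'Column'
cond2 = df['Cond']

# Count values outside [-0.5, 0.5] per row, then zero the rows where Cond is not 1
df['count'] = (cond1.gt(0.5) | cond1.lt(-0.5)).sum(axis=1).where(cond2.eq(1), 0)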
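
Since the question asks for NumPy arrays, here is a minimal sketch of the same idea on the underlying arrays, using the sample data from the question. The helper name count_outside_band and its threshold and prefix parameters are made up for illustration. It assumes the value columns all share the prefix "Column" and that Cond only holds 0 or 1.

```python
import numpy as np
import pandas as pd

# Sample data from the question (without the expected-output column)
df = pd.DataFrame({
    "Cond":     [1, 0, 1, 1, 0, 1],
    "Column_A": [-0.60, 0.30, -0.45, 0.60, 0.20, 0.70],
    "Column_B": [-0.12, 0.70, -0.71, 0.12, 0.80, 0.14],
    "Column_C": [-0.17, 0.98, -0.99, 0.17, 0.60, 0.20],
})

def count_outside_band(frame, threshold=0.5, prefix="Column"):
    """Hypothetical helper: per-row count of values with |x| > threshold,
    counted only on rows where Cond == 1."""
    cols = [c for c in frame.columns if c.startswith(prefix)]
    values = frame[cols].to_numpy()
    outside = np.abs(values) > threshold      # 2-D boolean array, one column per value column
    cond = frame["Cond"].to_numpy() == 1      # 1-D boolean array, one entry per row
    # Broadcast cond across the columns, then count the True entries in each row
    return (outside & cond[:, None]).sum(axis=1)

df["Cum_Count"] = count_outside_band(df)
print(df["Cum_Count"].tolist())  # [1, 0, 2, 1, 0, 1]
```

As for why the attempt in the question gives wrong results without raising: & binds more tightly than >, so df['Cond']>0 & (...) is parsed as df['Cond'] > (0 & (...)). Writing the comparison as (df['Cond']>0) avoids that. The attempt also tests only Column_A, so it could never count more than one hit per row.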

Solution 1
1 accepted 2019-01-05 02:45:06