Pandas: create a new column using the condition df[variable].isnull over a list of variables

Question

I want to create a mask column that is 1 when every column in a given set contains data and 0 when any column in that set is blank.

 A  B   C   D   E   mask1
 0  13  2   45  96  1
 1  14  2   45  96  1
 2  15  9   1.0 NaN 1
 3  16  9   1.0 NaN 1
 4  17  5   0.0 NaN 1
 5  18  6   1.0 967 1
 6  19  6   1.0 976 1
 7  20  9   1.0 294 1
 8  21  5   0.0 372 1
 9  13  5   NaN 170 0
10  62  5   NaN 100 0
11  22  20  NaN 170 0
12  13  NaN 0.0 996 0

I managed to do it with the following code:

df2["mask1"] = np.where((df2['C'].isnull() | df2['D'].isnull()) , 0, 1)

Now I want to automate this for a larger dataframe with more variables, i.e., I want to specify which variables are used for the mask. I was thinking of creating a list of variables such as

var = ['C', 'D', 'E']

which I could use to perform this operation, but I am not sure how to apply the code I came up with to this list. Should I use a for loop?

Answer 1

Select the columns, then apply isnull or notnull:

cols = ['C', 'D', 'E']
df['mask1'] = df[cols].notnull().all(axis=1).astype(int)

    A   B   C       D       E   mask1
0   0   13  2.0     45.0    96.0    1
1   1   14  2.0     45.0    96.0    1
2   2   15  9.0     1.0     NaN     0
3   3   16  9.0     1.0     NaN     0
4   4   17  5.0     0.0     NaN     0
5   5   18  6.0     1.0     967.0   1
6   6   19  6.0     1.0     976.0   1
7   7   20  9.0     1.0     294.0   1
8   8   21  5.0     0.0     372.0   1
9   9   13  5.0     NaN     170.0   0
10  10  62  5.0     NaN     100.0   0
11  11  22  20.0    NaN     170.0   0
12  12  13  NaN     0.0     996.0   0
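
For reference, here is a self-contained sketch that rebuilds the sample data from the question and wraps the one-liner above in a small function, so the column list can be swapped freely and no for loop is needed. The helper name add_mask and its parameters are hypothetical, introduced only for illustration. The last lines show what is assumed to be the closest list-driven equivalent of the np.where code from the question.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (columns A to E).
df = pd.DataFrame({
    "A": range(13),
    "B": [13, 14, 15, 16, 17, 18, 19, 20, 21, 13, 62, 22, 13],
    "C": [2, 2, 9, 9, 5, 6, 6, 9, 5, 5, 5, 20, np.nan],
    "D": [45, 45, 1, 1, 0, 1, 1, 1, 0, np.nan, np.nan, np.nan, 0],
    "E": [96, 96, np.nan, np.nan, np.nan, 967, 976, 294, 372, 170, 100, 170, 996],
})


def add_mask(frame, cols, name="mask1"):
    """Hypothetical helper: 1 where every column in `cols` has data, else 0."""
    out = frame.copy()  # leave the caller's frame untouched
    out[name] = out[cols].notnull().all(axis=1).astype(int)
    return out


# Mask over C, D and E, as in the answer.
masked_cde = add_mask(df, ["C", "D", "E"])

# Mask over C and D only, matching the table in the question.
masked_cd = add_mask(df, ["C", "D"])

# List-driven form of the question's np.where expression:
# 0 if any listed column is null in a row, otherwise 1.
np_mask = np.where(df[["C", "D"]].isnull().any(axis=1), 0, 1)

print(masked_cde)
```

With ['C', 'D'] the mask matches the table in the question, where rows 2 to 4 are 1. With ['C', 'D', 'E'] those rows become 0 because E is NaN there, which is the output shown in this answer.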
