Check if a column is in a list; if not, drop it and add its value to a new column

Question

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({
        "A": [0, 0, 1, 1, 0, 1],
        "B": [1, 0, 0, 1, 1, 0],
        "C": [0, 0, 0, 1, 1, 0],
        "D": [1, 1, 0, 0, 0, 1]})

which looks like this:

    A   B   C   D
0   0   1   0   1
1   0   0   0   1
2   1   0   0   0
3   1   1   1   0
4   0   1   1   0
5   1   0   0   1

I have a list of columns I wish to keep: allowed_columns = ["A","B"]. This means we get rid of C and D. However, when dropping the columns, if a row has the value 1 in any of them, I want to record that in a new column lost. This is what I'm trying to achieve:

    A   B   lost    
0   0   1   1   
1   0   0   1   
2   1   0   0   
3   1   1   1   
4   0   1   1   
5   1   0   1

To simplify the problem, we can assume that C and D cannot both have the value 1 in the same row. How can I achieve this?

Answer 1

Subset to the allowed columns, then take the row-wise max of everything you removed, which you can select with df.columns.difference:

df = (df[allowed_columns]
       .assign(lost=df[df.columns.difference(allowed_columns)].max(axis=1)))
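
For readers who want to run this end to end, here is a minimal self-contained sketch of the same idea using the data from the question. The names dropped and result are only illustrative and do not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 0, 1, 1, 0, 1],
    "B": [1, 0, 0, 1, 1, 0],
    "C": [0, 0, 0, 1, 1, 0],
    "D": [1, 1, 0, 0, 0, 1],
})
allowed_columns = ["A", "B"]

# Columns that are present in df but not in the allow-list.
dropped = df.columns.difference(allowed_columns)

# Keep the allowed columns; 'lost' is 1 wherever any dropped column held a 1
# (this relies on the values being only 0 and 1, as in the question).
result = df[allowed_columns].assign(lost=df[dropped].max(axis=1))
print(result)
```

Because the right-hand side is evaluated against the original df before the reassignment, the dropped columns are still available when lost is computed.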

Answer 2

Let us do:

df['lost'] = df[['C','D']].max(axis=1)   # 1 if either C or D is 1
df = df.drop(['C','D'], axis=1)

Answer 3

`groupby`

d = dict.fromkeys({*df} - {*allowed_columns}, 'lost')
df.groupby(lambda x: d.get(x, x), axis=1).max()

   A  B  lost
0  0  1     1
1  0  0     1
2  1  0     0
3  1  1     1
4  0  1     1
5  1  0     1
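
As far as I know, passing axis=1 to groupby is deprecated in newer pandas releases (2.1 and later). A possible workaround, sketched here as an assumption rather than as part of the original answer, is to transpose, group on the index, and transpose back. The name result is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 0, 1, 1, 0, 1],
    "B": [1, 0, 0, 1, 1, 0],
    "C": [0, 0, 0, 1, 1, 0],
    "D": [1, 1, 0, 0, 0, 1],
})
allowed_columns = ["A", "B"]

# Map every column that is not allowed to the single group label 'lost'.
d = dict.fromkeys(set(df.columns) - set(allowed_columns), "lost")

# After transposing, column names become the index, so groupby can apply
# the mapping function to them without axis=1. Allowed columns map to
# themselves; the rest collapse into 'lost' via max.
result = df.T.groupby(lambda name: d.get(name, name)).max().T
print(result)
```

Transposing works cleanly here because every column has the same integer dtype; with mixed dtypes the transposed frame would fall back to object dtype.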

Answer 4

You could use any:

c = df.columns.difference(allowed_columns)
# astype is used instead of Series.view, which newer pandas versions deprecate
df['lost'] = df[c].any(axis=1).astype('i1')

print(df)

   A  B  C  D  lost
0  0  1  0  1     1
1  0  0  0  1     1
2  1  0  0  0     0
3  1  1  1  0     1
4  0  1  1  0     1
5  1  0  0  1     1

Answer 5

df['lost']=((df['C']==1)|(df['D']==1)).astype(int)
df.drop(['C','D'],axis=1,inplace=True)

You can combine two boolean conditions with OR to define the values in df['lost']. I think this is also intuitive, because:

(df['C']==1)|(df['D']==1) is True if either column C or column D holds a 1, and False otherwise.
astype(int) converts True to 1 and False to 0.
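
The same OR logic can be extended to any number of dropped columns without naming each one. The following is a rough sketch, assuming allowed_columns from the question; the variable dropped is a hypothetical name introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 0, 1, 1, 0, 1],
    "B": [1, 0, 0, 1, 1, 0],
    "C": [0, 0, 0, 1, 1, 0],
    "D": [1, 1, 0, 0, 0, 1],
})
allowed_columns = ["A", "B"]

# Every column that is not in the allow-list will be dropped.
dropped = [col for col in df.columns if col not in allowed_columns]

# eq(1) gives a boolean frame; any(axis=1) is the row-wise OR across it;
# astype(int) turns True/False into 1/0.
df["lost"] = df[dropped].eq(1).any(axis=1).astype(int)
df = df.drop(columns=dropped)
print(df)
```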
