Check if column is in a list, remove if not and add value to a new column

I have a DataFrame like this:

import pandas as pd

df = pd.DataFrame({
        "A": [0, 0, 1, 1, 0, 1],
        "B": [1, 0, 0, 1, 1, 0],
        "C": [0, 0, 0, 1, 1, 0],
        "D": [1, 1, 0, 0, 0, 1]})

which looks like this:

    A   B   C   D
0   0   1   0   1
1   0   0   0   1
2   1   0   0   0
3   1   1   1   0
4   0   1   1   0
5   1   0   0   1

I have a list of columns I wish to keep: allowed_columns = ["A","B"]. This means we get rid of C and D. However, when dropping the columns, if a row has the value 1 in any of them, I want to record that in a new column, lost. This is what I'm trying to achieve:

    A   B   lost    
0   0   1   1   
1   0   0   1   
2   1   0   0   
3   1   1   1   
4   0   1   1   
5   1   0   1   

To simplify the problem, we can assume that C and D cannot both have the value 1 in the same row. How can I achieve this?

Subset to the allowed columns, then take the row-wise max of everything you removed, which you can select with df.columns.difference:

df = (df[allowed_columns]
       .assign(lost=df[df.columns.difference(allowed_columns)].max(axis=1)))
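For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea, assuming the df and allowed_columns from the question. The names dropped and result are only introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 0, 1, 1, 0, 1],
    "B": [1, 0, 0, 1, 1, 0],
    "C": [0, 0, 0, 1, 1, 0],
    "D": [1, 1, 0, 0, 0, 1],
})
allowed_columns = ["A", "B"]

# Labels of the columns that are about to be dropped (here C and D).
dropped = df.columns.difference(allowed_columns)

# For 0/1 data, the row-wise max over the dropped columns is 1
# exactly when at least one of them holds a 1 in that row.
result = df[allowed_columns].assign(lost=df[dropped].max(axis=1))
print(result)
```

Because max is used, this should also behave sensibly if C and D were both 1 in the same row, so the simplifying assumption in the question is not strictly needed.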

Let us do:

df['lost'] = df[['C','D']].max(axis=1)
df = df.drop(['C','D'], axis=1)

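Using groupby:

d = dict.fromkeys({*df} - {*allowed_columns}, 'lost')
# Disallowed column labels map to 'lost'; allowed ones map to themselves.
# Note: axis=1 in groupby is deprecated as of pandas 2.1; df.T.groupby(...).max().T is the equivalent there.
df.groupby(lambda x: d.get(x, x), axis=1).max()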

   A  B  lost
0  0  1     1
1  0  0     1
2  1  0     0
3  1  1     1
4  0  1     1
5  1  0     1
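Since grouping along columns with axis=1 is deprecated in newer pandas, here is a rough sketch of the same grouping done on the transposed frame. It assumes the df and allowed_columns from the question; label_map and result are names made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 0, 1, 1, 0, 1],
    "B": [1, 0, 0, 1, 1, 0],
    "C": [0, 0, 0, 1, 1, 0],
    "D": [1, 1, 0, 0, 0, 1],
})
allowed_columns = ["A", "B"]

# Map every disallowed column label to the shared label 'lost'.
label_map = dict.fromkeys(set(df.columns) - set(allowed_columns), "lost")

# After transposing, the column labels become the index, so an ordinary
# groupby on the index merges C and D into a single 'lost' group.
# Allowed labels are not in label_map and therefore map to themselves.
result = df.T.groupby(lambda col: label_map.get(col, col)).max().T
print(result)
```

The group keys are sorted, so the columns come out as A, B, lost for this data.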

You could use any:

c = df.columns.difference(allowed_columns)
# astype('i1') converts the boolean result to 0/1 (Series.view is deprecated in recent pandas)
df['lost'] = df[c].any(axis=1).astype('i1')

print(df)

   A  B  C  D  lost
0  0  1  0  1     1
1  0  0  0  1     1
2  1  0  0  0     0
3  1  1  1  0     1
4  0  1  1  0     1
5  1  0  0  1     1

df['lost'] = ((df['C']==1) | (df['D']==1)).astype(int)
df.drop(['C','D'], axis=1, inplace=True)

You can combine two boolean conditions with OR (|) to define the values in df['lost']. I think it is also intuitive, because:

  • (df['C']==1)|(df['D']==1) is True if there is a 1 in either column C or column D; otherwise it is False

  • astype(int) converts True to 1 and False to 0
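If there are more than two columns to drop, writing out each OR by hand gets tedious. A possible generalization, sketched under the assumption that df and allowed_columns are as in the question (dropped is a name introduced only for this example):

```python
import pandas as pd

df = pd.DataFrame({
    "A": [0, 0, 1, 1, 0, 1],
    "B": [1, 0, 0, 1, 1, 0],
    "C": [0, 0, 0, 1, 1, 0],
    "D": [1, 1, 0, 0, 0, 1],
})
allowed_columns = ["A", "B"]

# Every column that is not in the allowed list.
dropped = [col for col in df.columns if col not in allowed_columns]

# eq(1) gives a boolean frame, any(axis=1) is the row-wise OR across it,
# and astype(int) turns True/False into 1/0.
df["lost"] = df[dropped].eq(1).any(axis=1).astype(int)
df = df.drop(columns=dropped)
print(df)
```

This is the same logic as the two-column version, just without naming C and D explicitly.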


