
How to delete columns in a dataframe based on conditions of another dataframe, Python3

I have 2 pandas dataframes.

>>> df1
        col1   col2
sec1      11     22
sec2      11     22
sec3      11     22


>>> df2
           col1     col2
sec1      False     False
sec2      False     True
sec3      False     False

If a column in df2 contains at least one True, the column with the same header in df1 should be removed.

So in this example, the expected output is:

>>> df1
        col1
sec1      11
sec2      11
sec3      11

Here is my Python code

import pandas as pd


df1 = pd.DataFrame({'col1': [11, 11, 11], 'col2': [22, 22, 22]}, index=['sec1', 'sec2', 'sec3'])
df2 = pd.DataFrame({'col1': [False, False, False], 'col2': [False, True, False]}, index=['sec1', 'sec2', 'sec3'])

ds_remove = df2.any(axis=0)

df1.drop(ds_remove, inplace=True)  # This line does not work.

Could you please help me?

Thanks

Let us try mask: wherever df2 is True, the value in df1 is replaced with NaN, so afterwards we only need dropna to remove the columns that contain a NaN.

out = df1.mask(df2).dropna(axis=1)
out
Out[658]: 
      col1
sec1    11
sec2    11
sec3    11
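
One thing to keep in mind with the mask/dropna approach: dropna cannot tell a NaN introduced by mask from a NaN that was already in df1. The sketch below rebuilds the frames from the question and then adds a hypothetical variant, df1_nan (not part of the original question), that already holds a missing value in col1.

```python
import numpy as np
import pandas as pd

idx = ["sec1", "sec2", "sec3"]
df1 = pd.DataFrame({"col1": [11, 11, 11], "col2": [22, 22, 22]}, index=idx)
df2 = pd.DataFrame({"col1": [False, False, False],
                    "col2": [False, True, False]}, index=idx)

# Cells where df2 is True become NaN, then every column holding a NaN is dropped.
out = df1.mask(df2).dropna(axis=1)
print(list(out.columns))

# Hypothetical variant: col1 already has a missing value before masking.
df1_nan = df1.astype(float)
df1_nan.loc["sec1", "col1"] = np.nan

# col1 is dropped as well, although df2 never flagged it.
out_nan = df1_nan.mask(df2).dropna(axis=1)
print(list(out_nan.columns))
```

If df1 may contain missing values of its own, selecting columns from df2.any() directly avoids this side effect.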

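To fix your code, select the columns with the inverted boolean Series:

df1 = df1.loc[:, ~ds_remove]

Or with drop, passing the labels of the flagged columns and axis=1:

df1.drop(ds_remove[ds_remove].index, inplace=True, axis=1)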
df1
Out[666]: 
      col1
sec1    11
sec2    11
sec3    11
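
The reason the original drop call fails is that drop expects labels, while ds_remove is a boolean Series whose values are True/False and whose index holds the column names. A minimal sketch of the conversion, using the frames from the question (to_drop, kept_by_mask and kept_by_drop are illustrative names):

```python
import pandas as pd

idx = ["sec1", "sec2", "sec3"]
df1 = pd.DataFrame({"col1": [11, 11, 11], "col2": [22, 22, 22]}, index=idx)
df2 = pd.DataFrame({"col1": [False, False, False],
                    "col2": [False, True, False]}, index=idx)

# Boolean Series indexed by column name: col1 -> False, col2 -> True.
ds_remove = df2.any(axis=0)

# Filtering the Series by itself keeps only the True entries;
# its index is then the list of column labels to remove.
to_drop = ds_remove[ds_remove].index

# Two equivalent ways to get the result, neither modifies df1 in place.
kept_by_mask = df1.loc[:, ~ds_remove]
kept_by_drop = df1.drop(columns=to_drop)

print(list(to_drop))
print(kept_by_mask.equals(kept_by_drop))
```

Here drop(columns=...) is just another spelling of drop(..., axis=1).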

Here is another way:

df1.loc[:,~df2.any()]
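
This one-liner relies on df2 having the same column labels as df1, because a boolean Series used with loc is aligned to the columns by label. If df1 might have columns that df2 lacks, one possible way to handle it is to reindex the flags first. In the sketch below, col3 and the name flags are assumptions added for illustration; they are not part of the question.

```python
import pandas as pd

idx = ["sec1", "sec2", "sec3"]
# Hypothetical df1 with an extra column that df2 knows nothing about.
df1 = pd.DataFrame({"col1": [11, 11, 11],
                    "col2": [22, 22, 22],
                    "col3": [33, 33, 33]}, index=idx)
df2 = pd.DataFrame({"col1": [False, False, False],
                    "col2": [False, True, False]}, index=idx)

# Align the flags to df1's columns; columns missing from df2 count as "keep".
flags = df2.any().reindex(df1.columns, fill_value=False).astype(bool)

result = df1.loc[:, ~flags]
print(list(result.columns))
```

Treating unknown columns as "keep" is a design choice; use fill_value=True instead if columns without a verdict in df2 should be removed.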

You are very close :)

I've stored the True/False Series as conds, so I can filter it down to the True entries and build a list of the columns where that happens. Then, in the drop call, you need to add axis=1 to drop columns (the default is axis=0, which drops rows). Also, don't combine inplace=True with an assignment: with inplace=True, drop returns None, so df1 would end up as None.

conds = df2.any(axis=0)
ds_remove = conds.loc[conds].index.to_list()
df1 = df1.drop(ds_remove, axis=1)

