Dropping column from one dataframe based on column value of second dataframe in pandas

I have two dataframes, df1 and df2, each with 8 columns, as shown below:

**df1**
╔══════════════════════════════════════════════════════════╗
║John ║ Mark ║ Jane ║ Natasha ║ Oliver ║ Tony ║ Judd ║ Ron ║
╚══════════════════════════════════════════════════════════╝


**df2**
╔══════════════════════════════════════════════════╗
║True ║True ║False ║True ║False ║False ║False ║True║
╚══════════════════════════════════════════════════╝

The columns of df1 are names of different people, while the column names of df2 are boolean values. I want to drop every column in df1 whose corresponding (same-position) value in df2 is False. The resulting output should look like this:

**output**
╔════════════════════════════╗
║John ║ Mark ║ Natasha ║ Ron ║
╚════════════════════════════╝

I am reading both the dataframes from csv files.

Any and all help would be appreciated.

Note: The actual dataframes have 500 columns each. I used 8 here to keep the example readable and to show that both dataframes have the same number of columns.

Thanks in advance

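You can do this with basic boolean indexing. However, when pandas parses df2, the duplicate column names are altered (TRUE becomes TRUE.1, TRUE.2, and so on), so the names need a bit of cleaning, or a substring match, before they can be used as a mask.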
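To illustrate the renaming, here is a small sketch that parses a header row of repeated flags. The inline CSV text is a hypothetical stand-in for the real df2 file, and the uppercase spelling is an assumption about how the file was written.

```python
import io
import pandas as pd

# Hypothetical stand-in for the df2 file: a header row of repeated flags.
header = "TRUE,TRUE,FALSE,TRUE,FALSE,FALSE,FALSE,TRUE\n"

parsed = pd.read_csv(io.StringIO(header))

# pandas makes duplicate column names unique by appending .1, .2, ...
print(list(parsed.columns))
```

Because each name still contains the original flag as a substring, a substring test is enough to recover the mask.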

Setup

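import pandas as pd

names = ['John', 'Mark', 'Jane', 'Natasha', 'Oliver', 'Tony', 'Judd', 'Ron']
# Column names as pandas reads them from df2: duplicates get a .1, .2, ... suffix
cols = ['TRUE', 'TRUE.1', 'FALSE', 'TRUE.2', 'FALSE.1', 'FALSE.2', 'FALSE.3', 'TRUE.3']

df1 = pd.DataFrame(columns=names)
df2 = pd.DataFrame(columns=cols)

# Keep the df1 columns whose same-position column name in df2 contains 'TRUE'
df1.loc[:, df2.columns.str.contains('TRUE')]

Empty DataFrame
Columns: [John, Mark, Natasha, Ron]
Index: []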
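Since both frames come from CSV files, one alternative is to avoid the renamed headers altogether by reading df2 without a header row. The following is a minimal sketch, assuming the flags sit on the first line of the file; the inline CSV strings are hypothetical stand-ins for the real files.

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for the two files in the question.
csv1 = "John,Mark,Jane,Natasha,Oliver,Tony,Judd,Ron\n"
csv2 = "True,True,False,True,False,False,False,True\n"

df1 = pd.read_csv(io.StringIO(csv1))

# header=None keeps the flags as a data row, so no .1, .2 suffixes are added.
flags = pd.read_csv(io.StringIO(csv2), header=None)

# Compare as upper-cased strings so 'True', 'TRUE' and parsed booleans all work.
mask = flags.iloc[0].astype(str).str.strip().str.upper().eq("TRUE").to_numpy()

# Positional boolean mask: keep df1 columns where the flag is true.
result = df1.loc[:, mask]
print(list(result.columns))
```

The mask is positional, so this relies on both files listing their columns in the same order, as the question states.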
