
For each row in a dataframe, return a list of the columns that are NaN, restricted to a given list of columns

I am building out some error logging and want to catch null values in specific columns.
Essentially, given a dataframe and a list of columns, I want to output a dataframe with an extra column listing which of the columns from that list are null in each row. Note that I will also be doing this for negative values, etc.
Example:

columns_list = ['A','B','D']
Date A B C D
2022-01-01 1 22 1231 -121
2022-01-02 11 NaN NaN NaN
2022-01-03 NaN 52 12 0
2022-01-04 11 27 NaN 3434

The following code gives the output below, but I want to use columns_list so that column C is not returned in X:

df['X']= df.apply(lambda x: ','.join(x[x.isnull()].index), axis=1)
Date A B C D X
2022-01-02 11 NaN NaN NaN B,C,D
2022-01-03 NaN 52 12 0 A
2022-01-04 11 27 NaN 3434 C

Thanking you all in advance!

Just subset the dataframe to columns_list before calling apply:

df['X']= df[columns_list].apply(lambda x: ','.join(x[x.isnull()].index), axis=1)
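
For anyone who wants to try this end to end, here is a minimal, self-contained sketch that rebuilds the example data from the question and applies the same column subsetting. The helper name flag_columns, its condition parameter, and the Negative column are hypothetical names made up for illustration, not part of pandas. The helper is one possible way to reuse the idea for the negative-value check mentioned in the question.

```python
import numpy as np
import pandas as pd

# Example data from the question.
df = pd.DataFrame(
    {
        "Date": ["2022-01-01", "2022-01-02", "2022-01-03", "2022-01-04"],
        "A": [1, 11, np.nan, 11],
        "B": [22, np.nan, 52, 27],
        "C": [1231, np.nan, 12, np.nan],
        "D": [-121, np.nan, 0, 3434],
    }
)
columns_list = ["A", "B", "D"]


def flag_columns(frame, columns, condition):
    """Hypothetical helper: for each row, join the names of the columns
    in `columns` whose cell satisfies `condition`.

    `condition` takes a DataFrame and returns a boolean DataFrame of the
    same shape.
    """
    # Only the listed columns are checked, so column C is never reported.
    mask = condition(frame[columns])
    # row[row] keeps the True entries; their index holds the column names.
    return mask.apply(lambda row: ",".join(row[row].index), axis=1)


# Null check, equivalent to the one-liner above.
df["X"] = flag_columns(df, columns_list, pd.DataFrame.isnull)

# Same idea for negative values (NaN < 0 evaluates to False).
df["Negative"] = flag_columns(df, columns_list, lambda d: d < 0)

print(df)
```

Rows with nothing to report get an empty string in the new column, so 2022-01-04 ends up with an empty X even though C is NaN there.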


 