Counting a consecutive number of Null Values in a Pandas Dataframe

Let's say I have a DataFrame with some null values. For each row, how would I get a tally of the columns that contain nulls? For example, in row 2 of the DataFrame shown below, how would I get it to print/return columns 'A' and 'B', where the nulls are?

For greater context, I have a table of Billboard singles and the scores they received each week (76 weeks total, so 76 columns), all as DataFrame columns. Some weeks have null values because the particular song didn't perform well enough, and I want to find the columns where df.isnull() is True in the row for that particular song.

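import numpy as np
import pandas as pd

# 10x4 frame of random digits; cast to float so the columns can hold NaN
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns=list('ABCD')).astype(float)
# .ix was removed in pandas 1.0; label-based .loc does the same job here
df.loc[4, 'C'] = np.nan
df.loc[4, 'B'] = np.nan
df.loc[2, 'B'] = np.nan
df.loc[2, 'A'] = np.nan
df.loc[6, 'D'] = np.nan
df.loc[6, 'C'] = np.nan
df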

You can use the apply() method with axis=1 to loop through the rows. Within each row, the index holds the column names, so indexing row.index with the boolean Series returned by isnull() keeps only the columns whose value is null. This returns a list of column names for each row:

import pandas as pd
df.apply(lambda row: row.index[row.isnull()].tolist(), axis = 1)

#0        []
#1        []
#2    [A, B]
#3        []
#4    [B, C]
#5        []
#6    [C, D]
#7        []
#8        []
#9        []
#dtype: object
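
If a numeric tally per row is also wanted, one possible sketch is to build the boolean mask once with isnull() and reuse it. The names mask, null_counts and null_columns are made up for this illustration, and the frame uses fixed values (not the random ones from the question) with nulls placed in the same cells:

```python
import numpy as np
import pandas as pd

# Illustrative frame: fixed float values, nulls in the same cells as the question.
df = pd.DataFrame(np.arange(40, dtype=float).reshape(10, 4), columns=list("ABCD"))
df.loc[2, ["A", "B"]] = np.nan
df.loc[4, ["B", "C"]] = np.nan
df.loc[6, ["C", "D"]] = np.nan

# Boolean frame: True wherever a value is null.
mask = df.isnull()

# Number of null columns in each row.
null_counts = mask.sum(axis=1)

# Column names that are null in each row, taken from the mask row by row.
null_columns = pd.Series(
    [list(df.columns[row]) for row in mask.to_numpy()],
    index=df.index,
)

print(null_counts[null_counts > 0])
print(null_columns[null_counts > 0])
```

Filtering on null_counts > 0 shows only the rows that actually contain nulls, which may be easier to read with 76 weekly columns.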
