Counting a consecutive number of Null Values in a Pandas Dataframe

Let's say I have a DataFrame with some null values. For each row, how would I get a tally of the columns that contain nulls? For example, in row 2 of the DataFrame shown below, how would I get it to print/return columns 'A' and 'B', where the nulls are?

For greater context, I have a table of Billboard singles and the scores they received each week (76 weeks in total, so 76 columns), all as DataFrame columns. Some weeks have null values because the particular song didn't perform well enough, and I want to find the columns where df.isnull() is True in the row for that particular song.

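import numpy as np
import pandas as pd

# Cast to float up front so NaN can be assigned without an implicit dtype upcast
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns=list('ABCD')).astype(float)
# .ix was removed in pandas 1.0; label-based .loc does the same job here
df.loc[4, ['B', 'C']] = np.nan
df.loc[2, ['A', 'B']] = np.nan
df.loc[6, ['C', 'D']] = np.nan
df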

You can use the apply() method to loop through the rows, and use isnull() to build a boolean Series that subsets each row's index, which in this case holds the column names. This returns, for each row, a list of the column names where the value is null:

import pandas as pd
df.apply(lambda row: row.index[row.isnull()].tolist(), axis=1)

#0        []
#1        []
#2    [A, B]
#3        []
#4    [B, C]
#5        []
#6    [C, D]
#7        []
#8        []
#9        []
#dtype: object
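
The title asks about counting consecutive nulls, which the list of column names does not give directly. Here is a minimal sketch of one way to get both a per-row null count and the longest run of adjacent null columns, assuming "consecutive" means neighbouring columns (weeks) within a single row. The sample data and the helper name longest_null_run are made up for illustration; row 8 is an extra row with non-adjacent nulls, added to show that the two numbers can differ.

```python
import numpy as np
import pandas as pd

# Hypothetical deterministic data: the null pattern from the question,
# plus row 8 with two nulls that are NOT next to each other.
df = pd.DataFrame(np.arange(40, dtype=float).reshape(10, 4), columns=list("ABCD"))
df.loc[2, ["A", "B"]] = np.nan
df.loc[4, ["B", "C"]] = np.nan
df.loc[6, ["C", "D"]] = np.nan
df.loc[8, ["A", "C"]] = np.nan

mask = df.isnull()

# Total number of nulls in each row.
null_counts = mask.sum(axis=1)


def longest_null_run(row_mask):
    """Return the length of the longest run of consecutive True values."""
    longest = current = 0
    for is_null in row_mask:
        # Extend the current run on a null, reset it otherwise.
        current = current + 1 if is_null else 0
        longest = max(longest, current)
    return longest


# Longest streak of adjacent null columns in each row.
longest_runs = mask.apply(longest_null_run, axis=1)

print(pd.DataFrame({"null_count": null_counts, "longest_run": longest_runs}))
```

With 76 weekly columns the same code should apply unchanged, provided the columns are already in week order, since the run length depends on column order.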

 