Counting a consecutive number of Null Values in a Pandas Dataframe
Let's say I have a DataFrame with some null values. For each row, how would I get a tally of the columns that contain the nulls? For example, in row 2 of the DataFrame shown below, how would I get it to print/return columns 'A' and 'B', where the nulls are?
For greater context, I have a table of Billboard singles and the scores they received each week (76 weeks total, so 76 columns), all as DataFrame columns. Some weeks have null values because the particular song didn't perform well enough, and I want to find the columns where df.isnull() is True in the row for that particular song.
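import numpy as np
import pandas as pd

# Cast to float so the columns can hold NaN without a dtype change on assignment
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns=list('ABCD')).astype(float)
# .ix was removed in pandas 1.0; label-based .loc is the replacement
df.loc[4, 'C'] = np.nan
df.loc[4, 'B'] = np.nan
df.loc[2, 'B'] = np.nan
df.loc[2, 'A'] = np.nan
df.loc[6, 'D'] = np.nan
df.loc[6, 'C'] = np.nan
df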
You can use the apply() method to loop over the rows, and the isnull() method to build a boolean Series that subsets each row's index (the column names, in this case). This returns, for each row, a list of the column names where the value is null:
import pandas as pd
df.apply(lambda row: row.index[row.isnull()].tolist(), axis = 1)
#0 []
#1 []
#2 [A, B]
#3 []
#4 [B, C]
#5 []
#6 [C, D]
#7 []
#8 []
#9 []
#dtype: object
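As a self-contained check of the approach above, here is a minimal sketch on a small hand-written frame. The data is made up for illustration and is not from the question. It also shows one way to get a per-row count of nulls from the same isnull() mask, assuming a plain count is what "tally" means here.

```python
import numpy as np
import pandas as pd

# Hypothetical fixed data (not from the original post) so the result is reproducible.
df = pd.DataFrame(
    {
        "A": [1.0, np.nan, 3.0],
        "B": [np.nan, np.nan, 6.0],
        "C": [7.0, 8.0, 9.0],
    }
)

# Boolean mask: True wherever a value is null.
mask = df.isnull()

# Column names holding nulls, per row (same idea as the apply() call above).
null_columns = df.apply(lambda row: row.index[row.isnull()].tolist(), axis=1)

# Number of nulls per row, computed by summing the mask across columns.
null_counts = mask.sum(axis=1)

print(null_columns.tolist())  # [['B'], ['A', 'B'], []]
print(null_counts.tolist())   # [1, 2, 0]
```

With 76 week columns the same two expressions should apply unchanged, since neither depends on the number of columns.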