Counting the number of consecutive null values in a Pandas DataFrame

Question

Say I have a DataFrame with some null values. For each row, how can I get the columns that contain nulls? For example, for row 2 of the DataFrame shown below, how would I print or return columns 'A' and 'B', which are the ones holding nulls?

For more context, I have a table of Billboard singles with the score each one received per week (76 weeks total, so 76 columns), all as DataFrame columns. Some weeks are null because the song didn't perform well enough, and for a given song's row I want to find the columns where df.isnull() is True.

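import numpy as np
import pandas as pd

# 10x4 frame of random integers, cast to float so NaN can be assigned cleanly
df = pd.DataFrame(np.random.randint(0, 10, size=(10, 4)), columns=list('ABCD')).astype(float)
# .ix was removed in pandas 1.0; .loc performs the same label-based assignment
df.loc[4, ['B', 'C']] = np.nan
df.loc[2, ['A', 'B']] = np.nan
df.loc[6, ['C', 'D']] = np.nan
df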

Answer 1

You can use the apply() method with axis=1 to loop through the rows, and use isnull() on each row to build a boolean Series. That Series subsets the row's index, which in this case holds the column names. The result is, for each row, a list of the column names whose value is null:

import pandas as pd
# df is the DataFrame built in the question
df.apply(lambda row: row.index[row.isnull()].tolist(), axis=1)

#0        []
#1        []
#2    [A, B]
#3        []
#4    [B, C]
#5        []
#6    [C, D]
#7        []
#8        []
#9        []
#dtype: object
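
As a complement to the apply() approach, here is a minimal sketch that works from the boolean mask returned by df.isnull() directly. It is not part of the original answer. The frame uses fixed, made-up values so the result is reproducible, and the names mask, null_counts, row2_nulls and null_columns are chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Small deterministic frame (hypothetical values) with the same null pattern
# as the question: rows 2, 4 and 6 each have two missing entries.
df = pd.DataFrame(np.arange(40, dtype=float).reshape(10, 4), columns=list("ABCD"))
df.loc[2, ["A", "B"]] = np.nan
df.loc[4, ["B", "C"]] = np.nan
df.loc[6, ["C", "D"]] = np.nan

# Boolean frame: True wherever a value is missing.
mask = df.isnull()

# Number of nulls in each row.
null_counts = mask.sum(axis=1)

# Names of the null columns for a single row, here row 2.
row2_nulls = df.columns[mask.loc[2].to_numpy()].tolist()

# Per-row lists of null column names, built from the mask instead of apply().
null_columns = pd.Series(
    [df.columns[flags].tolist() for flags in mask.to_numpy()],
    index=df.index,
)
```

For a wide table such as the 76 weekly columns described in the question, the single-row lookup is probably the most direct fit: select the row for one song and read off the column names where the mask is True.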

Answer 1: 2, accepted, 2016-10-31 01:58:18
