
Count all NaNs in a pandas DataFrame

I'm trying to count the NaN elements (each of type numpy.float64) in a pandas Series (type pandas.core.series.Series) to find out how many there are.

This is my attempt at counting the null values in the Series:

import pandas as pd
oc=pd.read_csv(csv_file)
oc.count("NaN")

I expected oc.count("NaN") to return 7, but it raises the error 'Level NaN must be same as name (None)'.

The argument to count isn't the value you want counted; it is interpreted as an axis name or index level.

You're looking for df.isna().values.sum() (to count NaNs across the entire DataFrame), or len(df) - df['column'].count() (to count NaNs in a specific column).
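As a minimal sketch of the two expressions above, assuming a small made-up DataFrame (the column names "x" and "y" and the values are purely illustrative, not taken from the question's CSV):

```python
import numpy as np
import pandas as pd

# Hypothetical data: three rows with NaNs scattered over two columns
df = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": [np.nan, np.nan, 6.0]})

# isna() yields a boolean frame; summing the underlying array counts every True
total_nans = int(df.isna().values.sum())

# count() skips NaN, so rows minus non-missing values gives the NaNs in one column
nans_in_y = int(len(df) - df["y"].count())

print(total_nans)  # 3
print(nans_in_y)   # 2
```

The per-column form only looks at one column, so it avoids building a boolean mask for the whole frame.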

You can use either of the following if your Series.dtype is float64:

oc.isin([np.nan]).sum()
oc.isna().sum()

If your Series holds mixed data types, you can use the following, which also counts the literal string 'NaN':

oc.isin([np.nan, 'NaN']).sum()
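The difference between a float64 Series and a mixed one can be sketched as follows. The data is made up for illustration, and the mixed case here combines isna() with an explicit string comparison, which is a swapped-in alternative to the isin call above rather than the answer's own method:

```python
import numpy as np
import pandas as pd

# Hypothetical float64 Series: isna() finds every missing value
floats = pd.Series([1.0, np.nan, 2.0, np.nan])
float_nans = int(floats.isna().sum())

# Hypothetical object Series mixing a real NaN with the text "NaN"
mixed = pd.Series([1.0, "NaN", np.nan, "text"], dtype=object)

# isna() sees only the real missing value, not the string
real_nans = int(mixed.isna().sum())

# Also treat the literal string "NaN" as missing
real_or_text_nans = int((mixed.isna() | (mixed == "NaN")).sum())

print(float_nans)         # 2
print(real_nans)          # 1
print(real_or_text_nans)  # 2
```

Whether the string 'NaN' should count as missing depends on how the data was produced, so it is worth deciding that explicitly.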

oc.size: returns the total number of elements in the DataFrame, including NaN
oc.count().sum(): returns the total number of elements in the DataFrame, excluding NaN

Therefore, another way to count the NaNs in a DataFrame is to subtract one from the other:

NaN_count = oc.size - oc.count().sum()
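A small sketch of this subtraction, assuming a made-up frame in place of the question's CSV (the contents of oc here are hypothetical), with a cross-check against isna():

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame loaded from CSV
oc = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, "x", "y"]})

total_elements = oc.size               # rows * columns, NaN included
non_missing = int(oc.count().sum())    # per-column non-NaN counts, added up
nan_count = total_elements - non_missing

print(total_elements, non_missing, nan_count)  # 6 4 2

# The result should agree with the isna-based count
assert nan_count == int(oc.isna().sum().sum())
```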

If your DataFrame looks like this:

aa = pd.DataFrame(np.array([[1,2,np.nan],[3,np.nan,5],[8,7,6],
                 [np.nan,np.nan,0]]), columns=['a','b','c'])
    a    b    c
0  1.0  2.0  NaN
1  3.0  NaN  5.0
2  8.0  7.0  6.0
3  NaN  NaN  0.0

To count NaNs per column, you can try this:

aa.isnull().sum()
a    1
b    2
c    1

For the total count of NaNs:

aa.isnull().values.sum()
4
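The same frame can also be summarized per row by passing axis=1 to sum. This is a small extension of the snippets above rather than part of the original answer, sketched here with the aa frame rebuilt so the block runs on its own:

```python
import numpy as np
import pandas as pd

# Same example frame as above
aa = pd.DataFrame(
    np.array([[1, 2, np.nan], [3, np.nan, 5], [8, 7, 6], [np.nan, np.nan, 0]]),
    columns=["a", "b", "c"],
)

mask = aa.isnull()                    # True wherever a value is missing
per_column = mask.sum().to_dict()     # default axis=0: one count per column
per_row = mask.sum(axis=1).tolist()   # axis=1: one count per row
total = int(mask.values.sum())

print(per_column)  # {'a': 1, 'b': 2, 'c': 1}
print(per_row)     # [1, 1, 0, 2]
print(total)       # 4
```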

Just for fun, you can do either

df.isnull().sum().sum()

or

len(df)*len(df.columns) - len(df.stack())
