
Pandas: Counting empty values in columns

I have a pandas DataFrame containing some information about purchases. It includes columns like "purchaseID", "purchaseDate", and "purchaseAmount". I want to know the number of missing values in each column. Different columns hold different data types, such as strings, numbers, and booleans. I tried something like this:

import json
import pandas as pd

# the variable 'data' is my pandas data frame which was read from a json
with open('purchases.json') as f:
    data = pd.DataFrame(json.loads(line) for line in f)

print(data.isnull().sum())
print(data.isna().sum())

However, both isnull and isna report that there are no null values in any column, which is not the case.

When I tried something like this:

for col in data.columns: 
    print((data[col].values == '').sum())

it works for some columns but not for columns that contain numeric or boolean data. Is there a way for me to find the empty values in all the columns?

Thanks!

Example printout of a couple of rows of the data:

purchaseID purchaseDate purchaseAmount merchantName
1234       2019-01-01   500.0          Walmart
2345       2019-01-03
           2019-01-02   25.1           BP 

Try using pd.read_json. The problem could be that your DataFrame ends up with a single row holding the whole JSON file.

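import pandas as pd

# If the file holds one JSON object per line, pass lines=True as well.
data = pd.read_json(r'purchases.json')
print(data.isnull().sum())      # missing values per column
print(data.isna().sum().sum())  # total missing values across all columns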
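
If the counts are still zero after loading, one possible cause is that the missing fields are stored as empty strings rather than as null. That is an assumption, based on the `== ''` check in the question matching in some columns. Empty strings are ordinary values, so isna does not count them. Below is a minimal sketch of counting both kinds of "empty" per column. The sample records are hypothetical (modeled on the printout in the question), and `count_missing` is an illustrative helper, not a pandas function.

```python
import pandas as pd

# Hypothetical records modeled on the printout in the question;
# missing fields are assumed to be stored as empty strings.
records = [
    {"purchaseID": "1234", "purchaseDate": "2019-01-01",
     "purchaseAmount": 500.0, "merchantName": "Walmart"},
    {"purchaseID": "2345", "purchaseDate": "2019-01-03",
     "purchaseAmount": "", "merchantName": ""},
    {"purchaseID": "", "purchaseDate": "2019-01-02",
     "purchaseAmount": 25.1, "merchantName": "BP"},
]
data = pd.DataFrame(records)


def count_missing(df):
    """Count NaN/None plus empty or whitespace-only strings per column."""
    # True where a cell is a string that is empty after stripping whitespace.
    is_blank = df.apply(
        lambda col: col.map(lambda v: isinstance(v, str) and v.strip() == "")
    )
    # A cell counts as missing if it is NaN/None or a blank string.
    return (df.isna() | is_blank).sum()


naive_counts = data.isna().sum()      # empty strings are not counted here
missing_counts = count_missing(data)  # empty strings are counted here
print(missing_counts)
```

The isinstance check means numeric and boolean cells are never compared against a string, which sidesteps the trouble the `== ''` loop has on non-string columns.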
