I have a pandas DataFrame containing information about purchases. It includes columns like "purchaseID", "purchaseDate", and "purchaseAmount". I want to know the number of missing values in each column; the columns hold different data types (strings, numerics, booleans, etc.). I tried something like this:
import json
import pandas as pd

# 'data' is my pandas data frame, read from a JSON file with one record per line
with open('purchases.json') as f:
    data = pd.DataFrame(json.loads(line) for line in f)
print(data.isnull().sum())
print(data.isna().sum())
However, both isnull and isna are showing that there are no null values in any columns, which is not the case.
When I tried something like this:
for col in data.columns:
    print((data[col].values == '').sum())
it works for some columns but not for columns that contain numeric or boolean data. Is there a way for me to find the empty values in all the columns?
Thanks!
Example printout of a couple of rows of the data:
purchaseID  purchaseDate  purchaseAmount  merchantName
1234        2019-01-01    500.0           Walmart
2345        2019-01-03
            2019-01-02    25.1            BP
Try using pd.read_json. The problem could be that your data frame ended up as a single row holding the whole JSON file.
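# if the file holds one JSON object per line, pass lines=True as well
data = pd.read_json(r'purchases.json')
print(data.isnull().sum())        # missing values per column
print(data.isna().sum().sum())    # total missing values in the whole frame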
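One more thing worth checking: if the missing fields are stored in the JSON as empty strings rather than null (the printout suggests this, but I can't confirm it from the question), then isnull/isna will not count them, because '' is a valid string and not NaN. A minimal sketch of one way around that, assuming empty or whitespace-only strings are the only "missing" marker; the sample rows below are made up to mirror the printout:

```python
import json

import numpy as np
import pandas as pd

# Hypothetical sample standing in for purchases.json (one JSON object per line).
# Missing fields are assumed to be stored as empty strings.
lines = [
    '{"purchaseID": "1234", "purchaseDate": "2019-01-01", "purchaseAmount": 500.0, "merchantName": "Walmart"}',
    '{"purchaseID": "2345", "purchaseDate": "2019-01-03", "purchaseAmount": "", "merchantName": ""}',
    '{"purchaseID": "", "purchaseDate": "2019-01-02", "purchaseAmount": 25.1, "merchantName": "BP"}',
]
data = pd.DataFrame([json.loads(line) for line in lines])

# Turn empty / whitespace-only strings into NaN; non-string values are left alone.
cleaned = data.replace(r'^\s*$', np.nan, regex=True)

# Now isna() sees the gaps in every column, whatever its data type.
missing = cleaned.isna().sum()
print(missing)
```

This avoids comparing every column against '' by hand, so numeric and boolean columns are handled the same way as string columns.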