I have a dataset with more than 6,000 rows. I want to count the missing values and the non-numeric values (errors) at the same time, and then plot their occurrence with a histogram.
I use the code below to find the missing and erroneous values, but I can only filter one subset at a time, and I don't know how to sum the results. Columns a, b, and c have dtype object; Id and d are int and float.
How can this be done programmatically, with a histogram showing the occurrence?
df[pd.to_numeric(df['a'], errors='coerce').isnull()]
import numpy as np
import pandas as pd
df = pd.DataFrame({'Id': [1, 2, 3, 4, 5], 'a': [1, 2, 'good', 'bad', np.nan], 'b': [0.1, 'worse', np.nan, 'better', 0.5], 'c': ['2.5', 'best', '6.5', 'NaN', '10.5'], 'd': ['10', '20', '30', '40', '50']})
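One possible approach, sketched here under a few assumptions, is to build two boolean masks per column (truly missing, and present but not convertible to a number) and sum each mask. The column list `cols` and the names `missing`, `non_numeric`, and `summary` are illustrative and not from the original post. Note that the string `'NaN'` in column c is not a real missing value, so this sketch counts it as non-numeric.

```python
import numpy as np
import pandas as pd

# Sample data from the question, with the bare words quoted so it runs.
df = pd.DataFrame({
    'Id': [1, 2, 3, 4, 5],
    'a': [1, 2, 'good', 'bad', np.nan],
    'b': [0.1, 'worse', np.nan, 'better', 0.5],
    'c': ['2.5', 'best', '6.5', 'NaN', '10.5'],
    'd': ['10', '20', '30', '40', '50'],
})

cols = ['a', 'b', 'c', 'd']  # assumed: the columns to check

# True where the original value is already missing (NaN/None).
missing = df[cols].isna()

# Convert every column to numbers; anything unparseable becomes NaN.
numeric = df[cols].apply(pd.to_numeric, errors='coerce')

# Non-numeric: NaN after conversion, but not missing to begin with.
non_numeric = numeric.isna() & ~missing

# One row per column, with both counts and their sum.
summary = pd.DataFrame({
    'missing': missing.sum(),
    'non_numeric': non_numeric.sum(),
})
summary['total'] = summary['missing'] + summary['non_numeric']
print(summary)
```

Because these are per-column counts rather than a distribution of a continuous variable, a bar chart is probably a better fit than a true histogram. With matplotlib installed, `summary[['missing', 'non_numeric']].plot(kind='bar', stacked=True)` should draw one stacked bar per column.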