How to check the constraint of a non-null column in Python?
df1:
ColumnName Nullable
0 name True
1 Desgn True
2 Emp_number False
3 Salary True
df2:
      name    Desgn  Emp_number  Salary
0     krul               125796   45000
1   arnold   lawyer      789632   25000
2    daisy     engg      256498
3     alex               456985   65884
4    mandy     arch      456258   36958
5     krul  painter
6    perry               789632
7     timu   lawyer
8     timy   lawyer      789632   69822
9    daisy     engg
10   daisy     engg      256498   54869
How can I count the missing values in df2 for the nullable columns (Nullable == True)? If a non-nullable column has a missing value, an error should be raised; otherwise the missing values should be replaced with the median or mode.
for idx, row in df1.iterrows():
    if not row["Nullable"]:
        # Get all the rows in df2 where that column is null
        nulls = df2[df2[row["ColumnName"]].isnull()]
        # Number of rows where the column is null
        print(len(nulls))
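The loop above only prints counts for the non-nullable columns. A minimal sketch of a variant that covers both branches of the question (counts for nullable columns, an error for non-nullable ones) might look like the following. The `schema` and `data` frames are small hypothetical stand-ins for df1 and df2, and `count_missing` is an illustrative helper name, not part of pandas.

```python
import pandas as pd

# Hypothetical miniature versions of df1 (schema) and df2 (data)
schema = pd.DataFrame({
    "ColumnName": ["name", "Desgn", "Emp_number", "Salary"],
    "Nullable": [True, True, False, True],
})
data = pd.DataFrame({
    "name": ["krul", "arnold", "daisy"],
    "Desgn": [None, "lawyer", "engg"],
    "Emp_number": [125796, 789632, 256498],
    "Salary": [45000.0, 25000.0, None],
})


def count_missing(schema, data):
    """Return missing-value counts for the nullable columns.

    Raises ValueError if a non-nullable column contains a missing value.
    """
    counts = {}
    for _, row in schema.iterrows():
        column = row["ColumnName"]
        n_missing = int(data[column].isnull().sum())
        if row["Nullable"]:
            counts[column] = n_missing
        elif n_missing > 0:
            raise ValueError(
                f"non-nullable column {column!r} has {n_missing} missing value(s)"
            )
    return counts


missing_counts = count_missing(schema, data)
print(missing_counts)
```

Raising inside the loop stops at the first offending column; collecting all offenders before raising would be an equally reasonable choice.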
Without for loops:
import pandas as pd
from io import StringIO
df2 = pd.read_table(StringIO(""" name Desgn Emp_number Salary
0 krul nan 125796 45000
1 arnold lawyer 789632 25000
2 daisy engg 256498 nan
3 alex nan 456985 65884
4 mandy arch 456258 36958
5 krul painter nan nan
6 perry nan 789632 nan
7 timu lawyer nan nan
8 timy lawyer 789632 69822
9 daisy engg nan nan
10 daisy engg 256498 54869"""), sep=r'\s+')
df1 = pd.read_table(StringIO(""" ColumnName Nullable
0 name True
1 Desgn True
2 Emp_number False
3 Salary True"""), sep=r'\s+')
# Transpose switches dtype, so we need to know what they were originally
a = df2.T.loc[df1.loc[df1.Nullable==True, 'ColumnName']].T
a = a.astype(df2[a.columns].dtypes.to_dict())
# Replace with median (numeric columns only; text columns have no median)
df2[a.columns] = a.fillna(a.median(numeric_only=True))
# If any null in non nullable, raise ValueError
non_nullable_has_null = df2.T.loc[df1.loc[df1.Nullable==False, 'ColumnName']].T.isnull().any().any()
if non_nullable_has_null:
    raise ValueError('non nullable has a null')
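The question also mentions the mode, which is the natural fill value for text columns such as Desgn. As a rough sketch, assuming numeric columns should get the median and all other columns the most frequent value, the check and the fill could be written without transposing. The `schema` and `data` frames below are hypothetical stand-ins for df1 and df2, and validation runs before any filling so a failed check leaves the data untouched.

```python
import pandas as pd

# Hypothetical miniature versions of df1 (schema) and df2 (data)
schema = pd.DataFrame({
    "ColumnName": ["name", "Desgn", "Emp_number", "Salary"],
    "Nullable": [True, True, False, True],
})
data = pd.DataFrame({
    "name": ["krul", "arnold", "daisy", "timu"],
    "Desgn": [None, "lawyer", "engg", "lawyer"],
    "Emp_number": [125796, 789632, 256498, 789632],
    "Salary": [45000.0, 25000.0, None, 35000.0],
})

nullable = schema.loc[schema["Nullable"], "ColumnName"].tolist()
required = schema.loc[~schema["Nullable"], "ColumnName"].tolist()

# Validate first: a missing value in a non-nullable column is an error
offending = [col for col in required if data[col].isnull().any()]
if offending:
    raise ValueError(f"non-nullable columns with missing values: {offending}")

# Fill nullable columns: median for numeric data, mode otherwise
filled = data.copy()
for col in nullable:
    series = filled[col]
    if not series.isnull().any():
        continue
    if pd.api.types.is_numeric_dtype(series):
        fill_value = series.median()
    else:
        # mode() ignores missing values; take the first mode if there are ties.
        # Assumes the column has at least one non-missing value.
        fill_value = series.mode().iloc[0]
    filled[col] = series.fillna(fill_value)

print(filled)
```

Working on a copy keeps the original frame available for comparison; assigning back to `data` would be the in-place equivalent.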
You can create a new object and count the null values:
import numpy as np
new_df = df2.replace(to_replace=[None, ''], value=np.nan)
new_df.isnull().sum()
In [424]: df.isnull().sum()
Out[424]:
name 0
Desgn 3
Emp_number 3
Salary 5
dtype: int64
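The per-column counts can also be lined up against the Nullable flags, so that violations are visible in one table. This is only a sketch: `schema` and `data` are hypothetical stand-ins for df1 and df2, and the `Missing` column and `violations` name are introduced here for illustration.

```python
import pandas as pd

# Hypothetical miniature versions of df1 (schema) and df2 (data)
schema = pd.DataFrame({
    "ColumnName": ["name", "Desgn", "Emp_number", "Salary"],
    "Nullable": [True, True, False, True],
})
data = pd.DataFrame({
    "name": ["krul", "arnold", "daisy"],
    "Desgn": [None, "lawyer", "engg"],
    "Emp_number": [125796, None, 256498],
    "Salary": [45000.0, 25000.0, None],
})

# Count missing values per column, then attach the counts to the schema
null_counts = data.isnull().sum()
report = schema.assign(Missing=schema["ColumnName"].map(null_counts))

# Rows of the schema whose constraint is violated
violations = report[~report["Nullable"] & (report["Missing"] > 0)]

print(report)
print(violations)
```

An empty `violations` frame means every non-nullable column is complete, so `if not violations.empty: raise ValueError(...)` would be one way to turn the report into a hard check.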