Printing the portion of null values

Question

I am working with the Titanic dataset and want to show the proportion of null values in each column of the training set.

Here is my code:

train_count_of_missval_by_col = (train.isnull().sum())
print('----- all columns along with count of missing value')
print(train_count_of_missval_by_col)
print('----only columns which has missing values----')
print(train_count_of_missval_by_col[train_count_of_missval_by_col>0])
print('----only columns which has missing data to total observations----')
print(train_count_of_missval_by_col[train_count_of_missval_by_col>0]/train.shape[])

Unfortunately, the last line of the code generates an error. What should I add or edit on the last line so that the code works?

Answer 1

I am not sure there is a specific operation for this. info() shows the raw non-null count per column and tells you the total number of rows, but it has no parameter for percentages. Also, .info() returns None, so you can't access any data from its return value.
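As a quick check of that point, here is a minimal sketch on a small made-up frame (the df below is illustrative, not the Titanic data). It shows that info() only prints its summary and returns None, while isnull().sum() returns a Series you can keep working with:

```python
import numpy as np
import pandas as pd

# Illustrative frame, not the Titanic data.
df = pd.DataFrame({'Col1': [np.nan, 6, np.nan], 'Col2': [1, 2, np.nan]})

summary = df.info()               # prints the summary to stdout
print(summary is None)            # info() itself returns None

null_counts = df.isnull().sum()   # a pandas Series indexed by column name
print(null_counts['Col1'])        # null count for a single column
```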

I would suggest looping through the columns and computing the number of nulls divided by the total number of rows with df[col].isnull().sum() / df.shape[0] * 100, then printing the result in a formatted string like this:

import numpy as np
import pandas as pd

d = {'Col1': [np.nan, 6, np.nan, 2, np.nan],
     'Col2': [np.nan, 3, 5, np.nan, 9],
     'Col3': [2, 1, 8, np.nan, 9]}
df = pd.DataFrame(d)
# Percentage of nulls per column: null count / row count * 100
for col in df.columns:
    print(col, f'{df[col].isnull().sum() / df.shape[0] * 100} % NULL')

Col1 60.0 % NULL
Col2 40.0 % NULL
Col3 20.0 % NULL
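
The same ratios can also be computed without a loop. The sketch below assumes a small stand-in train frame (the column names mimic the Titanic data, but the values are made up). Dividing by train.shape[0], the row count, is one way to repair the last line of the question's code, and isnull().mean() should give the same result because the mean of a boolean column is the fraction of True values:

```python
import numpy as np
import pandas as pd

# Stand-in for the Titanic training set; values are made up for illustration.
train = pd.DataFrame({
    'Age':   [22.0, np.nan, 26.0, np.nan, 35.0],
    'Cabin': [np.nan, 'C85', np.nan, np.nan, np.nan],
    'Fare':  [7.25, 71.28, 7.92, 53.10, 8.05],
})

missing_counts = train.isnull().sum()
has_missing = missing_counts[missing_counts > 0]

# shape[0] is the number of rows, so this is the missing fraction per column.
missing_fraction = has_missing / train.shape[0]
print(missing_fraction)

# Equivalent shortcut: the mean of a boolean mask is the fraction of True values.
missing_pct = train.isnull().mean() * 100
print(missing_pct[missing_pct > 0])
```

Multiply missing_fraction by 100 if you prefer percentages, as in the loop above.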

Solution 1
0 · Accepted · 2020-12-06 07:56:41