How do I drop rows with null values and categorical variables from a dataframe?

Question

I am trying to drop the rows with null values and categorical variables from a dataframe that I imported from Excel. I've tried many functions and many different approaches, but I am not able to drop them, at least not all of them.

There are around 185000 rows and 6 columns. What I was trying to do is use a for loop to go through all the rows and drop a row if there is a null value or a categorical variable in the column "Order ID".

This is one of the snippets I've tried:

f = 0

value = merged_file.at[f, 'Order ID']
for value in merged_file:
    if value is None:
        merged_file.drop(merged_file.index[f])
        merged_file.reset_index(inplace=True, drop=True)
        f+=1
        continue
    elif value == 'Order ID':
        merged_file.drop(merged_file.index[f])
        merged_file.reset_index(drop=True, inplace=True)
        f+=1
        continue
    elif f==186845:
        break
    else:
        f+=1
        continue

I would be grateful if you could tell me what I am doing wrong, and please let me know if there is a better way to select and drop the rows or columns with null values and categorical variables.

Thank you.

Answer 1

So, it seems you're using pandas, even if the code does not look very Pythonic. Anyway, I would suggest not iterating through each row of the dataframe. In pandas, rows containing NaN can be dropped using dropna:

merged_file.dropna(subset=['Order ID'], inplace=True)
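
As a minimal sketch of that call, here is dropna on a small made-up frame standing in for the real data (the "Product" column and all values are assumptions, not taken from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for merged_file; the real data comes from Excel.
merged_file = pd.DataFrame({
    "Order ID": [176558, np.nan, 176559, None],
    "Product": ["USB-C Cable", "Monitor", None, "Headphones"],
})

# Drop only the rows whose 'Order ID' is missing.
# Nulls in other columns (here 'Product') are left alone.
cleaned = merged_file.dropna(subset=["Order ID"]).reset_index(drop=True)
print(cleaned)
```

Note that drop and dropna return a new dataframe unless inplace=True is passed, so a bare merged_file.drop(...) as in the question leaves merged_file unchanged. Also, "for value in merged_file" iterates over the column labels, not over the rows.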

To remove the rows containing categorical variables, you can instead use numpy's isreal. apply simply applies the function isreal to every value in the column, labelling as False all values that are not numerical.

import numpy as np
merged_file = merged_file[merged_file['Order ID'].apply(lambda x: np.isreal(x))]
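
One caveat, offered as an assumption about the data rather than something the question confirms: if the IDs were read in as text, np.isreal would probably reject them too, since it looks at the value's type and does not parse strings. In that case a possible alternative is pd.to_numeric with errors="coerce", which turns anything that is not a number into NaN. The sketch below uses a made-up frame with a repeated header row:

```python
import pandas as pd

# Hypothetical stand-in for merged_file: IDs stored as text, plus a
# repeated header row and a missing value.
merged_file = pd.DataFrame({
    "Order ID": ["176558", "Order ID", None, "176559"],
    "Product": ["USB-C Cable", "Product", "Monitor", "Headphones"],
})

# Values that cannot be parsed as numbers become NaN.
order_id = pd.to_numeric(merged_file["Order ID"], errors="coerce")

# Keep only the rows with a numeric 'Order ID'; this also removes the nulls.
cleaned = merged_file[order_id.notna()].reset_index(drop=True)
print(cleaned)
```

This handles the null rows and the non-numeric rows in one pass, so a separate dropna on "Order ID" is not needed with this variant.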
