How to modify a loop to get NaN values from the values in a column of a DataFrame in Pandas (Python)?

Question

I have a sample of my Python code, shown below:

... ...

for col in df.columns.tolist():
    if val in df[f"{col}"].values:
       if val.isna():
          my_list.append(col)

So, if a column in my DataFrame contains a NaN value, the name of that column should be added to "my_list".

I know that my DataFrame has columns with NaN values, but my code generates an empty "my_list". The error is probably in the line "if val.isna():". How can I modify that? How can I "tell" Python to pick up NaN values from the columns?

Answer 1

Just use an if col statement like this:

for col in df.columns.tolist():
    if val in df[f"{col}"].values:
       if col == False:
          my_list.append(col)

I am not giving you the best way of doing it, just fixing your little list loop.

Answer 2

By iterating over the values in each column, adding the column name to my_list on the first NaN, and then breaking out of the inner loop, you get this:

my_list = ['col1','col3']

My code:

import pandas as pd
from numpy import nan  # the NaN alias was removed in NumPy 2.0

df = pd.DataFrame(data={
    "col1":[10,2.5,nan],
    "col2":[10,2.5,3.5],
    "col3":[5,nan,1]})
my_list = []

for col in df.columns:
    for val in df[col].values:
        if pd.isna(val):
            my_list.append(col)
            break
print(f"{my_list=}")
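A side note on why the original "if val.isna():" check cannot work: a value pulled out of a numeric column is a plain float, which has no isna method, and NaN never compares equal to itself, so equality and membership tests on the array miss it. The following is only a minimal sketch to illustrate that, assuming a column like col1 above; the variable names are made up for the example.

```python
import numpy as np
import pandas as pd

# Illustrative column data, mirroring col1 from the sample frame
values = pd.Series([10, 2.5, np.nan]).values

val = values[2]                     # a NumPy float scalar holding NaN
has_method = hasattr(val, "isna")   # scalars do not have an .isna() method
equals_itself = bool(val == val)    # NaN is never equal to itself
found_by_in = np.nan in values      # array membership relies on ==, so it misses NaN
detected = bool(pd.isna(val))       # pd.isna accepts scalars as well as arrays

print(has_method, equals_itself, found_by_in, detected)
```

This is why the loop above calls pd.isna(val) on each value instead of calling a method on the value itself.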

Answer 3

You can fix your code with the changes that @Orange mentioned. I'm just adding this as an alternative. When working with data, you want to let the database or data analysis software do the heavy lifting. Looping over a cursor is something you should avoid as best you can.

The code you have can be changed to:

for col in df.columns:
    if df[col].hasnans:
        my_list.append(col)

The code below does functionally the same thing:

df.columns[[df[col].hasnans for col in df.columns]].to_list()

The code below computes the equivalent of hasnans using isna and sum.

df.columns[df.isna().sum() > 0].to_list()
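To sanity-check that these variants agree, here is a small sketch run against the sample frame from Answer 2. The isna().any() form is not part of the original answer; it is added here on the assumption that it is an acceptable equivalent of isna().sum() > 0.

```python
import numpy as np
import pandas as pd

# Sample frame taken from Answer 2
df = pd.DataFrame({
    "col1": [10, 2.5, np.nan],
    "col2": [10, 2.5, 3.5],
    "col3": [5, np.nan, 1],
})

# Per-column hasnans flags used as a boolean mask on the column index
via_hasnans = df.columns[[df[col].hasnans for col in df.columns]].to_list()

# Count NaN values per column and keep the columns with at least one
via_sum = df.columns[df.isna().sum() > 0].to_list()

# Assumed alternative: any() is True for a column as soon as one NaN is present
via_any = df.columns[df.isna().any()].to_list()

print(via_hasnans, via_sum, via_any)
```

All three should list col1 and col3 for this frame, matching the result of the explicit loop.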

Answer 1
0 2022-08-02 13:18:37

Answer 2
0 2022-08-02 13:22:10

Answer 3
0 2022-08-02 13:37:45