
Remove NaN from lists in Python

I have converted DataFrame rows to lists, and those lists contain NaN values that I would like to remove. This is my attempt, but the NaN values are not removed:

import pandas as pd

df = pd.read_excel('MainFile.xlsx', dtype = str)
df_list = df.values.tolist()
print(df_list)
print('=' * 50)

for l in df_list:
    newlist = [x for x in l if x != 'nan']
    print(newlist)

Here's a snapshot of the original data: [image]

I found a solution using these lines (but I welcome any other ideas):

for l in df_list:
    newlist = [x for x in l if x == x]
    print(newlist)

It is not working because you are comparing the values to the string 'nan'.

If an Excel cell is empty, pandas returns it as a float NaN value, not as a string.
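To illustrate why the string comparison fails while the `x == x` trick from the question works, here is a minimal sketch. The `row` list is a hypothetical stand-in for one entry of `df_list`, with strings for filled cells and a float NaN for an empty one.

```python
import math

# Hypothetical row, standing in for one entry of df_list:
# strings for filled cells, float NaN for an empty cell.
row = ["apple", float("nan"), "cherry"]

# Comparing against the string 'nan' never matches the float NaN,
# so nothing is removed.
kept_by_string_test = [x for x in row if x != "nan"]

# NaN is the only value that is not equal to itself, so x == x drops it.
kept_by_self_test = [x for x in row if x == x]

# Equivalent, more explicit test; the isinstance guard is needed because
# math.isnan raises TypeError for strings.
kept_by_isnan = [x for x in row if not (isinstance(x, float) and math.isnan(x))]

print(len(kept_by_string_test))  # 3: the NaN is still there
print(kept_by_self_test)         # ['apple', 'cherry']
print(kept_by_isnan)             # ['apple', 'cherry']
```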

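Comparing with np.nan using != does not help either: NaN is not equal to any value, including itself, so that test is always true. Use pd.notna to test for missing values instead:

import pandas as pd

for l in df_list:
    # pd.notna is False for NaN/None and True for ordinary values such as strings
    newlist = [x for x in l if pd.notna(x)]
    print(newlist)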

EDIT:

If you want to get all the values from the DataFrame that are not NaN, you can just do:

df.stack().tolist()
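A small self-contained sketch of the flattening approach, using a hypothetical DataFrame in place of MainFile.xlsx. Whether `stack()` drops missing values on its own may depend on the pandas version, so this sketch adds an explicit `dropna()` to be safe; that extra call is an assumption of the example, not part of the one-liner above.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for MainFile.xlsx; np.nan marks empty cells.
df = pd.DataFrame(
    {
        "A": ["a1", "a2", "a3"],
        "B": ["b1", np.nan, "b3"],
        "C": [np.nan, "c2", "c3"],
    }
)

# stack() turns the frame into one long Series ordered row by row;
# dropna() then guarantees that no missing values remain.
flat = df.stack().dropna().tolist()
print(flat)  # ['a1', 'b1', 'a2', 'c2', 'a3', 'b3', 'c3']
```

Note that the result is a single flat list, so the information about which row each value came from is lost.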

If you want to print the values with a loop (as in your example, but column by column), you can do:

for l in df.columns:
    print(list(df[l][df[l].notna()]))

To create a nested list with a loop (one inner list per row, hence the transpose):

main = []

for l in df.T.columns:
    new_list = list(df.T[l][df.T[l].notna()])
    main.append(new_list)
    
print(main)
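The same nested list can probably be built more compactly without transposing. The following is a sketch under the assumption that one inner list per row is wanted; the DataFrame is hypothetical sample data, not the asker's file.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; np.nan marks empty cells.
df = pd.DataFrame(
    {
        "name": ["x", "y", "z"],
        "first": ["1", np.nan, "3"],
        "second": [np.nan, "5", "6"],
    }
)

# iterrows() yields (index, row) pairs where each row is a Series;
# dropna() removes that row's missing cells before converting to a list.
nested = [row.dropna().tolist() for _, row in df.iterrows()]
print(nested)  # [['x', '1'], ['y', '5'], ['z', '3', '6']]
```

This avoids computing `df.T` repeatedly inside the loop, which can matter for larger frames.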

You can also try an approach based on np.isnan:

import numpy as np

for l in df_list:
    # np.isnan only accepts numbers, so check the type first (the other cells are strings)
    newlist = [x for x in l if not (isinstance(x, float) and np.isnan(x))]
    print(newlist)

I hope this helps.
