Remove NaN from lists in Python
I have converted data frame rows to lists, and those lists contain NaN values that I would like to remove. This is my attempt, but the NaN values are not removed:
import pandas as pd
df = pd.read_excel('MainFile.xlsx', dtype=str)
df_list = df.values.tolist()
print(df_list)
print('=' * 50)
for l in df_list:
    newlist = [x for x in l if x != 'nan']
    print(newlist)
Here's a snapshot of the original data
I could find a solution using these lines (but I welcome any ideas):
for l in df_list:
    newlist = [x for x in l if x == x]
    print(newlist)
It is not working because you are comparing each value to the string 'nan'.
If an Excel cell is empty, pandas returns it as a NaN value, which is a float rather than a string, so the comparison with 'nan' never matches.
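To make the difference concrete, here is a minimal sketch using only plain Python. The list `row` is a made-up stand-in for one row of `df_list`, and it assumes empty cells arrive as float NaN values:

```python
# Hypothetical row: two strings and one missing cell represented as float NaN.
nan = float("nan")
row = ["a", nan, "b"]

# A float is never equal to the string 'nan', so this filter keeps everything.
kept_by_string_test = [x for x in row if x != "nan"]

# NaN is the only value that is not equal to itself, so this filter drops it.
kept_by_self_test = [x for x in row if x == x]

print(len(kept_by_string_test))  # all three items survive
print(kept_by_self_test)
```

This is also why the `x == x` workaround from the question does what you want.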
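Comparing against np.nan directly does not help either: NaN never compares equal to anything, including itself, so x != np.nan is True for every value. Use pd.isna to test for NaN instead:
import pandas as pd
for l in df_list:
    newlist = [x for x in l if not pd.isna(x)]
    print(newlist)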
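If you would rather keep the check free of extra libraries, a standard-library version is possible. This is only a sketch: `drop_nan` is a hypothetical helper name, and `rows` is invented sample data standing in for `df_list`.

```python
import math

def drop_nan(row):
    """Return a copy of row without float NaN entries; strings are kept as-is."""
    # math.isnan raises TypeError on strings, so check the type first.
    return [x for x in row if not (isinstance(x, float) and math.isnan(x))]

# Invented sample data: note that the literal string "nan" is not removed.
rows = [["a", float("nan"), "b"], [float("nan"), "c", "nan"]]
cleaned = [drop_nan(r) for r in rows]
print(cleaned)
```

The type check matters because the rows mix strings and floats; calling `math.isnan` on a string would fail.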
EDIT:
If you want to get all values from the dataframe that are not NaN, you can just do:
df.stack().tolist()
If you want to print values with the loop (as in your example), you can do:
for l in df.columns:
    print(list(df[l][df[l].notna()]))
To create a nested list with a loop:
main = []
for l in df.T.columns:
    new_list = list(df.T[l][df.T[l].notna()])
    main.append(new_list)
print(main)
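The same nested list can probably be built more compactly by dropping the missing values from each row directly. The sketch below uses a small made-up DataFrame in place of the Excel file, so the column names and values are assumptions:

```python
import pandas as pd

# Made-up stand-in for the data read from MainFile.xlsx.
df = pd.DataFrame({
    "A": ["a", float("nan"), "c"],
    "B": [float("nan"), "y", "z"],
})

# iterrows() yields each row as a Series; dropna() removes its NaN entries.
rows = [row.dropna().tolist() for _, row in df.iterrows()]
print(rows)
```

This avoids transposing the frame on every iteration, which the loop above does twice per row.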