How should I remove nan values from a dataframe in python?
I've got an Excel file and I created lists from its columns. The problem is that the columns do not all have the same number of rows, so I have multiple 'nan' values at the ends of the lists. I tried to delete them with the dropna() method, but the 'nan' values are still there. Here is my code:
import pandas as pd
excel_name = r'file_name.xlsx'
df = pd.read_excel(excel_name, engine='openpyxl')
df.dropna()
clomun_1 = list(df['clomun1'])
clomun_2 = list(df['clomun2'])
clomun_3 = list(df['clomun3'])
print(clomun_1)
print(clomun_2)
print(clomun_3)
output:
clomun_1 = ['value1', 'value2', 'value3', 'value4', 'nan', 'nan', 'nan', 'nan']
clomun_2 = ['value1', 'value2', 'value3', 'value4', 'value5', 'value6', 'nan', 'nan']
clomun_3 = ['value1', 'value2', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan']
I want to keep only the values, so I must delete the 'nan' elements.
Try this:
df = pd.read_excel(excel_name, engine='openpyxl', na_values=['nan']) #add na_values
clomun_1 = df['clomun1'].dropna().tolist()
print(clomun_1)
['value1', 'value2', 'value3', 'value4']
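One likely reason the original df.dropna() call appeared to do nothing is that dropna() returns a new DataFrame rather than modifying df in place, and its result was discarded. Called on the whole frame, it would also drop every row that has a missing value in any column, which is not what is wanted when the columns have different lengths. A minimal sketch of the difference, using a small in-memory DataFrame as a stand-in for the Excel file (the sample values are assumptions, not the asker's data):

```python
import numpy as np
import pandas as pd

# Stand-in for the Excel data: columns of unequal length padded with NaN.
df = pd.DataFrame({
    'clomun1': ['value1', 'value2', 'value3', 'value4', np.nan, np.nan],
    'clomun2': ['value1', 'value2', 'value3', 'value4', 'value5', 'value6'],
    'clomun3': ['value1', 'value2', np.nan, np.nan, np.nan, np.nan],
})

# dropna() returns a new object; df itself is unchanged unless reassigned.
df.dropna()
unchanged_len = len(df)

# Row-wise dropna keeps only the rows that are complete in every column.
complete_rows = df.dropna()

# Dropping per column keeps each column's own values.
columns = {name: df[name].dropna().tolist() for name in df.columns}
print(columns['clomun1'])
print(columns['clomun3'])
```

Dropping per column, as in the answer above, is what preserves lists of different lengths.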
You can use a list comprehension to achieve this:
clomun_1_new= [x for x in clomun_1 if x!='nan']
Repeat the same for the other two lists.
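Note that the comparison x != 'nan' only removes the literal string 'nan'. If the list holds real missing values (float NaN, which pandas normally produces for empty cells), that comparison keeps them, because NaN is not equal to any string. A sketch that handles both cases with pd.isna; the helper name drop_nan and the sample list are hypothetical:

```python
import pandas as pd

def drop_nan(values):
    """Return values without missing entries or the literal string 'nan'."""
    return [x for x in values if not pd.isna(x) and x != 'nan']

# Sample list mixing a float NaN, the string 'nan', and None.
clomun_1 = ['value1', 'value2', float('nan'), 'nan', None]
clomun_1_new = drop_nan(clomun_1)
print(clomun_1_new)
```

This assumes each element is a scalar; pd.isna returns an array for list-like inputs, which would not work in the condition.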