繁体   English   中英

TypeError:输入类型不支持 ufunc 'isnan',无法安全地强制输入

[英]TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced

我正在尝试将 csv 转换为 numpy 数组。 在 numpy 数组中,我用 NaN 替换了几个元素。 然后,我想在 numpy 数组中找到 NaN 元素的索引。 代码是:

    import pandas as pd
import matplotlib.pyplot as plyt
import numpy as np

filename = 'wether.csv'

df = pd.read_csv(filename,header = None )

list = df.values.tolist()
labels = list[0]
wether_list = list[1:]

year = []
month = []
day = []
max_temp = []

for i in wether_list:
    year.append(i[1])
    month.append(i[2])
    day.append(i[3])
    max_temp.append(i[5])

mid = len(max_temp) // 2
temps = np.array(max_temp[mid:])
temps[np.where(np.array(temps) == -99.9)] = np.nan
plyt.plot(temps,marker = '.',color = 'black',linestyle = 'none')
# plyt.show()

print(np.where(np.isnan(temps))[0])
# print(len(pd.isnull(np.array(temps))))

当我执行此操作时,我收到警告和错误。 警告是:

    wether.py:26: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  temps[np.where(np.array(temps) == -99.9)] = np.nan

错误是:

Traceback (most recent call last):
  File "wether.py", line 30, in <module>
    print(np.where(np.isnan(temps))[0])
TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

这是我正在使用的数据集的一部分:

83168,2014,9,7,0.00000,89.00000,78.00000, 83.50000
83168,2014,9,22,1.62000,90.00000,72.00000, 81.00000
83168,2014,9,23,0.50000,87.00000,74.00000, 80.50000
83168,2014,9,24,0.35000,82.00000,73.00000, 77.50000
83168,2014,9,25,0.60000,85.00000,75.00000, 80.00000
83168,2014,9,26,0.76000,89.00000,77.00000, 83.00000
83168,2014,9,27,0.00000,89.00000,79.00000, 84.00000
83168,2014,9,28,0.00000,90.00000,81.00000, 85.50000
83168,2014,9,29,0.00000,90.00000,79.00000, 84.50000
83168,2014,9,30,0.50000,89.00000,75.00000, 82.00000
83168,2014,10,1,0.02000,91.00000,75.00000, 83.00000
83168,2014,10,2,0.03000,93.00000,77.00000, 85.00000
83168,2014,10,3,1.40000,93.00000,75.00000, 84.00000
83168,2014,10,4,0.06000,89.00000,75.00000, 82.00000
83168,2014,10,5,0.22000,91.00000,68.00000, 79.50000
83168,2014,10,6,0.00000,84.00000,68.00000, 76.00000
83168,2014,10,7,0.17000,85.00000,73.00000, 79.00000
83168,2014,10,8,0.06000,84.00000,73.00000, 78.50000
83168,2014,10,9,0.00000,87.00000,73.00000, 80.00000
83168,2014,10,10,0.00000,88.00000,80.00000, 84.00000
83168,2014,10,11,0.00000,87.00000,80.00000, 83.50000
83168,2014,10,12,0.00000,88.00000,80.00000, 84.00000
83168,2014,10,13,0.00000,88.00000,81.00000, 84.50000
83168,2014,10,14,0.04000,88.00000,77.00000, 82.50000
83168,2014,10,15,0.00000,88.00000,77.00000, 82.50000
83168,2014,10,16,0.09000,89.00000,72.00000, 80.50000
83168,2014,10,17,0.00000,85.00000,67.00000, 76.00000
83168,2014,10,18,0.00000,84.00000,65.00000, 74.50000
83168,2014,10,19,0.00000,84.00000,65.00000, 74.50000
83168,2014,10,20,0.00000,85.00000,69.00000, 77.00000
83168,2014,10,21,0.77000,87.00000,76.00000, 81.50000
83168,2014,10,22,0.69000,81.00000,71.00000, 76.00000
83168,2014,10,23,0.31000,82.00000,72.00000, 77.00000
83168,2014,10,24,0.71000,79.00000,73.00000, 76.00000
83168,2014,10,25,0.00000,81.00000,68.00000, 74.50000
83168,2014,10,26,0.00000,82.00000,67.00000, 74.50000
83168,2014,10,27,0.00000,83.00000,64.00000, 73.50000
83168,2014,10,28,0.00000,83.00000,66.00000, 74.50000
83168,2014,10,29,0.03000,86.00000,76.00000, 81.00000
83168,2014,10,30,0.00000,85.00000,69.00000, 77.00000
83168,2014,10,31,0.00000,85.00000,69.00000, 77.00000
83168,2014,11,1,0.00000,86.00000,59.00000, 72.50000
83168,2014,11,2,0.00000,77.00000,52.00000, 64.50000
83168,2014,11,3,0.00000,70.00000,52.00000, 61.00000
83168,2014,11,4,0.00000,77.00000,59.00000, 68.00000
83168,2014,11,5,0.02000,79.00000,73.00000, 76.00000
83168,2014,11,6,0.02000,82.00000,75.00000, 78.50000
83168,2014,11,7,0.00000,83.00000,66.00000, 74.50000
83168,2014,11,8,0.00000,84.00000,65.00000, 74.50000
83168,2014,11,9,0.00000,84.00000,65.00000, 74.50000
83168,2014,11,10,1.20000,72.00000,65.00000, 68.50000
83168,2014,11,11,0.08000,77.00000,61.00000, 69.00000
83168,2014,11,12,0.00000,80.00000,61.00000, 70.50000
83168,2014,11,13,0.00000,83.00000,63.00000, 73.00000
83168,2014,11,14,0.00000,83.00000,65.00000, 74.00000
83168,2014,11,15,0.00000,82.00000,64.00000, 73.00000
83168,2014,11,16,0.00000,83.00000,64.00000, 73.50000
83168,2014,11,17,0.07000,84.00000,64.00000, 74.00000
83168,2014,11,18,0.00000,86.00000,71.00000, 78.50000
83168,2014,11,19,0.57000,78.00000,55.00000, 66.50000
83168,2014,11,20,0.05000,72.00000,56.00000, 64.00000
83168,2014,11,21,0.05000,77.00000,63.00000, 70.00000
83168,2014,11,22,0.22000,77.00000,69.00000, 73.00000
83168,2014,11,23,0.06000,79.00000,76.00000, 77.50000
83168,2014,11,24,0.02000,84.00000,78.00000, 81.00000
83168,2014,11,25,0.00000,86.00000,78.00000, 82.00000
83168,2014,11,26,0.07000,85.00000,77.00000, 81.00000
83168,2014,11,27,0.21000,82.00000,55.00000, 68.50000
83168,2014,11,28,0.00000,73.00000,53.00000, 63.00000
83168,2015,1,8,0.00000,80.00000,57.00000,
83168,2015,1,9,0.05000,72.00000,56.00000,
83168,2015,1,10,0.00000,72.00000,57.00000,
83168,2015,1,11,0.00000,80.00000,57.00000,
83168,2015,1,12,0.05000,80.00000,59.00000,
83168,2015,1,13,0.85000,81.00000,69.00000,
83168,2015,1,14,0.05000,81.00000,68.00000,
83168,2015,1,15,0.00000,81.00000,64.00000,
83168,2015,1,16,0.00000,78.00000,63.00000,
83168,2015,1,17,0.00000,73.00000,55.00000,
83168,2015,1,18,0.00000,76.00000,55.00000,
83168,2015,1,19,0.00000,78.00000,55.00000,
83168,2015,1,20,0.00000,75.00000,56.00000,
83168,2015,1,21,0.02000,73.00000,65.00000,
83168,2015,1,22,0.00000,80.00000,64.00000,
83168,2015,1,23,0.00000,80.00000,71.00000,
83168,2015,1,24,0.00000,79.00000,72.00000,
83168,2015,1,25,0.00000,79.00000,49.00000,
83168,2015,1,26,0.00000,79.00000,49.00000,
83168,2015,1,27,0.10000,75.00000,53.00000,
83168,2015,1,28,0.00000,68.00000,53.00000,
83168,2015,1,29,0.00000,69.00000,53.00000,
83168,2015,1,30,0.00000,72.00000,60.00000,
83168,2015,1,31,0.00000,76.00000,58.00000,
83168,2015,2,1,0.00000,76.00000,58.00000,
83168,2015,2,2,0.05000,77.00000,58.00000,
83168,2015,2,3,0.00000,84.00000,56.00000,
83168,2015,2,4,0.00000,76.00000,56.00000,

我无法纠正错误。 如何克服第26行的警告? 如何解决这个错误?

更新:当我以不同的方式尝试相同的事情时,比如从文件中读取数据集而不是转换为数据帧,我没有收到错误。 那是什么原因呢? 代码是:

    weather_filename = 'wether.csv'
weather_file = open(weather_filename)
weather_data = weather_file.read()
weather_file.close()

# Break the weather records into lines
lines = weather_data.split('\n')
labels = lines[0]
values = lines[1:]
n_values = len(values)

# Break the list of comma-separated value strings
# into lists of values.
year = []
month = []
day = []
max_temp = []
j_year = 1
j_month = 2
j_day = 3
j_max_temp = 5

for i_row in range(n_values):
    split_values = values[i_row].split(',')
    if len(split_values) >= j_max_temp:
        year.append(int(split_values[j_year]))
        month.append(int(split_values[j_month]))
        day.append(int(split_values[j_day]))
        max_temp.append(float(split_values[j_max_temp]))

# Isolate the recent data.
i_mid = len(max_temp) // 2
temps = np.array(max_temp[i_mid:])
year = year[i_mid:]
month = month[i_mid:]
day = day[i_mid:]
temps[np.where(temps == -99.9)] = np.nan

# Remove all the nans.
# Trim both ends and fill nans in the middle.
# Find the first non-nan.
i_start = np.where(np.logical_not(np.isnan(temps)))[0][0]
temps = temps[i_start:]
year = year[i_start:]
month = month[i_start:]
day = day[i_start:]
i_nans = np.where(np.isnan(temps))[0]
print(i_nans)

第一个代码有什么问题,为什么第二个代码甚至不发出警告?

发布,因为它可能会帮助未来的用户。

正如其他人正确指出的那样, np.isnan 不适用于objectstring数据类型。 如果您使用的是 pandas,如此处所述,您可以直接使用pd.isnull ,这应该适用于您的情况。

import pandas as pd
import numpy as np
var1 = ''
var2 = np.nan
>>> type(var1)
<class 'str'>
>>> type(var2)
<class 'float'>
>>> pd.isnull(var1)
False
>>> pd.isnull(var2)
True

尝试用np.isnan替换pd.isna Pandas 的isna支持类别数据类型

tempsdtype是什么。 我可以使用字符串 dtype 重现您的警告和错误:

In [26]: temps = np.array([1,2,'string',0])
In [27]: temps
Out[27]: array(['1', '2', 'string', '0'], dtype='<U21')
In [28]: temps==-99.9
/usr/local/bin/ipython3:1: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
  #!/usr/bin/python3
Out[28]: False
In [29]: np.isnan(temps)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-29-2ff7754ed926> in <module>()
----> 1 np.isnan(temps)

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

首先,将字符串与数字进行比较会给出这个未来警告。

其次,测试nan会产生错误。

请注意,给定dtypenan赋值分配一个字符串值,而不是一个浮点数( np.nan是一个浮点数)。

In [30]: temps[-1] = np.nan
In [31]: temps
Out[31]: array(['1', '2', 'string', 'nan'], dtype='<U21')

这可能是不需要的浮点数到字符串转换的结果。 要修复它,只需使用floatnp.float64添加字符串到浮点数的转换(假设数据可转换为数字)来反转它:

np.isnan(float(str(np.nan)))
True

要么

np.isnan(float(str("nan")))
True

而不是:

np.isnan(str(np.nan))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In [164], line 1
----> 1 np.isnan(str(np.nan))

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

请注意,如果您的数据不能转换为数字(浮点数),则需要使用字符串兼容函数,例如pd.isna而不是np.isnan

isnan(ndarray)在“对象”的isnan(ndarray)失败

isnan(ndarray.astype(np.float)) ,但字符串不能被强制为浮动。

我在尝试使用sklearn.preprocessing.OneHotEncoder transform我的数据集时遇到了这个错误。 该错误是由 sklearn.utils._encode 中定义的_check_unknown函数sklearn.utils._encode的。

这是因为在transform时,要转换的列之一的类型是float64而不是object - 在我的例子中,整个列都是NaN

解决方案是在调用transform之前将数据帧转换为object类型:

ohe.transform(data.astype("O"))

注意:此答案与问题的标题有些相关,因为在使用 Decimal 类型时会提示此错误。

在考虑Decimal类型值时,我遇到了同样的错误。 出于某种原因,我正在考虑的dataframe一列是十进制的。 例如,在此列上调用.unique() ,我得到了

[Decimal('0'), Decimal('95'), Decimal('38'), Decimal('25'),
 Decimal('42'), Decimal('11'), Decimal('18'), Decimal('22'), 
.....Decimal('220'), Decimal('724')]

由于错误的回溯显示我在调用某些numpy函数时失败。 我设法通过考虑上述数组的minmax来重现错误

from decimal import Decimal

xmin, xmax = Decimal('0'), Decimal('724')
np.isnan([xmin, xmax])

它会提示错误

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

这种情况下的解决方案是将所有这些值转换为int

df.astype({col:int for col in desired_columns_to_convert})

暂无
暂无

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM