Python: NumPy NaN compares as True in arrays with strings
I am trying to compare two NumPy arrays that contain numbers, strings and NaNs. I want to know how many items in the arrays are equal.
When comparing these two arrays:
import numpy as np
c = np.array([1, np.nan])
d = np.array([2, np.nan])
print(c == d)
[False False]
This is the expected behaviour.
However, when comparing:
a = np.array([1, 'x', np.nan])
b = np.array([1, 'x', np.nan])
print(a == b)
[ True True True]
That makes no sense to me. How can adding a string to the array change the way NaNs are compared? Any ideas?
Thanks!
If you examine the arrays, you'll see that np.nan has been converted to a string ('n'):
In [48]: a = np.array([1, 'x', np.nan])
In [49]: a
Out[49]:
array(['1', 'x', 'n'],
dtype='|S1')
And 'n' == 'n' is True.
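The coercion can be confirmed by inspecting the dtype and the stored elements. This is a minimal sketch; the exact string dtype, and whether NaN ends up stored as 'n' or 'nan', is assumed to vary with the NumPy version, so the code only checks that the elements are strings rather than floats.

```python
import numpy as np

a = np.array([1, 'x', np.nan])
b = np.array([1, 'x', np.nan])

# Mixing numbers with a string makes NumPy pick a string dtype,
# so every element, including NaN, is stored as text.
dtype_kind = a.dtype.kind      # 'S' (bytes) or 'U' (unicode), depending on version
print(dtype_kind)
print(type(a[2]))              # a NumPy string scalar, not a float

# Both NaN slots now hold identical strings, so they compare equal.
nan_slot_equal = bool(a[2] == b[2])
print(nan_slot_equal)
```

Since the NaN is no longer a float by the time the arrays are compared, the usual NaN rules never come into play for these two arrays.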
What I don't understand is why changing the array's dtype to object doesn't change the result of the comparison:
In [72]: a = np.array([1, 'x', np.nan], dtype=object)
In [73]: b = np.array([1, 'x', np.nan], dtype=object)
In [74]: a == b
Out[74]: array([ True, True, True], dtype=bool)
In [75]: a[2] == b[2]
Out[75]: False
In [76]: type(a[2])
Out[76]: float
In [77]: type(b[2])
Out[77]: float
It's almost as if the two NaN objects are compared by reference rather than by value:
In [79]: id(a[2])
Out[79]: 26438340
In [80]: id(b[2])
Out[80]: 26438340
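The identity observation can be reproduced without relying on NumPy's array comparison. np.nan is a single float object, so both object arrays hold a reference to the same object, and Python containers such as lists check identity before equality. Whether a == b on object arrays takes the same shortcut is assumed to depend on the NumPy version, so the sketch below does not print that result. The helper count_equal is a hypothetical name introduced here for illustration; it counts matches by value, which is what the question asks for.

```python
import numpy as np

a = np.array([1, 'x', np.nan], dtype=object)
b = np.array([1, 'x', np.nan], dtype=object)

# Both arrays store a reference to the same np.nan float object.
same_object = a[2] is b[2]
# The == operator on two floats follows IEEE 754: NaN is not equal to itself.
equal_by_value = a[2] == b[2]
# Python lists check identity before equality, so the same NaN object "matches".
lists_equal = [np.nan] == [np.nan]


def count_equal(x, y):
    """Count positions where x and y are equal by value (hypothetical helper).

    Applies the plain == operator to each pair, so NaN never counts as a match.
    """
    return sum(bool(p == q) for p, q in zip(x, y))


print(same_object, equal_by_value, lists_equal)
print(count_equal(a, b))  # the 1 and the 'x' match, the NaN does not
```

Comparing pair by pair in Python is slower than a vectorised comparison, but it makes the NaN handling explicit instead of depending on how a particular NumPy release compares object arrays.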