Python: numpy NaN compares as True in arrays with strings

I am trying to compare two numpy arrays that contain numbers, strings and NaNs. I want to know how many items in the arrays are equal.

When comparing these two arrays:

import numpy as np

c = np.array([1, np.nan])
d = np.array([2, np.nan])
print(c == d)
[False False]

This is the expected behaviour.

However, when comparing:

a = np.array([1, 'x', np.nan])
b = np.array([1, 'x', np.nan])
print(a == b)
[ True  True  True]

That makes no sense to me. How can adding a string to the array change the way NaNs are compared? Any ideas?

Thanks!

If you examine the arrays, you'll see that np.nan has been converted to a string ('n'):

In [48]: a = np.array([1, 'x', np.nan])

In [49]: a
Out[49]: 
array(['1', 'x', 'n'], 
      dtype='|S1')

And 'n' == 'n' is True.
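
A quick way to confirm the coercion is to look at the dtype and at the stored element directly. This is only a sketch: the exact string dtype depends on the Python and NumPy versions, and newer releases may keep the full text 'nan' instead of truncating it to 'n' as in the output above. Either way the third element is text, so NaN semantics never come into play.

```python
import numpy as np

# With no explicit dtype, mixing numbers and strings yields a string array.
a = np.array([1, 'x', np.nan])
b = np.array([1, 'x', np.nan])

print(a.dtype)       # a string dtype; the exact width depends on the version
print(repr(a[2]))    # the stored element is text, not a float
print(a[2] == b[2])  # ordinary string comparison
```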

What I don't understand is why changing the array's dtype to object doesn't change the result of the comparison:

In [72]: a = np.array([1, 'x', np.nan], dtype=object)

In [73]: b = np.array([1, 'x', np.nan], dtype=object)

In [74]: a == b
Out[74]: array([ True,  True,  True], dtype=bool)

In [75]: a[2] == b[2]
Out[75]: False

In [76]: type(a[2])
Out[76]: float

In [77]: type(b[2])
Out[77]: float

It's almost as if the two NaN objects are compared by reference rather than by value:

In [79]: id(a[2])
Out[79]: 26438340

In [80]: id(b[2])
Out[80]: 26438340
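
The identical ids are consistent with that guess: np.nan is a single module-level float, so both object arrays store a reference to the same object. The sketch below shows the same effect in plain Python, where containers check identity before equality, and then counts equal items with an explicit NaN test so the answer does not depend on how == treats object arrays (which may differ between NumPy versions). The helper count_equal is a hypothetical name introduced for illustration and assumes 1-D inputs of equal length.

```python
import math
import numpy as np

nan = float('nan')
print(nan == nan)      # False: NaN never equals itself by value
print(nan is nan)      # True: it is the same object
print([nan] == [nan])  # True: list comparison checks identity first

def count_equal(x, y):
    """Count positions holding equal values; NaN is never counted as equal."""
    def is_nan(v):
        return isinstance(v, float) and math.isnan(v)
    x = np.asarray(x, dtype=object)
    y = np.asarray(y, dtype=object)
    return sum(
        1 for p, q in zip(x, y)
        if not (is_nan(p) or is_nan(q)) and p == q
    )

a = np.array([1, 'x', np.nan], dtype=object)
b = np.array([1, 'x', np.nan], dtype=object)
print(a[2] is b[2])       # True: both arrays hold the same np.nan object
print(count_equal(a, b))  # 2
```

Passing dtype=object keeps the original Python values, so the numbers, strings and NaNs are each compared as what they are rather than as text.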
