Python numpy nan compares as True in arrays with strings

Question

I am trying to compare two numpy arrays which contain numbers, strings and NaNs. I want to know how many items in the arrays are equal.

When comparing these two arrays:

import numpy as np

c = np.array([1, np.nan])
d = np.array([2, np.nan])
print(c == d)
[False False]

This is the expected behaviour.
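For purely numeric arrays like these, counting equal items can be sketched as below. This is only an illustration, not part of the original post: the variable names are made up, and treating two NaNs in the same position as "equal" is an assumption about what the count should mean.

```python
import numpy as np

c = np.array([1, np.nan])
d = np.array([2, np.nan])

# Plain elementwise comparison: under IEEE 754, NaN never equals NaN.
plain_equal = c == d

# Assumed alternative: also count positions where both values are NaN.
nan_aware_equal = plain_equal | (np.isnan(c) & np.isnan(d))

n_plain = int(plain_equal.sum())
n_nan_aware = int(nan_aware_equal.sum())
print(n_plain, n_nan_aware)  # 0 1
```

Note that np.isnan only works on numeric dtypes, so this approach does not carry over directly to arrays that also contain strings.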

However, when comparing:

a = np.array([1, 'x', np.nan])
b = np.array([1, 'x', np.nan])
print(a == b)
[ True  True  True]

That makes no sense to me. How can adding a string to the array change the way NaNs are compared? Any ideas?

Thanks!

Answer 1

If you examine the arrays, you'll see that np.nan has been converted to a string. In the output below the dtype is |S1 (one-character strings), so it shows up as 'n':

In [48]: a = np.array([1, 'x', np.nan])

In [49]: a
Out[49]: 
array(['1', 'x', 'n'], 
      dtype='|S1')

And 'n' == 'n' is True.
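A quick way to check this on your own installation is to look at the dtype and at the stored element. This is a minimal sketch; the exact dtype and the stored text depend on the NumPy and Python versions (the |S1 output above comes from an old setup, and recent versions are expected to give a wider unicode dtype).

```python
import numpy as np

a = np.array([1, 'x', np.nan])

# Mixing numbers and strings makes NumPy pick a string dtype for the whole array.
print(a.dtype)
print(repr(a[2]))

# Kind 'S' is a byte string, kind 'U' is a unicode string.
is_string_dtype = a.dtype.kind in ('S', 'U')

# The last element is now text, not a float, so it compares like any string.
last_is_text = isinstance(a[2], (str, bytes))
print(is_string_dtype, last_is_text)  # True True
```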

What I don't understand is why changing the array's dtype to object doesn't change the result of the comparison:

In [72]: a = np.array([1, 'x', np.nan], dtype=object)

In [73]: b = np.array([1, 'x', np.nan], dtype=object)

In [74]: a == b
Out[74]: array([ True,  True,  True], dtype=bool)

In [75]: a[2] == b[2]
Out[75]: False

In [76]: type(a[2])
Out[76]: float

In [77]: type(b[2])
Out[77]: float

It's almost as if the two NaN objects are compared by reference rather than by value:

In [79]: id(a[2])
Out[79]: 26438340

In [80]: id(b[2])
Out[80]: 26438340
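The identity theory can be checked in plain Python, since np.nan is a single float object and both object arrays store a reference to it. The sketch below also counts equal items with an explicit Python loop. The loop is my own suggestion rather than anything from the transcript above; it avoids relying on how a == b treats identical objects, which may differ between NumPy versions.

```python
import numpy as np

a = np.array([1, 'x', np.nan], dtype=object)
b = np.array([1, 'x', np.nan], dtype=object)

# Both slots reference the one np.nan object, hence the matching id() values.
same_object = a[2] is b[2]

# Comparing the two floats by value follows IEEE 754: NaN != NaN.
value_equal = a[2] == b[2]

# Python containers check identity before equality, so a list holding
# the same NaN object compares equal to itself.
list_equal = [a[2]] == [b[2]]

# Count equal items by value, one pair at a time.
n_equal = sum(bool(x == y) for x, y in zip(a, b))

print(same_object, value_equal, list_equal, n_equal)  # True False True 2
```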
