
2D numpy array does not give an error when indexing with strings containing digits

When I create a one-dimensional array in numpy and index it with a string (containing digits), I get an error, as expected:

>>> import numpy as np
>>> a = np.arange(15)
>>> a['10']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: field named 10 not found.

However, when I create a two-dimensional array and index it with two strings, I get no error. The element is returned as if the strings had first been converted to integers:

>>> b = np.arange(15).reshape(3,5)
>>> b
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])
>>> b[1, 2]
7
>>> b['1', '2']
7

What's going on? Why don't I get an error in the two-dimensional case?

Disclaimer: this answer is bound to be incomplete.

I think what you're seeing is a consequence of fancy sequence indexing. Since strings are sequences, numpy takes the string one character at a time and converts each character to an "intp" object (presumably just via Python's int function), which then gives you your array index.

This also explains the 1D case:

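class Foo(object):
    def __getitem__(self, idx):
        # Print the key exactly as Python passes it in.
        print(idx)

a = Foo()
a[12]      # prints 12 (an int)
a[12, 12]  # prints (12, 12) (a tuple)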

Note that in the second case a tuple is passed, whereas in the first case an integer is passed.
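
The same trick shows what the string indices from the question look like when they reach `__getitem__`. This is a minimal sketch in plain Python; `KeyRecorder` is a hypothetical helper name, not part of numpy, and it only hands back the key rather than emulating numpy's lookup.

```python
class KeyRecorder(object):
    """Hypothetical helper: returns the key passed to __getitem__ unchanged."""

    def __getitem__(self, key):
        return key


rec = KeyRecorder()

one_d_key = rec['10']       # a single str, as in the 1D example
two_d_key = rec['1', '2']   # a tuple of two str objects, as in the 2D example

print(type(one_d_key).__name__, repr(one_d_key))
print(type(two_d_key).__name__, repr(two_d_key))
```

So the 1D example hands numpy a bare string, while the 2D example hands it a tuple of strings, and the two kinds of key can take different code paths.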


The part of this that I still don't understand is demonstrated by this test:

import numpy as np
a = np.arange(156).reshape(13, 12)
print(a[12, 3] == a['12', 3])            # True -- I would have thought False for this one...
print(a['12', 3] == a[('1', '2'), 3])    # False -- I would have guessed True for this...
assert a[tuple('12'), 3] == a[(1, 2), 3]  # This passes, as expected

Feel free to try to explain this one to me in the comments. :) The discrepancy might be that numpy deliberately leaves strings alone when converting to a sequence of intp objects, in order to handle record arrays more smoothly...

Just to add: the first case (a single string) probably has to do with support for recarrays, which use strings as field names.
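
For comparison, this is the kind of string indexing that field names are meant for. A minimal sketch with a structured array; the field names `x` and `y` and the values are made up for illustration.

```python
import numpy as np

# A structured array with two named fields (names chosen for illustration).
points = np.zeros(3, dtype=[('x', np.int64), ('y', np.float64)])
points['x'] = [1, 2, 3]

# A string index selects a field by name and returns one value per record.
xs = points['x']
print(xs)

# A plain integer array has no named fields at all.
plain = np.arange(15)
print(plain.dtype.names)
```

On the plain array there is no field for a string such as '10' to match, which fits the "field named 10 not found" error in the question.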

Please do not rely on the second case. Numpy is extremely liberal about indexing with non-arrays: if the index is a non-array (and not a slice and not None), numpy simply tries to convert it into an integer array, which is well defined for these strings. However, this is not by design. It is because too much software relies on this behaviour (at least partially) for it to actually be changed. Quite honestly, while it makes some sense for floats that someone forgot to cast, it really doesn't for strings.
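
If the indices arrive as strings (for example from user input), converting them explicitly avoids depending on this behaviour at all. A minimal sketch; `to_index` is a hypothetical helper name, not a numpy function.

```python
import numpy as np


def to_index(value):
    """Hypothetical helper: turn a digit string such as '12' into the int 12."""
    return int(value)


b = np.arange(15).reshape(3, 5)

row, col = to_index('1'), to_index('2')
element = b[row, col]   # same as b[1, 2]
print(element)
```

Python's int raises ValueError for a string such as 'abc', so a bad index fails at the conversion step instead of being interpreted by numpy.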


Some more details for @mgilson. Considering that all of this is off-label usage, it really boils down to implementation details. For example, a single string is currently special-cased for recarrays even if the array is not a recarray, but a tuple of strings is special-cased only for recarrays.

Now, a list of strings is somewhat special-cased, since lists are not tuples but act like one most of the time. This may be a small bug... Because numpy finds a sequence inside the index, it triggers fancy indexing but "forgets" to convert it to an array. I would generally use tuples to denote multiple axes, though.


 