How come regex match objects aren't iterable even though they implement __getitem__?
As you may know, implementing a __getitem__ method makes a class iterable:
class IterableDemo:
    def __getitem__(self, index):
        if index > 3:
            raise IndexError
        return index

demo = IterableDemo()
print(demo[2])  # 2
print(list(demo))  # [0, 1, 2, 3]
print(hasattr(demo, '__iter__'))  # False
However, this doesn't hold true for regex match objects:
>>> import re
>>> match = re.match('(ab)c', 'abc')
>>> match[0]
'abc'
>>> match[1]
'ab'
>>> list(match)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: '_sre.SRE_Match' object is not iterable
It's worth noting that this exception isn't thrown in the __iter__ method, because that method isn't even implemented:
>>> hasattr(match, '__iter__')
False
So, how is it possible to implement __getitem__ without making the class iterable?
There are lies, damned lies, and then there is the Python documentation.
Having __getitem__ for a class implemented in C is not enough for it to be iterable. That is because there are actually two places in the PyTypeObject that __getitem__ can be mapped to: tp_as_sequence and tp_as_mapping. Both have a slot for __getitem__ ([1], [2]).
Looking at the source of SRE_Match, tp_as_sequence is initialized to NULL, whereas tp_as_mapping is defined.
The iter() built-in function, if called with one argument, calls PyObject_GetIter, which has the following code:
f = t->tp_iter;
if (f == NULL) {
    if (PySequence_Check(o))
        return PySeqIter_New(o);
    return type_error("'%.200s' object is not iterable", o);
}
It first checks the tp_iter slot (obviously NULL for _SRE_Match objects). Failing that, if PySequence_Check returns true, it returns a new sequence iterator; otherwise a TypeError is raised.
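To make the fallback concrete, here is a rough pure-Python model of what the sequence iterator does once it has been created: it calls __getitem__ with 0, 1, 2, ... until an IndexError is raised. The name legacy_iter is made up for illustration. This is a sketch of the behaviour, not CPython's actual implementation.

```python
import re


def legacy_iter(obj):
    """Hypothetical model of the sequence iterator: index from 0 until IndexError."""
    index = 0
    while True:
        try:
            item = obj[index]
        except IndexError:
            return  # end of the "sequence"
        yield item
        index += 1


class IterableDemo:
    def __getitem__(self, index):
        if index > 3:
            raise IndexError
        return index


print(list(legacy_iter(IterableDemo())))  # [0, 1, 2, 3]

# A match object supports the same indexing and raises IndexError for a
# missing group, so driving the protocol by hand works on it too.
match = re.match(r'(ab)c', 'abc')
print(list(legacy_iter(match)))  # ['abc', 'ab']
```

So the indexing protocol itself works on a match object. What is missing is only the check that lets iter() decide to use it.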
PySequence_Check first checks if the object is a dict or a dict subclass, and returns false in that case. Otherwise it returns the value of
s->ob_type->tp_as_sequence &&
s->ob_type->tp_as_sequence->sq_item != NULL;
and since s->ob_type->tp_as_sequence was NULL for a _SRE_Match instance, 0 will be returned, and PyObject_GetIter raises TypeError: '_sre.SRE_Match' object is not iterable.
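If you do need to loop over every group of a match, a small workaround is to index it explicitly. The sketch below assumes a Python version where match objects support indexing (as in the question). It uses match.re.groups, the number of capturing groups in the compiled pattern. The exact wording of the TypeError message differs between Python versions, so the code only checks the exception type.

```python
import re

match = re.match(r'(ab)c', 'abc')

# iter() fails because neither tp_iter nor the sequence slot is available.
try:
    iter(match)
except TypeError:
    is_iterable = False
else:
    is_iterable = True

# Group 0 is the whole match; groups 1..N are the capturing groups.
all_groups = [match[i] for i in range(match.re.groups + 1)]

# The same result, built from the documented group accessors.
same_groups = [match.group(0), *match.groups()]

print(is_iterable)
print(all_groups)   # ['abc', 'ab']
print(same_groups)  # ['abc', 'ab']
```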