
Formal syntax of Python's extended slice notation?

Numpy, for example, allows multi-dimensional slices:

a[:, 0, 7:9]

This raises the question: what else is possible? (Imagine the possibilities!)

According to this answer and some experimentation (see below), if there is a comma, Python builds a tuple of objects, some of which may be slice objects, and passes it (as key) to __getitem__(self, key) of a.

The documentation for __getitem__(..) doesn't specify this behaviour. Is there any official documentation that I missed? In particular, how backwards-compatible is this syntax? (Searching the web for "python extended slice notation" gives "What's new in Python 2.3", which doesn't mention it.)


Experimentation

>>> class Test(object):
...     def __getitem__(self, x):
...         print repr(x)


>>> t = Test()

First, things that Python finds recognisable for multi-slicing:

>>> t[1]
1

>>> t['a':,]
(slice('a', None, None),)

>>> t['a':7:('b','c'),]
(slice('a', 7, ('b', 'c')),)

# Seems like it can be arbitrary objects?
>>> t[(t,t):[4,5]]
slice((<__main__.Test object at 0x07D04950>, <__main__.Test object at 0x07D04950>), [4, 5], None)

>>> t[::]
slice(None, None, None)

>>> t[:]
slice(None, None, None)

>>> t[::,1,::,::,:,:,:]
(slice(None, None, None), 1, slice(None, None, None), slice(None, None, None), slice(None, None, None),  slice(None, None, None), slice(None, None, None))

>>> t[...]
Ellipsis

>>> t[... , ...]
(Ellipsis, Ellipsis)

>>> t[  .   .      .    ]
Ellipsis

Some things that are NOT allowed (SyntaxError):

# Semicolon delimiter
t['a':5; 'b':7:-7]
# Slice within a slice
t['a':7:(9:5),]
# Two trailing commas
t[5,,]
# Isolated comma
t[,]
# Leading comma
t[,5]
# Empty string
t[]
# Triple colon
t[:::]
# Ellipses as part of a slice
t[1:...]
t[1:2:...]
# Ellipses inside no-op parens:
t[(...)]
# Any non-zero and non-three number of dots:
t[.]
t[..]
t[ .  .  .  . ]

Anything is possible, as long as it is a valid Python expression. The object produced by the expression between [...] is passed to the __getitem__ method. That's it.

Commas produce a tuple; colons (:) in a subscript produce a slice() object. Beyond that, use whatever you want.
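
As a minimal sketch of what that means for an implementer, a __getitem__ method only has to tell apart tuples, slice objects, Ellipsis and everything else. The names describe_key and Probe below are hypothetical, introduced only for this illustration:

```python
def describe_key(key):
    """Return a readable description of what __getitem__ received."""
    if isinstance(key, tuple):
        # Commas in the subscript: each item is described in turn.
        return "tuple(" + ", ".join(describe_key(k) for k in key) + ")"
    if isinstance(key, slice):
        # A colon in the subscript: start, stop and step may be any objects.
        return "slice({!r}, {!r}, {!r})".format(key.start, key.stop, key.step)
    if key is Ellipsis:
        return "Ellipsis"
    return "object {!r}".format(key)


class Probe(object):
    def __getitem__(self, key):
        return describe_key(key)


p = Probe()
print(p[1])           # object 1
print(p[:, 0, 7:9])   # tuple(slice(None, None, None), object 0, slice(7, 9, None))
print(p[..., 'a':])   # tuple(Ellipsis, slice('a', None, None))
```

The dispatch order matters only in that the tuple case recurses into the other three; nothing in the language forces a container to support all of them.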

That's because the grammar allows for any expression_list in the notation. See the reference documentation:

 subscription ::= primary "[" expression_list "]" 

Slicing is further specified in the Slicings section:

 slicing      ::= primary "[" slice_list "]"
 slice_list   ::= slice_item ("," slice_item)* [","]
 slice_item   ::= expression | proper_slice
 proper_slice ::= [lower_bound] ":" [upper_bound] [ ":" [stride] ]
 lower_bound  ::= expression
 upper_bound  ::= expression
 stride       ::= expression

So again arbitrary expressions are allowed, and : triggers the proper_slice grammar rule.

Note that the lower_bound, upper_bound and stride expression results are used to construct a slice() object. The slice object itself stores whatever values it is given, which is why your examples with strings, tuples and lists as bounds work with your own class. It is the receiving container that decides what to accept: built-in sequences such as list require integers (or None, or objects with an __index__ method) and raise a TypeError for anything else. That's not the same thing as a syntax error. In Python 3, t[1:...] is syntactically just fine and passes slice(1, Ellipsis, None) to __getitem__; in Python 2 it is a syntax error only because ... is not an expression there.
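
To make the difference between a syntax error and a runtime TypeError concrete, here is a small sketch, assuming Python 3 (where ... is an ordinary expression). Probe is a hypothetical class used only for this illustration:

```python
class Probe(object):
    def __getitem__(self, key):
        return key


# The subscript parses without complaint: no SyntaxError is raised here.
compile("t[1:...]", "<example>", "eval")

# A user-defined __getitem__ simply receives the slice object.
received = Probe()[1:...]
print(received)   # slice(1, Ellipsis, None)

# A built-in list rejects the same key, but only when the code runs.
items = [10, 20, 30]
error = None
try:
    items[1:...]
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```

The list refuses the key because its own __getitem__ insists on integer-like bounds; the parser has no say in the matter.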

Your actual syntax errors all stem from invalid expressions. Apart from the : proper_slice notation, if you can't put the part between [...] on the right-hand side of an assignment, you can't use it in a slice either.

For example, ; can only be used to put multiple simple statements on a single logical line. Statements can contain expressions, but expressions can never contain statements, which rules out ; inside an expression. Likewise, (9:5) is not a valid expression: the parenth_form rule has no alternative that accepts a bare : between parentheses.

The Python 2 grammar for slicings is a little more elaborate: ... is a specific notation in the slicing grammar there, and you can't use ... outside of slicings. In Python 3 you can use ... anywhere an expression is valid, which is why t[(...)] is a syntax error in Python 2 but not in Python 3.
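
One way to check which subscripts your interpreter accepts, without typing each into the REPL, is to hand them to the built-in compile(). This is only a sketch; is_valid_subscript is a hypothetical helper name, and the results noted in the comments assume Python 3:

```python
def is_valid_subscript(source):
    """Return True if `source` compiles as an expression on this interpreter."""
    try:
        compile(source, "<subscript>", "eval")
    except SyntaxError:
        return False
    return True


candidates = [
    # Accepted by Python 3:
    "t['a':7:('b','c'),]",
    "t[::,1,::]",
    "t[(...)]",
    "t[1:...]",
    # Rejected with SyntaxError:
    "t['a':5; 'b':7:-7]",
    "t['a':7:(9:5),]",
    "t[5,,]",
    "t[,]",
    "t[]",
    "t[:::]",
]

for source in candidates:
    verdict = "ok" if is_valid_subscript(source) else "SyntaxError"
    print("{:<24} {}".format(source, verdict))
```

Because compile() only parses, t does not need to exist; nothing is ever subscripted.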

To add to the earlier answer: if you define

class Foo:
    def __getitem__(self, key):
        return key

and evaluate Foo()[0, :, 1:2, 1:2:3], you get back the key exactly as Python built it:

>>> Foo()[0, :, 1:2, 1:2:3]
(0, slice(None, None, None), slice(1, 2, None), slice(1, 2, 3))
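
Going one step further, a container can turn such a key into actual indices with slice.indices(). The following is a rough sketch of a two-dimensional container; Grid and its _select helper are hypothetical names, and unlike Numpy this version always returns a list of lists rather than dropping dimensions for integer indices:

```python
class Grid(object):
    """Hypothetical 2-D container backed by a list of rows."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __getitem__(self, key):
        # Two comma-separated items arrive as a 2-tuple.
        if not (isinstance(key, tuple) and len(key) == 2):
            raise TypeError("expected grid[row, col]")
        row_key, col_key = key
        picked_rows = self._select(self.rows, row_key)
        return [self._select(row, col_key) for row in picked_rows]

    @staticmethod
    def _select(seq, key):
        """Return the selected items as a list, for a slice or a single index."""
        if isinstance(key, slice):
            # slice.indices() resolves None and clamps bounds to len(seq).
            return [seq[i] for i in range(*key.indices(len(seq)))]
        return [seq[key]]


g = Grid([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(g[:, 0])      # [[1], [4], [7]]
print(g[1:, ::2])   # [[4, 6], [7, 9]]
```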
