
Why isn't this a syntax error in Python?

Noticed a line in our codebase today that I thought would surely have failed the build with a syntax error, but the tests were passing, so apparently it is valid Python (in both 2.x and 3).

Whitespace is sometimes not required in the conditional expression:

>>> 1if True else 0
1

It doesn't work if the left-hand side is a variable:

>>> x = 1
>>> xif True else 0
  File "<stdin>", line 1
    xif True else 0
           ^
SyntaxError: invalid syntax

But it does still seem to work with other kinds of literals:

>>> {'hello'}if False else 'potato'
'potato'

What's going on here? Is it intentionally part of the grammar for some reason? Is this odd quirk a known/documented behaviour?

Whitespace between tokens

Except at the beginning of a logical line or in string literals, the whitespace characters space, tab and formfeed can be used interchangeably to separate tokens. Whitespace is needed between two tokens only if their concatenation could otherwise be interpreted as a different token (e.g., ab is one token, but a b is two tokens).

So in this case, 1if is not a valid token, so the whitespace is optional. The 1 is read as an integer literal, and if cannot be part of it, so if is tokenized separately and recognized as a keyword.

In xif, however, the whole string is recognized as a single identifier, so Python has no way to see that you meant x if.

The Python lexer generates two tokens for the input 1if: the integer 1 and the keyword if, since no token that begins with a digit can contain the string if. xif, on the other hand, is recognized as a valid identifier; there is no reason to believe that it is an identifier followed by a keyword, so it is passed to the parser as a single token.
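One way to check this yourself is to ask the standard library's tokenize module how it splits each input. The sketch below is only illustrative: token_strings is a helper name made up for this example, and warnings are suppressed because newer interpreters may warn about a numeric literal immediately followed by a keyword.

```python
import io
import tokenize
import warnings


def token_strings(source):
    """Return (token type name, text) pairs for one line of source.

    Hypothetical helper for illustration; not part of any library.
    """
    # Drop the bookkeeping tokens so only the "real" ones remain.
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER}
    with warnings.catch_warnings():
        # Newer Pythons may warn about a number directly followed by a keyword.
        warnings.simplefilter("ignore")
        readline = io.StringIO(source + "\n").readline
        tokens = list(tokenize.generate_tokens(readline))
    return [(tokenize.tok_name[t.type], t.string)
            for t in tokens if t.type not in skip]


# "1if" splits into a NUMBER followed by a NAME (keywords are NAME tokens).
print(token_strings("1if True else 0"))
# "xif" stays together as one NAME token, so the parser never sees an "if".
print(token_strings("xif True else 0"))
```

The first call should start with a NUMBER token for 1 followed by a NAME token for if, while the second should start with a single NAME token for xif.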

With my limited knowledge of lexical analysis and tokenizing, I'd say what you're seeing is that any piece that can be lexed as something distinct from the if (i.e. numbers, set or dict displays, etc.) is split off as its own token. Most languages ignore whitespace between tokens, and I imagine Python does the same (excluding, of course, indentation levels). Once the tokens are generated, the grammar itself doesn't care; it most likely looks for an [EXPRESSION] [IF] [EXPRESSION] [ELSE] [EXPRESSION] grouping, which, again with your examples, would work fine.
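To see that the parser really does not care about the missing space once the tokens are split, you can compare the syntax trees with the ast module. This is a rough sketch using the set-literal example from the question; the variable names are chosen just for this illustration.

```python
import ast

# Same expression, with and without whitespace before "if".
compact = "{'hello'}if False else 'potato'"
spaced = "{'hello'} if False else 'potato'"

tree_compact = ast.parse(compact, mode="eval")
tree_spaced = ast.parse(spaced, mode="eval")

# Both parse to a conditional expression node (ast.IfExp).
print(type(tree_compact.body).__name__)

# ast.dump omits line/column positions by default, so the dumps match
# whenever the two sources have the same structure.
same_tree = ast.dump(tree_compact) == ast.dump(tree_spaced)
print(same_tree)

# With an identifier on the left, the lexer produces one name "xif",
# and the parser then rejects what follows it.
try:
    ast.parse("xif True else 0", mode="eval")
    xif_parses = True
except SyntaxError:
    xif_parses = False
print(xif_parses)
```

The two trees compare equal because whitespace never reaches the grammar; only the token sequence does.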
