Python parser/compiler vs. interpreter, and string concatenation at compile time vs. runtime?

Question

At this spot in this article by one of the major Python people, the author notes that automatic string concatenation is a feature of the parser/compiler as opposed to the interpreter, which is why you must use + to concatenate strings at runtime.

I don't understand anything about that. I know you can concatenate with +, I know two string literals side by side are auto-concatenated, and I know you of course can't do that with variables containing strings. But I have no idea what the difference is between a parser/compiler and an interpreter (for Python, or in general), and I have no idea how it ties in to this whole string concatenation thing.

Explanation???

Answer 1

Python is an interpreted language (as opposed to languages like C++ that are compiled to machine code before execution).

Now there is an intermediate step: the source (text) files are compiled to bytecode, and that bytecode is then run by the Python interpreter.
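The two phases can be observed separately from within Python itself. This is a minimal sketch, assuming CPython 3; the source string and the namespace name `ns` are made up for illustration.

```python
import types

# Hypothetical one-line program, used only for illustration.
source = 'greeting = "a" "b"'

# Phase 1: the compiler turns source text into a code object (bytecode).
code = compile(source, "<example>", "exec")
print(isinstance(code, types.CodeType))

# Phase 2: the interpreter executes that bytecode in a namespace.
ns = {}
exec(code, ns)
print(ns["greeting"])
```

Nothing in the program runs during `compile()`; the assignment only happens once `exec()` hands the code object to the interpreter.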

Verbatim string concatenation (as in "a" "b" becoming "ab") is already done by the bytecode compiler. The same goes for "a" + "b", because the compiler can already figure out the literal values:

>>> import dis
>>> def s(): print "a" "b"
...
>>> dis.dis(s)
  1           0 LOAD_CONST               1 ('ab')
              3 PRINT_ITEM
              4 PRINT_NEWLINE
              5 LOAD_CONST               0 (None)
              8 RETURN_VALUE
>>> def s(): print "ab"
...
>>> dis.dis(s)
  1           0 LOAD_CONST               1 ('ab')
              3 PRINT_ITEM
              4 PRINT_NEWLINE
              5 LOAD_CONST               0 (None)
              8 RETURN_VALUE
>>> def s(): print "a"+"b"
...
>>> dis.dis(s)
  1           0 LOAD_CONST               3 ('ab')
              3 PRINT_ITEM
              4 PRINT_NEWLINE
              5 LOAD_CONST               0 (None)
              8 RETURN_VALUE

But for values that can't trivially be inferred at compile time, it's the interpreter's job to do the concatenation:

>>> def s(): print "a" + chr(98)
...
>>> dis.dis(s)
  1           0 LOAD_CONST               1 ('a')
              3 LOAD_GLOBAL              0 (chr)
              6 LOAD_CONST               2 (98)
              9 CALL_FUNCTION            1
             12 BINARY_ADD
             13 PRINT_ITEM
             14 PRINT_NEWLINE
             15 LOAD_CONST               0 (None)
             18 RETURN_VALUE
>>> s()
ab
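The session above is Python 2. On Python 3 the opcode names and offsets differ, but the same distinction can be checked by looking at the constants stored in each function's code object. This is a rough sketch, assuming CPython 3; the function names are invented for illustration, and the exact contents of `co_consts` vary between versions.

```python
def adjacent():
    # Adjacent literals: merged before any bytecode exists.
    return "a" "b"

def folded():
    # Literal + literal: CPython usually folds this at compile time,
    # but that is an optimization, not a language guarantee.
    return "a" + "b"

def runtime():
    # chr(98) is only known once the call runs, so the
    # concatenation is left to the interpreter.
    return "a" + chr(98)

for f in (adjacent, folded, runtime):
    print(f.__name__, f.__code__.co_consts, f())
```

For `adjacent`, the constant 'ab' is already stored in the code object. For `runtime`, only 'a' is stored, and 'ab' does not exist until the function is called.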

Answer 2

When Python code is translated into byte-code, side-by-side strings are merged. This is done only once: every time you run the script without deleting the precompiled pyc, the concatenation result will already be there. Even without the precompiled file, the concatenation result is placed in the byte-code, so each time this code (e.g. a function) is run there is no need to calculate the result of the concatenation again.

If you use + on the other hand, the byte-code will contain both strings, and the expression will be evaluated every time this code is run. EDIT: not always, as noted by Tim Pietzcker in his answer. However, in that case it's a matter of compiler optimization, not behaviour that the language semantics guarantee.

Note that because syntax is part of the language definition, the distinction between compiler and interpreter is irrelevant here.
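One way to see that this is a matter of syntax is to look at what the parser produces, before any bytecode is generated. This is a small sketch, assuming CPython 3.8 or later (where string literals appear as `ast.Constant` nodes); the variable names are invented for illustration.

```python
import ast

# Adjacent literals: the parser already yields a single constant.
adjacent = ast.parse('"a" "b"', mode="eval").body
print(type(adjacent).__name__, repr(adjacent.value))

# Literal + literal: the syntax tree still holds an addition;
# any folding happens later, as an optimization.
plus = ast.parse('"a" + "b"', mode="eval").body
print(type(plus).__name__)

# Adjacent names are not valid syntax at all, so there is
# nothing for the interpreter to concatenate.
rejected = False
try:
    compile("x y", "<example>", "eval")
except SyntaxError:
    rejected = True
print(rejected)
```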

Reference: lexical analysis in Python

Answer 3

A compiled language (e.g. C, C++) translates human-readable source code into machine-readable machine code.

An interpreted language (e.g. old Microsoft BASIC on 6502s) recomputes what a step needs to do each time that step is executed.

A middle ground exists. Languages like Python and Java compile, but they don't compile to machine code; instead they compile to the byte code of an idealised, software-only machine. This gives great portability and decent speed, especially if combined with a JIT (Java, PyPy, and CPython 2.5/2.6 with psyco all JIT-compile byte code).

Confusingly, Java people often say their language is compiled and that Python is not. There was also some discussion a while back of implementing a Java Runtime Environment in hardware, though I'm not sure it ever materialized.

Also, gcj compiles Java source code to native executables, and Cython does the analogous thing for Python (by way of C), among others. But Java and Python are both mostly byte-code interpreted.

3 answers

Answer 1: 6, accepted, 2013-12-26 21:43:09
Answer 2: 2, 2013-12-26 21:43:48
Answer 3: 0, 2013-12-26 23:13:07
