How python interns strings in interactive interpreter vs jupyter notebook

I am trying to understand when Python interns constants and when it doesn't. I'm using Python 3.8.5 for this question. I understand that since Python 3.7, constant folding is done by the AST optimizer rather than the peephole optimizer, and that longer strings are now interned.

I thought I had this all under control until I tried running the same commands, in the same conda environment and with the same version of Python, in a Jupyter notebook and in the interactive interpreter.

>>> sys.version
'3.8.5 (default, Sep  4 2020, 02:22:02) \n[Clang 10.0.0 ]'

>>> "AvocadoAvocadoAvocadoAvocadoAvocado !" is "AvocadoAvocadoAvocadoAvocadoAvocado !"
<stdin>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
False

And in a Jupyter notebook:

import sys
sys.version

gives

'3.8.5 (default, Sep  4 2020, 02:22:02) \n[Clang 10.0.0 ]'

"AvocadoAvocadoAvocadoAvocadoAvocado !" is "AvocadoAvocadoAvocadoAvocadoAvocado !"

gives

<>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
<>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
<ipython-input-6-2414f185945a>:1: SyntaxWarning: "is" with a literal. Did you mean "=="?
  "AvocadoAvocadoAvocadoAvocadoAvocado !" is "AvocadoAvocadoAvocadoAvocadoAvocado !"
True

I can't figure out why the result is False in the interpreter and True in the notebook. I also wonder why there are three warnings in the notebook and only one in the interpreter and whether that holds any clues to why the results are different.

Why do I get False in the interactive interpreter and True in the Jupyter notebook?

Many things in Python (and other languages) may appear to work while going against how they are defined to work. Object identity is one of them. The is operator is never meant to compare values; it tests whether two references point to the same underlying object. If two names refer to the same object, their values are equal too, but the reverse does not hold: two equal values need not be the same object. An identity check on equal literals will sometimes return True (as you have found) and it never raises an exception, but that outcome is not a defined feature of Python. Such behavior is "implementation dependent" and is never guaranteed to give correct or even stable results.
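To make the distinction between equality and identity concrete, here is a minimal sketch. The variable names are only illustrative, and the strings are built at runtime on purpose so that no compile-time sharing of constants is involved. The only identity result shown as certain is the one after sys.intern, because returning a single canonical object is what that function is documented to do.

```python
import sys

# Build two equal strings at runtime instead of writing two literals.
parts = ["Avocado"] * 5 + [" !"]
a = "".join(parts)
b = "".join(parts)

print(a == b)   # True: value equality is defined by the language
print(a is b)   # implementation dependent; do not rely on either result

# sys.intern returns one canonical object per distinct string value,
# so identity between interned copies is something you asked for explicitly.
c = sys.intern(a)
d = sys.intern(b)
print(c is d)   # True
```

If you need to compare values, use ==. If you really need identity for equal strings, interning them yourself is the only variant here that does not depend on how the code happened to be compiled.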

Apparently IPython does not submit chunks of code to CPython in the same way the built-in REPL does: https://github.com/satwikkansal/wtfpython/issues/100#issuecomment-549171287

I would assume this is to reduce the number of messages the front end has to send to the kernel when sending multiple lines of code. I would expect executing a .py file from the command line to match the results you get from IPython more closely in this regard.
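One way to see that the way code is handed to the compiler can matter is to compile the same literal either as one unit or as two separate units and compare identities. This is only a sketch under assumptions: the helper names are hypothetical, compile and exec stand in for whatever the REPL or IPython actually do internally, and the printed values are whatever your interpreter happens to produce. On CPython the first call often prints True and the second False for a string like this, but neither result is guaranteed.

```python
LITERAL = "AvocadoAvocadoAvocadoAvocadoAvocado !"

def same_object_in_one_unit(text):
    """Compile both assignments together, as when running a .py file."""
    source = f"a = {text!r}\nb = {text!r}\n"
    namespace = {}
    exec(compile(source, "<one-unit>", "exec"), namespace)
    return namespace["a"] is namespace["b"]

def same_object_in_two_units(text):
    """Compile each assignment on its own, as with separate REPL inputs."""
    namespace = {}
    exec(compile(f"a = {text!r}", "<unit-1>", "exec"), namespace)
    exec(compile(f"b = {text!r}", "<unit-2>", "exec"), namespace)
    return namespace["a"] is namespace["b"]

# Results are implementation dependent; they may differ between versions.
print(same_object_in_one_unit(LITERAL))
print(same_object_in_two_units(LITERAL))
```

Whatever the two calls print on your machine, the point is that the same source text can yield different identity results depending on how it is split up for compilation, which is why is should not be used on literals.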

Along these lines, it is sometimes possible to recover objects after deletion but before garbage collection, because CPython's implementation of the id function returns the memory address of the object, which can be used with ctypes to construct a new PyObject reference. This is very much a way to introduce bugs and instability into your code. If for some reason id were switched to a simple counter for each allocated item (perhaps to avoid leaking any information about the process memory space), this would immediately break.
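For illustration only, this is roughly what the id plus ctypes trick looks like. The sketch assumes CPython, where id is the object's address, and it deliberately keeps the object alive the whole time; doing the same with the address of an object that has already been freed is undefined behavior and can crash the interpreter.

```python
import ctypes

obj = ["Avocado", "!"]   # keep a live reference for the whole example
address = id(obj)        # CPython detail: the object's memory address

# Reinterpret the raw address as a Python object reference.
recovered = ctypes.cast(address, ctypes.py_object).value
print(recovered is obj)
```

On an implementation where id is not a memory address, the cast has nothing meaningful to point at, which is exactly the fragility described above.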
