
Python strings in RAM

How are string variables saved in the RAM?

For example: foo = 'abcǶ' and s = u'abc\ℙ'.

The strings contain Unicode characters, so are they encoded before being stored in memory, and if so, how?

I'm assuming you're asking about CPython, the standard Python implementation.

The Unicode string representation changed in Python 3.3, as described in PEP 393. Since then, each string uses the same number of bytes for every character, either 1, 2 or 4, and CPython picks the smallest width that can represent the widest character in that particular string. The specific encodings used are:

  • 1 byte per char: Latin-1
  • 2 bytes per char: UCS-2
  • 4 bytes per char: UCS-4
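You can observe this indirectly from Python itself. The sketch below is only an illustration and assumes CPython 3.3 or later: the helper name `bytes_per_char` is made up for this example, and it relies on `sys.getsizeof` reporting the string's header plus its character buffer, which is an implementation detail rather than a language guarantee.

```python
import sys


def bytes_per_char(s):
    """Estimate how many bytes CPython uses per character of s.

    Hypothetical helper for illustration. Doubling the string adds
    len(s) characters without changing the widest character, so the
    fixed header cancels out in the size difference.
    """
    if not s:
        raise ValueError("need a non-empty string")
    return (sys.getsizeof(s * 2) - sys.getsizeof(s)) // len(s)


# Expected on CPython 3.3+ (an implementation detail, not a guarantee):
print(bytes_per_char("abc"))            # 1: every character fits in Latin-1
print(bytes_per_char("abc\u01f6"))      # 2: 'Ƕ' (U+01F6) forces UCS-2 for the whole string
print(bytes_per_char("abc\U0001F600"))  # 4: a non-BMP character forces UCS-4
```

Note that one wide character widens the storage for every character in that string, which is why 'abcǶ' costs 2 bytes per character even though 'abc' alone would need only 1.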

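Before version 3.3, the Unicode string representation depended on how the interpreter was built: a "narrow" build stored 2-byte code units (UCS-2, with surrogate pairs for characters outside the Basic Multilingual Plane, so effectively UTF-16), while a "wide" build stored UCS-4, to the best of my understanding. See the above-mentioned PEP 393 and its references for more details.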
