
Python strings in RAM

How are string variables stored in RAM?

For example: foo = 'abcǶ' and s = u'abc\ℙ'.

The strings contain Unicode characters, so are they encoded before being stored in memory, and if so, how?

I'm assuming you're asking about CPython, the standard Python implementation.

The Unicode string representation was changed beginning with Python 3.3, as described in PEP 393. Since then, each string uses the same number of bytes for every character it contains: 1, 2 or 4, whichever is the smallest width that can represent all characters in that particular string. The specific encodings used are:

  • 1 byte per char: Latin-1 (all code points up to U+00FF)
  • 2 bytes per char: UCS-2 (all code points up to U+FFFF)
  • 4 bytes per char: UCS-4 (any code point above U+FFFF)
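
The width selection described above can be observed indirectly with sys.getsizeof. What follows is a minimal sketch, assuming CPython 3.3 or later. The helper bytes_per_char is a hypothetical name introduced for illustration, not part of any library. It infers the width by building two strings from the sample's widest character and comparing their sizes, so the fixed object header cancels out:

```python
import sys

def bytes_per_char(sample: str) -> int:
    """Estimate how many bytes per character CPython uses for `sample`.

    Assumption: CPython 3.3+ (PEP 393). Other implementations may
    store strings differently, so the result is only meaningful there.
    """
    widest = max(sample)          # the character with the highest code point
    short = widest * 10           # 10 characters of that width
    long = widest * 20            # 20 characters of that width
    # Both strings share the same header size, so the difference
    # is exactly 10 characters' worth of storage.
    return (sys.getsizeof(long) - sys.getsizeof(short)) // 10

for text in ("abc", "abc\u00e9", "abc\u01f6", "abc\u2119", "abc\U0001F600"):
    print(ascii(text), bytes_per_char(text))
```

On a CPython 3.3+ interpreter this should report 1 for the first two strings, 2 for the strings containing Ƕ (U+01F6) and ℙ (U+2119), and 4 for the string containing a character outside the Basic Multilingual Plane. Note that a single wide character widens the storage of the whole string, not just that character.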

Before version 3.3, the Unicode string representation depended on the system (more precisely, on how the interpreter was built), and to the best of my understanding was usually UTF-16 or UCS-2 on "narrow" builds and UCS-4 on "wide" builds. See the above-mentioned PEP 393 and its references for more details.
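
To check which situation applies to a given interpreter, sys.maxunicode is one real indicator: it was 0xFFFF on narrow builds and 0x10FFFF on wide builds, and it is always 0x10FFFF from Python 3.3 onward. A small sketch follows; describe_build is a hypothetical helper name, and the returned descriptions are my own wording rather than anything the interpreter reports:

```python
import sys

def describe_build() -> str:
    """Classify how this interpreter stores Unicode strings.

    Assumption: CPython. The labels below are illustrative only.
    """
    if sys.version_info >= (3, 3):
        # PEP 393: width is chosen per string, not per build.
        return "flexible (PEP 393): 1, 2 or 4 bytes per character, chosen per string"
    if sys.maxunicode == 0xFFFF:
        # Pre-3.3 narrow build: 2-byte code units.
        return "narrow build: 2-byte code units (UTF-16/UCS-2)"
    # Pre-3.3 wide build: 4-byte code units.
    return "wide build: 4-byte code units (UCS-4)"

print(describe_build())
```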


 