
Time complexity of Python string index access?

If I'm not mistaken, a Python string is stored as a sequence of Unicode scalars. However, Unicode scalars can combine to form grapheme clusters. Therefore, computing the memory offset start + scalarSize * n for string[n] would not give the answer you're looking for.

Does this mean that Python iterates linearly through the scalars to get to the one you are looking for? If you have

word = 'caf' + chr(0x65) + chr(0x301)  # café: 'e' followed by U+0301 COMBINING ACUTE ACCENT

Does Python store this as five scalars and check on each access whether any of them should be combined before moving on, or does it run a check upon insertion and store 'pure' scalars?

Edit: I was confusing Python with another language. Python's print() prints out grapheme clusters, but Python's str stores scalars exactly as you input them. So two combined scalars will print as one grapheme cluster, which may look identical to a cluster made of a single, different scalar. When you call string[0], you get the scalar as it was inserted into the string.

Python string indexing does not consider grapheme clusters. It works by Unicode code points. I don't think Python actually has anything built in for working with grapheme clusters.
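To make the code point behavior concrete, here is a small sketch using the word from the question. The variable names `word` and `precomposed` are only for illustration.

```python
# 'e' followed by U+0301 COMBINING ACUTE ACCENT: five code points in total.
word = "caf" + chr(0x65) + chr(0x301)

# The same visible text using the single precomposed code point U+00E9.
precomposed = "caf\u00e9"

print(len(word))             # 5
print(len(precomposed))      # 4

# Indexing returns individual code points, not grapheme clusters.
print(word[3] == "e")        # True
print(word[4] == "\u0301")   # True

# The two strings render alike but are not equal as code point sequences.
print(word == precomposed)   # False
```

Both strings display as "café", yet `word[3]` is a bare "e" and the accent sits at its own index.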

String indexing takes constant time, but if you want to retrieve the nth grapheme cluster, string indexing won't do that for you.
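As background on why constant time is possible: in CPython 3.3 and later (PEP 393), each string uses a fixed width of 1, 2, or 4 bytes per code point, chosen from the largest code point it contains, so s[n] is a plain offset computation. This is an implementation detail of CPython rather than a language guarantee. The sketch below only illustrates the effect on memory; the exact byte counts vary by version and platform, so none are shown.

```python
import sys

# Three strings of identical length whose widest code point differs.
ascii_text = "a" * 1000                    # fits in 1 byte per code point
bmp_text = "a" * 999 + "\u20ac"            # EURO SIGN needs 2 bytes per code point
astral_text = "a" * 999 + "\U0001F600"     # an emoji needs 4 bytes per code point

for label, text in [("ascii", ascii_text), ("bmp", bmp_text), ("astral", astral_text)]:
    # len() counts code points; getsizeof() reports the object's size in bytes.
    print(label, len(text), sys.getsizeof(text))

# The last element is reached by index directly in every case.
print(astral_text[999] == "\U0001F600")    # True
```

On CPython the reported sizes grow roughly in a 1 : 2 : 4 ratio while `len()` stays at 1000 for all three.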

(People sometimes suggest applying canonical composition to the string, but there are plenty of possible grapheme clusters that still take multiple code points after canonical composition.)
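A short sketch of that limitation, using the standard library's `unicodedata.normalize` with the NFC form (canonical composition). The sample strings are chosen only for illustration.

```python
import unicodedata

# 'e' + COMBINING ACUTE ACCENT has a precomposed form, so NFC merges it.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed))                # 2 1

# 'q' + COMBINING DIAERESIS has no precomposed code point, so NFC leaves it alone.
q_diaeresis = "q\u0308"
print(len(unicodedata.normalize("NFC", q_diaeresis)))  # 2

# A flag emoji is a pair of regional indicator symbols; NFC does not change that.
flag = "\U0001F1EB\U0001F1F7"
print(len(unicodedata.normalize("NFC", flag)))         # 2
```

In the last two cases a single user-perceived character still spans two indices, so indexing after normalization can still land in the middle of a cluster.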
