Suppose I want to get a specific character of a string in Python 2.7, for example:
a = 'abcdefg...' # a long string
print a[5]
When accessing a specific character of a string, for example the character at index 5, what is the performance? Is it constant time, O(1)? Or is it linear, O(n), either in the index being accessed (5 here) or in the length of the whole string (len(a) in this example)?
>>> import random, string, timeit
>>> long_string_1M = "".join(random.choice(string.printable) for _ in xrange(1000000))
>>> short_string = "hello"
>>> timeit.timeit(lambda:long_string_1M[50000])
0.1487280547441503
>>> timeit.timeit(lambda:short_string[4])
0.1368805315209798
>>> timeit.timeit(lambda:short_string[random.randint(0,4)])
1.7327393072888242
>>> timeit.timeit(lambda:long_string_1M[random.randint(50000,100000)])
1.779330312345877
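Looks like O(1) to me. The last two timings are larger only because they include the call to random.randint; they are nearly identical for the short and the long string.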
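A minimal sketch of a similar measurement that leaves the lambda and random.randint overhead out of the timed statement. The helper name time_index and the chosen lengths are illustrative assumptions, not part of the original answer; the code is written to run on both Python 2.7 and Python 3. If indexing is O(1), the timings should stay roughly flat across string lengths and positions.

```python
import timeit

# Setup template: builds a string of length n inside timeit's namespace.
SETUP = "s = 'x' * {n}"

def time_index(n, index, number=100000):
    """Return seconds taken by `number` evaluations of s[index],
    where s is a string of length n (hypothetical helper)."""
    stmt = "s[{i}]".format(i=index)
    return timeit.timeit(stmt, setup=SETUP.format(n=n), number=number)

for n in (10, 10**4, 10**6):
    first = time_index(n, 0)       # index at the start of the string
    last = time_index(n, n - 1)    # index at the very end
    print("len=%-8d first=%.4fs last=%.4fs" % (n, first, last))
```

Absolute numbers depend on the machine and interpreter, so only the trend across rows is meaningful.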
This works because a string occupies consecutive memory locations, so indexing into it is simply a matter of adding an offset to the start address; there is no seek (at least that is my understanding). If you know C/C++, it is something like *(pointer + offset).
(It has been a long time since I have done C, so that might be a little wrong.)
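The *(pointer + offset) idea can be imitated from Python with ctypes. This is only an illustration on a separately allocated C buffer, not a view into how the interpreter stores its own string objects; the names data, buf, base and char_at are assumptions made for the example.

```python
import ctypes

data = b"abcdefg"
# A contiguous, NUL-terminated C buffer holding a copy of the bytes.
buf = ctypes.create_string_buffer(data)
base = ctypes.addressof(buf)  # address of the first byte

def char_at(offset):
    """Read one byte at base + offset, in the spirit of *(pointer + offset)."""
    # Bounds check done by hand, since raw address arithmetic has none.
    assert 0 <= offset < len(data)
    return ctypes.c_char.from_address(base + offset).value

# The raw-offset read agrees with ordinary indexing via a one-byte slice.
print(char_at(5) == data[5:6])
```

The cost of char_at is one addition and one memory read, regardless of how long the buffer is or which offset is requested.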
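In addition to Joran's answer, I'd point you to the reference implementation, which confirms his answer that lookup is O(1). Note how the slicing routine below locates its starting character with a plain pointer offset, a->ob_sval + i: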
/* String slice a[i:j] consists of characters a[i] ... a[j-1] */
static PyObject *
string_slice(register PyStringObject *a, register Py_ssize_t i,
             register Py_ssize_t j)
     /* j -- may be negative! */
{
    if (i < 0)
        i = 0;
    if (j < 0)
        j = 0; /* Avoid signed/unsigned bug in next line */
    if (j > Py_SIZE(a))
        j = Py_SIZE(a);
    if (i == 0 && j == Py_SIZE(a) && PyString_CheckExact(a)) {
        /* It's the same as a */
        Py_INCREF(a);
        return (PyObject *)a;
    }
    if (j < i)
        j = i;
    return PyString_FromStringAndSize(a->ob_sval + i, j-i);
}
Why this should be your intuition
Python strings are immutable. Immutability enables common optimizations, such as storing the characters contiguously. Under the hood, indexing then only needs to compute an offset from the string's memory location in C (this is, of course, implementation-specific).
There are several places where the immutability of strings can be relied on (or be a source of vexation). In the Python author's words:
There are several advantages [to strings being immutable]. One is performance: knowing that a string is immutable means we can allocate space for it at creation time
So although, as far as I know, this behaviour is not guaranteed across implementations, it is awfully safe to assume.
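The immutability this reasoning relies on is easy to observe. A small sketch, with variable names chosen for illustration, that runs on both Python 2.7 and Python 3: assigning to an index fails, so a "changed" string is always a new object built from pieces of the old one.

```python
a = "abcdefg"

try:
    a[5] = "X"          # strings do not support item assignment
    mutated = True
except TypeError:
    mutated = False

# Build a new string instead; the original is left untouched.
b = a[:5] + "X" + a[6:]

print("%s %s %s" % (mutated, a, b))  # prints: False abcdefg abcdeXg
```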