What is the difference between dict[new_key] = [dict[key], new_value] and x = dict[key] + [new_value]?

Question

I was reading the Python Patterns essay on graph implementations, and came across the section that discusses the optimal implementation of finding the shortest path. The article is here .

I couldn't understand the difference between the two assignments to dist[next] ; the latter is quadratic in runtime, but the former is linear?

Linear runtime:

# Code by Eryk Kopczyński 
def find_shortest_path(graph, start, end):
    dist = {start: [start]}
    q = deque(start)
    while len(q):
        at = q.popleft()
        for next in graph[at]:
            if next not in dist:
                dist[next] = [dist[at], next]  # this line
                q.append(next)
    return dist.get(end)

Quadratic runtime:

# Code by Eryk Kopczyński 
def find_shortest_path(graph, start, end):
    dist = {start: [start]}
    q = deque(start)
    while len(q):
        at = q.popleft()
        for next in graph[at]:
            if next not in dist:
                dist[next] = dist[at] + [next]  # this line
                q.append(next)
    return dist.get(end)

Can someone please explain what the difference is between these two assignments?

Answer 1

[ ] bracket notation is used for both subscript de-reference and for constructing lists.

You asked about these expressions:

            dist[next] = [dist[at], next]

            dist[next] = dist[at] + [next]

Let's ignore the LHS left hand side of the assignment, as it is simply storing a dict mapping. Rather than dist[next] =... it could as easily have been x =... followed by dist[next] = x .

Similarly, in the RHS let's ignore the dist[at] . We easily could have set a temp var beforehand: y = dist[at] , and then use y in the expression.

Ok, with square bracket notation out of the way, let's focus on the algorithmic complexity of constructing [y, next] and y + [next] expressions, where y is a list.

We can compute the first expression in O(1) constant time. We simply allocate a short list and fill its first two elements with pointers to the y and to the next objects.

The second list expression, y + [next] , requires scanning each element of y and copying it into a new temporary result, then finally appending a next pointer to that result. This has O(n) linear cost, if y is of length n .

Explanation

Why was that extra work necessary? Well suppose an element of y is subsequently mutated, changed to a new value. The first expression is fine with that, it will reflect the new value, since the programmer asked to store a pointer in the list, a reference to what y was pointing at.

OTOH the second expression won't preserve that relationship, it is looking just at the values on either side of the + plus sign. Imagine that we store that "big list" result and then mutate y . There will be no effect on the big list, which is as it should be.

What is the difference between dict[new_key] = [dict[key], new_value] and x = dict[key] + [new_value]?

Question

1 answers

solution1
1 ACCPTED 2022-05-20 17:52:09

What is the difference between dict[new_key] = [dict[key], new_value] and x = dict[key] + [new_value]?

Question

1 answers

solution1 1 ACCPTED 2022-05-20 17:52:09

solution1
1 ACCPTED 2022-05-20 17:52:09