Confusion about variables in Python

I am trying to implement heapsort but I am getting unexpected results. I think this is due to something I don't understand about how Python handles variables (I am talking about side effects). Here's the code:

from math import *

def parent(i):
    return floor((i+1)/2)-1

def left(i):
    return 2*i+1

def right(i):
    return 2*i+2

def maxheapify(A, i):
    l = left(i)
    r = right(i)
    if l < len(A) and A[i] < A[l]:
        largest = l
    else:
        largest = i
    if r < len(A) and A[largest] < A[r]:
        largest = r
    if largest != i:
        temp = A[i]
        A[i] = A[largest]
        A[largest] = temp
        maxheapify(A, largest) 


def buildmaxheap(A):
    for i in range(int(floor(len(A)/2)), -1, -1):
        maxheapify(A, i)

def heapsort(A):
    n = len(A)
    buildmaxheap(A)
    for k in range(len(A), 0, -1):  
        temp = A[0]
        A[0] = A[k-1]
        A[k-1] = temp
        C = A[0:k-1]
        maxheapify(C, 0)
        A = C + A[k-1:n]
    print(A)

Now when I run

A = [2, 4, 1, 3, 7, 5, 9]
heapsort(A)
print(A)

I obtain two printed lines (one from inside the heapsort showing that the sorting worked and one from the last print):

[1, 2, 3, 4, 5, 7, 9]
[1, 7, 5, 3, 4, 2, 9]

Obviously, I'd like them both to be the same, which would mean that the sorting actually worked and A is sorted after calling heapsort(A).

So what I don't get is:

  1. If A is correctly sorted (at the point of the last line in heapsort(A)), why doesn't this change persist after leaving the function block?

  2. If this is due to some permanence of the variable A, why isn't the end result the original value of A, but the intermediate step in heapsort, which is the result of the maxheapify call?

At the start of the function, the list A inside the function is the same as the list outside of the function, and any modifications made to one will be reflected in the other (it's a mutable object).

When you assign to the name A inside the function, you bind that name to a new list object in place of the old one. This breaks the connection to the outside object.
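
As a minimal sketch of the two paragraphs above (the function names rebind and mutate are made up for illustration), compare a function that assigns to its parameter with one that modifies the list it was given:

```python
def rebind(items):
    # Assignment binds the local name to a brand-new list;
    # the caller's list object is not touched.
    items = sorted(items)

def mutate(items):
    # list.sort() reorders the existing list object in place,
    # so the caller sees the change.
    items.sort()

data = [3, 1, 2]
rebind(data)
after_rebind = list(data)   # snapshot: still [3, 1, 2]
mutate(data)
after_mutate = list(data)   # snapshot: now [1, 2, 3]
print(after_rebind, after_mutate)   # [3, 1, 2] [1, 2, 3]
```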

Instead of assigning a new list to A, you can assign to a slice of A and the original object will be modified in place instead.

A[:] = C + A[k-1:n]
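
This also seems to explain the second point in the question. Everything done to A before the first rebinding (building the heap and the first swap) happens on the caller's list, while every later step works on new lists. The sketch below repeats only those first steps; first_steps is a hypothetical helper, and maxheapify is a condensed version of the question's function.

```python
def maxheapify(A, i):
    # Sift A[i] down until the subtree rooted at i is a max-heap.
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < len(A) and A[i] < A[l] else i
    if r < len(A) and A[largest] < A[r]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        maxheapify(A, largest)

def first_steps(A):
    # Same in-place work as the question's heapsort up to the first rebinding.
    for i in range(len(A) // 2, -1, -1):   # buildmaxheap, in place
        maxheapify(A, i)
    A[0], A[-1] = A[-1], A[0]              # first swap (k == len(A)), in place
    # The question's code then rebinds A to a new list,
    # so the caller never sees any later step.

outer = [2, 4, 1, 3, 7, 5, 9]
first_steps(outer)
print(outer)   # [1, 7, 5, 3, 4, 2, 9], the second line printed in the question
```
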
A = C + A[k-1:n]

This is the line responsible for the behaviour you're seeing. The expression C + A[k-1:n] builds a new list, and assigning it to A only rebinds the local name; the list you passed in is left as it was at that point. If you want your changes to persist in the list you passed in, assign to a slice that covers all of A's elements instead:

A[:] = C + A[k-1:n]
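
For reference, here is a sketch of the question's code with only that one line changed (the swaps are written as tuple assignments and the math import is replaced by integer division, which should not change the behaviour):

```python
def maxheapify(A, i):
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < len(A) and A[i] < A[l] else i
    if r < len(A) and A[largest] < A[r]:
        largest = r
    if largest != i:
        A[i], A[largest] = A[largest], A[i]
        maxheapify(A, largest)

def buildmaxheap(A):
    for i in range(len(A) // 2, -1, -1):
        maxheapify(A, i)

def heapsort(A):
    n = len(A)
    buildmaxheap(A)
    for k in range(n, 0, -1):
        A[0], A[k-1] = A[k-1], A[0]   # move the current maximum to its final slot
        C = A[0:k-1]                  # copy of the remaining heap
        maxheapify(C, 0)
        A[:] = C + A[k-1:n]           # slice assignment: same list object, new contents

A = [2, 4, 1, 3, 7, 5, 9]
heapsort(A)
print(A)   # [1, 2, 3, 4, 5, 7, 9]
```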

The following implementation is a rewrite of your code that updates the list in place with a slice assignment (a[:k] = c). The commented-out line below it rebinds only the local name a, so if you use it in place of the slice assignment you must also return a at the end of the heap_sort function and rebind the value of a in your main function.

def main():
    a = [2, 4, 1, 3, 7, 5, 9]
    heap_sort(a)
    print(a)

def parent(i):
    return ((i + 1) >> 1) - 1

def left(i):
    return (i << 1) + 1

def right(i):
    return (i + 1) << 1

def max_heapify(a, i, n):
    l = left(i)
    r = right(i)
    largest = l if l < n and a[i] < a[l] else i
    if r < n and a[largest] < a[r]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, n)

def build_max_heap(a, n):
    for i in reversed(range(n + 2 >> 1)):
        max_heapify(a, i, n)

def heap_sort(a):
    n = len(a)
    build_max_heap(a, n)
    for k in reversed(range(n)):
        a[0], a[k] = a[k], a[0]
        c = a[:k]
        max_heapify(c, 0, k)
        a[:k] = c
        # the following would change "a" in this scope only
        # a = c + a[k:]
    # print(a)

if __name__ == '__main__':
    main()
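
Since max_heapify in this rewrite already takes the heap size n, the temporary copy c is arguably not needed at all. The following is a sketch of that variation, not part of the original answer; heap_sort_in_place is a made-up name. It never rebinds a, so the caller's list is sorted directly.

```python
def max_heapify(a, i, n):
    # Sift a[i] down, treating only the first n elements of a as the heap.
    l, r = 2 * i + 1, 2 * i + 2
    largest = l if l < n and a[i] < a[l] else i
    if r < n and a[largest] < a[r]:
        largest = r
    if largest != i:
        a[i], a[largest] = a[largest], a[i]
        max_heapify(a, largest, n)

def heap_sort_in_place(a):
    n = len(a)
    # Build the max-heap, starting from the last node that has a child.
    for i in reversed(range(n // 2)):
        max_heapify(a, i, n)
    # Move the maximum to the end, then shrink the heap by one element.
    for k in reversed(range(n)):
        a[0], a[k] = a[k], a[0]
        max_heapify(a, 0, k)

data = [2, 4, 1, 3, 7, 5, 9]
heap_sort_in_place(data)
print(data)   # [1, 2, 3, 4, 5, 7, 9]
```

Passing the heap size explicitly avoids both the extra allocations and any question of which list object a refers to.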
