python: dictionary and numpy.array issue

Question

I have a dictionary that maps strings to arrays of the same length. My goal is to create a new dictionary with the same keys, but with each array cut down to only the elements I need. I wrote a function to do this, but it returns a dictionary in which every key is associated with the same array (of the correct cut length), even though the print statements I added as a check show the correct association inside the loop. Here's the function:

def extract_years(dic,initial_year,final_year):

    dic_extr = {}
    l = numpy.size(dic[dic.keys()[0]])

    if final_year != 2013 : 
        a = numpy.zeros((final_year - initial_year)*251)
    elif final_year == 2013 :
        a = numpy.zeros(l - (initial_year-1998)*251)

    for i in range(0,len(dic)):
        #print i
        for k in range (0,numpy.size(a)):
            a[k] = dic[dic.keys()[i]][(initial_year-1998)*251 + k]          
            #print k

        dic_extr[dic.keys()[i]] = a
        print dic.keys()[i]
        print dic_extr[dic.keys()[i]]


    print dic_extr.keys()
    print dic_extr
    return dic_extr

As I said, print dic_extr[dic.keys()[i]] shows the correct results, while the final print dic_extr shows a dictionary with the same array associated with every key.
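The behaviour described above can be reproduced in a few lines. This is only a minimal sketch in Python 3 syntax: the names source, buffer and result, and the tiny arrays, are made up for illustration and are not taken from the original function.

```python
import numpy as np

# Hypothetical stand-in for the original dictionary of equal-length arrays.
source = {"x": np.array([1.0, 2.0, 3.0]), "y": np.array([4.0, 5.0, 6.0])}

buffer = np.zeros(2)  # created once, outside the loop, like `a` in the question
result = {}

for key in source:
    for k in range(buffer.size):
        buffer[k] = source[key][k + 1]  # overwrite the same array in place
    result[key] = buffer                # stores a reference, not a copy
    print(key, result[key])             # looks correct at this point

# Every key refers to the one shared array, so all entries show the
# values written during the last iteration.
print(result)
print(result["x"] is result["y"])
```

The print inside the loop looks right because it runs before the next iteration overwrites the shared array.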

Answer 1

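In Python, every name is a reference to an object, so assigning a to a dictionary entry stores a reference to the same array rather than a copy of it. You therefore have to create a new instance of a for each iteration of the outer for loop. You could do this, for example, by initializing the a array inside that loop, like this: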

def extract_years(dic,initial_year,final_year):

    dic_extr = {}
    l = numpy.size(dic[dic.keys()[0]])

    for i in range(0,len(dic)):

        if final_year != 2013 : 
            a = numpy.zeros((final_year - initial_year)*251)
        elif final_year == 2013 :
            a = numpy.zeros(l - (initial_year-1998)*251)

        for k in range (0,numpy.size(a)):
            a[k] = dic[dic.keys()[i]][(initial_year-1998)*251 + k]          
            #print k

        dic_extr[dic.keys()[i]] = a
        print dic.keys()[i]
        print dic_extr[dic.keys()[i]]


    print dic_extr.keys()
    print dic_extr
    return dic_extr

Perhaps this is not the most elegant solution, but I think that it should work.
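A more compact variant is possible with slicing. The following is a sketch under assumptions, not the answerer's own method: it uses Python 3 syntax, and base_year, last_year and days_per_year are hypothetical parameter names standing in for the constants 1998, 2013 and 251 from the question.

```python
import numpy as np

def extract_years(dic, initial_year, final_year,
                  base_year=1998, last_year=2013, days_per_year=251):
    """Return a new dict with each array cut to the requested years."""
    start = (initial_year - base_year) * days_per_year
    if final_year == last_year:
        stop = None  # take everything up to the end of the array
    else:
        stop = start + (final_year - initial_year) * days_per_year
    # .copy() gives every key its own independent array.
    return {key: np.asarray(values)[start:stop].copy()
            for key, values in dic.items()}

# Illustrative data: four "years" of 251 samples each.
data = {"x": np.arange(4 * 251, dtype=float),
        "y": np.arange(4 * 251, dtype=float) * 2}
cut = extract_years(data, 1999, 2000)
print(cut["x"].size, cut["x"][0], cut["y"][0])
```

Because each entry is built from its own slice and then copied, no array is shared between keys or with the input dictionary.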

Answer 2

I think you ran into the typical problem of mutability and the way Python binds names to objects:

numpy.zeros() creates a mutable array object, and a is a name that refers to it.
You then fill that array with values, but a still refers to the same single array object.
With dic_extr[dic.keys()[i]] = a you store a reference to that array in the dic_extr dictionary, not a copy of its contents.
In the next iteration you overwrite the contents of that same array in place.
With dic_extr[dic.keys()[i]] = a you again store a reference to the same array under the next key, not a new array.

In the end, every key refers to the same array. Easy example:

a = [1, 2, 3, 4, 5]
b = a
b[0] = 10
print(a) # prints [10, 2, 3, 4, 5]

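For a Python list, a[:] makes a (shallow) copy. For a NumPy array, however, a[:] returns a view that shares the same data, so use dic_extr[dic.keys()[i]] = a.copy() to actually store an independent copy of a.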
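The difference between slicing and copying a NumPy array can be checked directly. This is a small illustrative sketch; the names view and independent are made up for the example.

```python
import numpy as np

a = np.zeros(3)
view = a[:]             # basic slicing of a NumPy array returns a view
independent = a.copy()  # .copy() allocates a separate array

a[0] = 10.0             # modify the original in place

print(view[0])                           # 10.0, the view sees the change
print(independent[0])                    # 0.0, the copy is unaffected
print(np.shares_memory(a, view))         # True
print(np.shares_memory(a, independent))  # False
```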

Here is also a nice explanation of mutability in Python.

solution1
1 ACCEPTED 2017-08-30 10:39:27

solution2
0 2017-08-30 10:39:35