
Using dictionary keys in pandas dataframe columns

I wrote the following code in which I create a dictionary of pandas dataframes:

import pandas as pd
import numpy as np

classification = pd.read_csv('classification.csv')

thresholdRange = np.arange(0, 70, 0.5).tolist()

classificationDict = {}

for t in thresholdRange:
    classificationDict[t] = classification

for k, v in classificationDict.iteritems():
    v['Threshold'] = k

I want every dataframe in the dictionary to have a column called 'Threshold' whose value is that dataframe's dictionary key. However, with the code above I get the same value in all dataframes. What am I missing here? Perhaps I am complicating things for myself with this approach, but I'd greatly appreciate your help.

Sorry, I got your question wrong. Now this is the issue:

classification (a pandas dataframe, I suppose) is a mutable object, and putting the same mutable object into a list or a dict several times leads to behaviour that surprises Python beginners. No copy is made: every entry refers to the same object, so if you modify the object through one entry, the change shows up in all of them. Try this:

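a = [1]
b = [a, a]       # both entries refer to the same list object
b[0][0] = 2      # mutate that object through the first entry
print(b[1])      # [2]: the second entry shows the change too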

This is what happens to your dict: every key maps to the same dataframe. You have to add a different object under each key. A pandas dataframe has a .copy() method for this. Alternatively, I found this post for you, with (in essence) the same problem; there are further solutions there:
https://stackoverflow.com/a/2612815/6053327
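A minimal sketch of the .copy() approach, assuming Python 3. The small dataframe and its 'score' column are hypothetical stand-ins for the contents of classification.csv, which is not shown in the question:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_csv('classification.csv')
classification = pd.DataFrame({'score': [0.1, 0.5, 0.9]})

thresholdRange = np.arange(0, 70, 0.5).tolist()

classificationDict = {}
for t in thresholdRange:
    df = classification.copy()   # independent copy for this key
    df['Threshold'] = t          # only this copy gets the column
    classificationDict[t] = df
```

Because each key now holds its own dataframe, the second loop from the question is no longer needed, and the original classification is left without a 'Threshold' column.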

Of course you get the same value. You are doing the same assignment over and over again in

for k, v in classificationDict.iteritems():

because all of your v's are the identical object: you assigned the same dataframe to every key in the first for loop.
Did you try debugging this yourself and printing classification? I assume that it is only the first line?
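One way to check this while debugging, sketched for Python 3 (so .items() instead of .iteritems()). The three keys and the 'score' column are made up for illustration:

```python
import pandas as pd

# Hypothetical stand-in for the dataframe read from classification.csv
classification = pd.DataFrame({'score': [0.1, 0.5, 0.9]})

classificationDict = {}
for t in [0.0, 0.5, 1.0]:
    classificationDict[t] = classification   # same object under every key

# "is" compares object identity, not contents
all_same = all(v is classification for v in classificationDict.values())
print(all_same)  # True

for k, v in classificationDict.items():
    v['Threshold'] = k   # each pass overwrites the one shared column

# The original dataframe itself now carries the last key that was written
print(classification['Threshold'].tolist())
```

If all_same is True, every assignment in the loop lands in the same dataframe, so only the last key written survives.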
