How do I pass a reference to a pandas series into a dictionary?

Question

I need to assemble a dictionary of pandas Series, and I was wondering whether it would be faster to pass a reference to each Series instead of copying all the data into the dictionary. I have this code:

import pandas as pd

df = pd.read_csv('data.csv')
# named d rather than dict to avoid shadowing the built-in type
d = {
    'Start': df['Start']
}
print(d.get('Start'))

To check whether the data was being copied, I changed the DataFrame after building the dictionary:

d = {
    'Start': df['Start']
}
df['Start'] = df['End']
print(d.get('Start'))

but this didn't change the output at all, which suggests that the dictionary contains a copy of the Series. I think this would be slower than passing a reference, so is it possible to store just a reference to the Series as the value in the dict?

Answer 1

df['Start'] = df['End']

is not a reliable way to test this. pandas makes few, if any, guarantees about the underlying buffer that holds the data in a DataFrame; all of this relies on implementation details. The block manager tries to store data efficiently in like-typed blocks, which is possible when the dtypes are homogeneous, but with heterogeneous dtypes

df['Start'] = df['End']

could potentially rearrange the way the DataFrame is represented.
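As a side note, the dictionary itself never copies anything: a Python dict stores a reference to whatever object it is given. Whether that object shares data with the DataFrame is a separate, pandas-level question. A minimal sketch, using a made-up frame in place of the asker's data.csv (the names start and same_object are only for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the asker's data.csv
df = pd.DataFrame({"Start": [1, 2, 3], "End": [4, 5, 6]})

start = df["Start"]      # one Series object
d = {"Start": start}     # the dict holds a reference to that same object

# Identity check: the value in the dict is the very same Python object
same_object = d["Start"] is start
print(same_object)
```

So the cost of building the dict does not depend on the size of the Series; any copying that happens would come from the df["Start"] lookup itself, not from the dict.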

A more reliable way to test the copying behavior is to modify a single element without changing the type of the column. So assuming "Start" is all integers:

>>> import pandas as pd
>>> df = pd.DataFrame({"start":[1,2,3], "end":[4,5,6]})
>>> df
   start  end
0      1    4
1      2    5
2      3    6
>>> d = {'start':df['start']}
>>> df.loc[0, 'start'] = 99
>>> d
{'start': 0    99
1     2
2     3
Name: start, dtype: int64}

I'm not sure what guarantees pandas makes about df[column], but in my experience it always returns a view. However, it is a view of the underlying data in the block manager at that time. Mutating your DataFrame can easily change that underlying buffer.
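One way to inspect this rather than guess is to ask NumPy whether two arrays share memory, and to take an explicit copy when independence is required. The sketch below is hedged: the view results depend on the pandas version and settings (with Copy-on-Write enabled, which I believe is the default from pandas 3.0, a Series taken from a DataFrame behaves like a copy once the DataFrame is modified), so they are printed rather than assumed. The names s, snapshot, shares_before and shares_after are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"start": [1, 2, 3], "end": [4, 5, 6]})
s = df["start"]

# Does the Series currently share its buffer with the DataFrame's column?
# Version-dependent, so the result is reported, not assumed.
shares_before = np.shares_memory(s.to_numpy(), df["start"].to_numpy())

# An explicit copy is independent of the DataFrame regardless of version.
snapshot = df["start"].copy()

# Modify a single element without changing the column's dtype
df.loc[0, "start"] = 99

shares_after = np.shares_memory(s.to_numpy(), df["start"].to_numpy())
print(shares_before, shares_after)
print(snapshot.iloc[0])  # the explicit copy keeps the original value, 1
```

If the dictionary should keep following the DataFrame, it may be simpler to store the column name and look the column up when needed, since that does not rely on view semantics at all.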

solution1
0 2021-02-16 19:49:02
