Unexpected behavior with pandas assignment

Question

I am just playing with pandas, trying to modify the values of a column.

My initial dataframe is:

import pandas as pd

df = pd.DataFrame(
    dict(x=[1, 2, 3, 4, 5, 6, 7], y=[10, 11, 15, 14, 14, 25, 25])
)
df.index = list('abcdefg')

with output:

>>> df
   x   y
a  1  10
b  2  11
c  3  15
d  4  14
e  5  14
f  6  25
g  7  25

Suppose that I want to modify the first element of the x column. I do:

df.loc['a', 'x'] = 100

which outputs:

>>> df.loc['a', 'x'] = 100
>>> df
     x   y
a  100  10
b    2  11
c    3  15
d    4  14
e    5  14
f    6  25
g    7  25

What I can't understand is why the following:

>>> j = df['x']
>>> j['a'] = 200
>>> df
     x   y
a  200  10
b    2  11
c    3  15
d    4  14
e    5  14
f    6  25
g    7  25

also modifies the first element of the x column in df. Furthermore:

>>> df.loc['a', 'x'] is j['a']
False

which means that they don't point to the same object. What is going on?

Answer 1

You are not performing the correct test. You should rather test:

j is df['x']

output: True

j and df['x'] point to the same Series.

The False is explained by the underlying NumPy array, which does not contain Python objects. A new object is generated each time an element is accessed:

import numpy as np

a = np.array([1, 2, 3])
a[0] is a[0]

output: False
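To make the distinction between identity and value more concrete, here is a minimal sketch that extends the snippet above. The variable names first and second are only illustrative. It shows that two reads of the same element produce separate scalar objects that compare equal, and that a scalar already extracted does not follow later writes to the array.

```python
import numpy as np

a = np.array([1, 2, 3])

# Each indexing operation builds a fresh NumPy scalar from the raw buffer.
first = a[0]
second = a[0]

print(first is second)   # False: two distinct wrapper objects
print(first == second)   # True: same underlying value

# Writing through the array changes what later reads return,
# but a scalar extracted earlier keeps its old value.
a[0] = 100
print(a[0], first)       # 100 1
```

So `is` on extracted elements says nothing about whether two containers share data; comparing the containers themselves (j is df['x']) is the meaningful check.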

Answer 2

That is why we need copy here:

j = df['x'].copy()

Notice that after adding copy, the id is different:

id(df['x'])
Out[612]: 140536670316496
id(df['x'].copy())
Out[613]: 140536673228496
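Comparing id() values of temporary objects can be misleading, because CPython may reuse an id once an object has been freed. As a hedged alternative, the sketch below keeps both objects alive and checks identity and memory sharing directly. The names original and copied are illustrative, and np.shares_memory is used here as one possible way to check for overlapping buffers.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    dict(x=[1, 2, 3, 4, 5, 6, 7], y=[10, 11, 15, 14, 14, 25, 25])
)
df.index = list('abcdefg')

original = df['x']
copied = df['x'].copy()

# Both names stay alive, so the identity comparison is meaningful.
print(copied is original)                                        # False

# The copy owns its own buffer, separate from the original column.
print(np.shares_memory(original.to_numpy(), copied.to_numpy()))  # False

# The values themselves are still equal right after copying.
print(copied.equals(original))                                   # True
```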

Answer 3

Because j = df['x'] only assigns the object reference of df['x'] to j. Anything modified through j will affect the same object in memory as the one behind df['x'].

If you want to make a copy of the values behind df['x'], store it in j, and modify it independently of df, you need to use the .copy() method:

j = df['x'].copy()

Read more about it at https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.copy.html
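As a minimal sketch of the effect, using the dataframe from the question: once j is an explicit copy, assigning through j leaves df untouched.

```python
import pandas as pd

df = pd.DataFrame(
    dict(x=[1, 2, 3, 4, 5, 6, 7], y=[10, 11, 15, 14, 14, 25, 25])
)
df.index = list('abcdefg')

# j is an independent Series holding its own copy of the data.
j = df['x'].copy()
j['a'] = 200

print(j['a'])             # 200
print(df.loc['a', 'x'])   # 1, the original frame is unchanged
```

When the goal is to change df itself, writing through df.loc['a', 'x'] = ... as in the question keeps the intent explicit.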

3 answers

Answer 1
2 2022-07-31 17:02:12

Answer 2
0 2022-07-31 17:28:33

Answer 3
0 2022-07-31 17:51:15
