
Pandas copy is changing the original dataframe, even with copy(deep=True)

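I have the following code, which tries to create two separate tables from a single dataframe. Each table has a different filter applied. What I am finding is that once the first filter is applied, the original dataframe appears to be "altered".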

import pandas as pd

df_orig = pd.read_excel('JRMaster.xlsm')
df_orig.columns = map(str.upper, df_orig.columns)
df_orig['SYSTEM'] = df_orig['SYSTEM'].str.upper()
df_orig['STATUS'] = df_orig['STATUS'].str.upper()

df = df_orig.copy(deep=True)
df_copy_all = df_orig.copy(deep=True)

df = df[(df['DATE PAID'].dt.month.between(10,10)) & (df['DATE PAID'].dt.year == 2020)]
df2 = df_copy_all[(df_copy_all['DATE SENT'].dt.month.between(10,10)) & (df['DATE SENT'].dt.year == 2020)]

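df and df2 should give two different results, but the output is the same. I have tried both df.copy() and df.copy(deep=True).

Using Pandas 1.0.5 and Python 3.6.

Some forums state that this is a bug, but I wanted to check whether there is a workaround or a fix for it.

Another approach I thought of is reading the original Excel document into multiple dataframes, but that seems unsustainable and resource-heavy.

Edit:

Sample data is as follows: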

    System  DATE SENT   STATUS  DATE PAID
0   One     2020-10-01  OPEN    NaT
1   One     2020-10-01  OPEN    NaT
2   THREE   2020-10-01  SR      2020-10-07
3   One     2020-10-01  DUP     NaT
4   One     2020-10-01  OPEN    NaT
5   One     2020-10-01  OPEN    NaT
6   THREE   2020-10-01  OPEN    NaT
7   One     2020-10-01  DUP     NaT
8   THREE   2020-10-01  AR      2020-07-31
9   THREE   2020-10-01  OPEN    NaT
10  One     2020-10-01  AR      2020-08-21
11  One     2020-10-01  DUP     NaT
12  One     2020-10-01  OPEN    NaT
13  One     2020-10-01  DUP     NaT
14  One     2020-10-01  DUP     NaT
15  One     2020-10-01  DUP     NaT
16  One     2020-10-01  DUP     NaT
17  THREE   2020-10-01  OPEN    NaT
18  One     2020-10-01  OPEN    NaT
19  One     2020-10-01  OPEN    NaT

It looks like deepcopy doesn't work for pandas.

See "Deep copying in Pandas".
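For what it's worth, a quick check suggests that copy(deep=True) does give an independent frame for ordinary column data like the columns in this question. The known caveat, as far as I understand it, is narrower: Python objects stored inside object-dtype columns (lists, dicts) are not copied recursively. A minimal sketch, using a small made-up frame rather than the asker's Excel file:

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (not the real JRMaster.xlsm).
orig = pd.DataFrame({"STATUS": ["OPEN", "DUP", "AR"], "N": [1, 2, 3]})
dup = orig.copy(deep=True)

# Change a value in the copy, then filter the copy.
dup.loc[0, "N"] = 99
filtered = dup[dup["STATUS"] == "OPEN"]

# The original keeps all of its rows and its original values.
print(len(orig))            # 3
print(orig["N"].tolist())   # [1, 2, 3]
print(len(filtered))        # 1
```

If the original had really been altered by filtering the copy, the first two prints would show something else.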

The problem was actually a typo:

df2 = df_copy_all[(df_copy_all['DATE SENT'].dt.month.between(10,10)) & (df['DATE SENT'].dt.year == 2020)]

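should be

df2 = df_copy_all[(df_copy_all['DATE SENT'].dt.month.between(10,10)) & (df_copy_all['DATE SENT'].dt.year == 2020)]

The error was in the year condition: I had df['DATE SENT'], which refers to the frame already filtered on DATE PAID, where it should have been df_copy_all['DATE SENT']. (df2 cannot be used there, since it is not defined until the assignment completes.)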
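To see why the typo made the two results look identical, here is a small sketch. It assumes a four-row made-up frame in place of the Excel data and uses == 10 instead of between(10,10) for brevity. When the two halves of a mask come from frames with different indexes, pandas aligns them by index label, and labels missing from the filtered frame end up False, so only rows that survived the first filter can pass.

```python
import pandas as pd

# Hypothetical sample data; only row 0 was paid in October 2020.
df_orig = pd.DataFrame({
    "DATE SENT": pd.to_datetime(["2020-10-01"] * 4),
    "DATE PAID": pd.to_datetime(["2020-10-07", None, "2020-07-31", None]),
})
df = df_orig.copy(deep=True)
df_copy_all = df_orig.copy(deep=True)

# First filter: df now holds only row 0.
df = df[(df["DATE PAID"].dt.month == 10) & (df["DATE PAID"].dt.year == 2020)]

# Typo version: the year test comes from the already-filtered df.
mixed = (df_copy_all["DATE SENT"].dt.month == 10) & (df["DATE SENT"].dt.year == 2020)
df2_wrong = df_copy_all[mixed]

# Intended version: both tests come from df_copy_all.
both = (df_copy_all["DATE SENT"].dt.month == 10) & (df_copy_all["DATE SENT"].dt.year == 2020)
df2_right = df_copy_all[both]

print(list(df2_wrong.index))  # [0], the same rows as df
print(list(df2_right.index))  # [0, 1, 2, 3]
```

So the copies were independent all along; the second mask was simply restricted to the rows of the first result.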

