python pandas 0.16: SettingWithCopyWarning incorrectly reported

As per my other question: Python Anaconda: how to test if updated libraries are compatible with my existing code?

I curse the day I was forced to upgrade to pandas 0.16. One of the things I don't understand is why I get a chained assignment warning when I do something as banal as adding a new field to an existing dataframe and initialising it with 1:

mydataframe['x']=1

causes the following warning:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy mydataframe['x']=1

I understand there can be problems when assigning values to a copy of a dataframe, but here I am just adding a new field to a dataframe! How am I supposed to change my code (which worked perfectly in previous versions of pandas)?

Here's an attempt at an answer, or at least an attempt to reproduce the message. (Note that you may only get this message once and might need to start a new shell or do %reset in IPython to get it again.)

In [1]: %reset

Once deleted, variables cannot be recovered. Proceed (y/[n])? y

In [2]: import pandas as pd

In [3]: pd.__version__
Out[3]: '0.16.0'
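The "only once" behaviour appears to come from Python's default warning filter, which shows a given warning once per code location. As a minimal sketch (assuming only the standard `warnings` module; the helper name `add_column_to_slice` is made up for illustration), you can ask for every occurrence to be reported instead of restarting the shell. Whether pandas emits the warning at all depends on the pandas version, so the list of captured warnings may be empty on newer releases.

```python
import warnings

import pandas as pd


def add_column_to_slice():
    """Hypothetical helper: add a column to a slice of a fresh DataFrame."""
    df = pd.DataFrame({'x': [1, 2, 3], 'y': [77, 88, 99]})
    df2 = df[1:]   # a slice of df
    df2['z'] = 1   # this line is the one pandas 0.16 warns about
    return df2


with warnings.catch_warnings(record=True) as caught:
    # 'always' disables the once-per-location behaviour inside this block.
    warnings.simplefilter('always')
    first = add_column_to_slice()
    second = add_column_to_slice()

# Print the category and first line of each captured warning, if any.
for w in caught:
    print(w.category.__name__, '-', str(w.message).strip().split('\n')[0])
```

The same `warnings.simplefilter('always')` call can be used on its own at the top of a session if you only want the repeated reporting and not the capture.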

Here are 3 variations of setting a new column to '1'. The first two do not generate the warning, but the third one does. (Second one thanks to @Jeff's suggestion)

In [4]: df = pd.DataFrame({ 'x':[1,2,3], 'y':[77,88,99] })
   ...: df['z'] = 1

In [5]: df = pd.DataFrame({ 'x':[1,2,3], 'y':[77,88,99] })
   ...: df = df[1:]
   ...: df['z'] = 1

In [6]: df = pd.DataFrame({ 'x':[1,2,3], 'y':[77,88,99] })
   ...: df2 = df[1:]
   ...: df2['z'] = 1

-c:3: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

Perhaps others can correct me if I'm wrong, but I believe the warning here relates to df2 being a copy of a slice of df. However, that's not really an issue, as the resulting df and df2 are what I would have expected:

In [7]: df
Out[7]: 
   x   y
0  1  77
1  2  88
2  3  99

In [8]: df2
Out[8]: 
   x   y  z
1  2  88  1
2  3  99  1

I know this is going to be terrible to say, but when I get that message I just check whether the command did what I wanted and don't think too hard about the warning. But whether you get a warning message or not, checking that a command did what you expected is really something you need to do all the time in pandas (or MATLAB, or R, or SAS, or Stata, ...).
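That kind of check can be written down instead of eyeballed. The following is only a sketch using the same toy data as above: it repeats the third variation and then asserts the two things inspected by hand in the session, namely that the original frame did not gain the column and that the slice did. The assignment may or may not print the warning depending on the pandas version.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3], 'y': [77, 88, 99]})
df2 = df[1:]
df2['z'] = 1  # may emit SettingWithCopyWarning, depending on pandas version

# The original frame should be untouched...
assert 'z' not in df.columns
# ...and the slice should carry the new column with the expected values.
assert list(df2.columns) == ['x', 'y', 'z']
assert (df2['z'] == 1).all()
```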

This will not generate the warning:

df = pd.DataFrame({ 'x':[1,2,3], 'y':[77,88,99] })
df2 = df[1:].copy()
df2['z'] = 1
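For comparison, here is a short sketch of two ways to make the intent explicit. The first is the `.copy()` approach shown above; the second uses `DataFrame.assign`, which returns a new frame with the extra column rather than modifying the slice in place. The variable name `df3` is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3], 'y': [77, 88, 99]})

# Option 1: take an explicit copy of the slice, then add the column.
df2 = df[1:].copy()
df2['z'] = 1

# Option 2: assign() builds a new DataFrame that includes the new column.
df3 = df[1:].assign(z=1)

print(df2.equals(df3))
```

In both cases `df` itself is left without a 'z' column, which matches the output shown earlier.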
