
SettingWithCopyWarning even when using .loc[row_indexer,col_indexer] = value

This is one of the lines in my code where I get the SettingWithCopyWarning :

value1['Total Population']=value1['Total Population'].replace(to_replace='*', value=4)

Which I then changed to:

row_index= value1['Total Population']=='*'
value1.loc[row_index,'Total Population'] = 4

This still gives the same warning. How do I get rid of it?

I also get the same warning from a convert_objects(convert_numeric=True) call that I use. Is there any way to avoid that?

 value1['Total Population'] = value1['Total Population'].astype(str).convert_objects(convert_numeric=True)

This is the warning message that I get:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy 

If you use .loc[row, column] and still get the same warning, the data frame was probably derived from a slice or filter of another data frame. In that case, call .copy() when you create it.

Here is a step-by-step reproduction of the warning:

import pandas as pd

d = {'col1': [1, 2, 3, 4], 'col2': [3, 4, 5, 6]}
df = pd.DataFrame(data=d)
df
#   col1    col2
#0  1   3
#1  2   4
#2  3   5
#3  4   6

Creating a new column and updating its value:

df['new_column'] = None
df.loc[0, 'new_column'] = 100
df
#   col1    col2    new_column
#0  1   3   100
#1  2   4   None
#2  3   5   None
#3  4   6   None

No warning so far. However, let's create another data frame from the previous one:

new_df = df.loc[df.col1>2]
new_df
#   col1    col2    new_column
#2  3   5   None
#3  4   6   None

Now, using .loc , I will try to replace some values in the same manner:

new_df.loc[2, 'new_column'] = 100

However, I got this hateful warning again:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

SOLUTION

Calling .copy() when creating the new data frame resolves the warning:

new_df_copy = df.loc[df.col1>2].copy()
new_df_copy.loc[2, 'new_column'] = 100

Now, you won't receive any warnings!

If your data frame is created using a filter on top of another data frame, always use .copy() .
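As a quick sanity check of this advice, the following sketch reuses the example frame and shows that the copied frame is independent of the original: the write lands only in new_df_copy. The names original_value and copied_value are introduced here purely for illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [3, 4, 5, 6]})
df['new_column'] = None

# Filter and copy in one step so the result is an independent frame.
new_df_copy = df.loc[df.col1 > 2].copy()
new_df_copy.loc[2, 'new_column'] = 100

# The original frame keeps its empty value; only the copy was updated.
original_value = df.loc[2, 'new_column']
copied_value = new_df_copy.loc[2, 'new_column']
```

The trade-off is extra memory for the duplicated rows, which is usually acceptable for a filtered subset.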

Have you tried setting it directly?

value1.loc[value1['Total Population'] == '*', 'Total Population'] = 4
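A self-contained sketch of this approach, under some assumptions: the question does not show how value1 was created, so the raw frame, its Region column, and the filter below are hypothetical stand-ins. The sketch also uses pd.to_numeric for the numeric conversion, since convert_objects was deprecated and later removed from pandas.

```python
import pandas as pd

# Hypothetical raw data; object dtype mimics a text column that mixes numbers and '*'.
raw = pd.DataFrame(
    {'Region': ['A', 'B', 'C'], 'Total Population': ['120', '*', '45']},
    dtype=object,
)

# Take an explicit copy when deriving value1 from a filter on another frame.
value1 = raw[raw['Region'] != 'C'].copy()

# Replace the '*' placeholder directly through a boolean mask.
mask = value1['Total Population'] == '*'
value1.loc[mask, 'Total Population'] = 4

# Convert the cleaned column to a numeric dtype.
value1['Total Population'] = pd.to_numeric(value1['Total Population'])
```

Because value1 is an explicit copy, the placeholder in raw stays as it was.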

I came here because I wanted to conditionally set the value of a new column based on the value in another column.

What worked for me was numpy.where:

import numpy as np
import pandas as pd
...

df['Size'] = np.where((df.value > 10), "Greater than 10", df.value)

According to the NumPy docs, this is equivalent to:

[xv if c else yv
 for c, xv, yv in zip(condition, x, y)]

Which is a pretty nice use of zip...
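To make that equivalence concrete, here is a small sketch with made-up data. The value column and both labels are assumptions for illustration; both branches are strings here so that the resulting column holds a single type.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a numeric 'value' column.
df = pd.DataFrame({'value': [5, 12, 30]})

condition = df['value'] > 10
df['Size'] = np.where(condition, 'Greater than 10', '10 or less')

# The same result written as the list comprehension from the NumPy docs,
# with the scalar branches repeated for every row.
n = len(df)
expected = [xv if c else yv
            for c, xv, yv in zip(condition, ['Greater than 10'] * n, ['10 or less'] * n)]
```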

I have no idea how bad the data storage/memory implications are with this but it fixes it every time for your average dataframe:

def addCrazyColFunc(df):
    dfNew = df.copy()
    dfNew['newCol'] = 'crazy'
    return dfNew

Just like the message says: make a copy and you're good to go. If someone can fix the above without the copy, please comment. None of the .loc suggestions above work for this case.
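One possible alternative, offered as a sketch rather than a definitive fix: DataFrame.assign returns a new data frame with the extra column and leaves its input untouched. It still creates a new object, so it is not truly copy-free. The function name add_crazy_col and the sample data are made up for this example.

```python
import pandas as pd

def add_crazy_col(df):
    # assign() builds and returns a new DataFrame; the caller's frame is not modified.
    return df.assign(newCol='crazy')

source = pd.DataFrame({'a': [1, 2, 3]})
subset = source[source['a'] > 1]      # a filtered frame, as in the cases above
result = add_crazy_col(subset)
```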

Try adding the following before the line where the warning is raised (reindexing if necessary). It returns a new data frame, much like df.copy(), so there should be no warning.

 df = df.reset_index(drop=True) 

Got the solution:

I created a new DataFrame and stored the value of only the columns that I needed to work on, it gives me no errors now!

Strange, but worked.

Specifying it is a copy worked for me. I just added .copy() at the end of the statement

value1['Total Population'] = value1['Total Population'].replace(to_replace='*', value=4).copy()

This should fix your problem:

value1.loc[:, 'Total Population'] = value1.loc[:, 'Total Population'].replace(to_replace='*', value=4)

I was able to avoid the same warning message with syntax like this:

value1.loc[:, 'Total Population'].replace('*', 4)

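Note that this expression does not assign anything back to value1, which is why no warning is raised. Series.replace returns a new Series by default, so the result still has to be stored (ie value1['Total Population'] = ...) for the replacement to take effect.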
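To see what this expression actually does to value1, here is a minimal sketch with made-up data. By default, Series.replace returns a new Series, so the sketch compares the column in the frame with the returned result.

```python
import pandas as pd

# Hypothetical data; object dtype so the column can hold both text and numbers.
value1 = pd.DataFrame({'Total Population': ['120', '*', '45']}, dtype=object)

# replace() returns a new Series; value1 itself is left as it was.
replaced = value1.loc[:, 'Total Population'].replace('*', 4)

before = value1['Total Population'].tolist()
after = replaced.tolist()
```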
