
Why Pandas allows editing DataFrames but not Series objects

Suppose I have a dataset like the one below (the original image is not available):

When I try to overwrite a specific column (a Series object) with the following code, I get an error:

mask = bond["Actor"] == "Sean Connery"
bond[mask]["Actor"] = "Sir Sean Connery"

But the moment I move one level down and instead edit all the columns of those rows (the complete DataFrame), I succeed:

mask = bond["Actor"] == "Sean Connery"
bond[mask] = "Sir Sean Connery"

Why is that so? In the first case, I assumed the error arises because it is not logical to edit a copy. But the same should apply in the second case, since the second example should also return a copy of the original DataFrame.

The problem is chained indexing. You need loc to avoid it:

import pandas as pd

bond = pd.DataFrame({'Actor':list('abcaef'),
                   'A':list('efghij'),
                   'B':list('aaabbb')})

print (bond)
   A Actor  B
0  e     a  a
1  f     b  a
2  g     c  a
3  h     a  b
4  i     e  b
5  j     f  b

mask = bond["Actor"] == "a"
bond.loc[mask] = "AAA"
# ':' selects all columns; for columns it can be omitted, so this is equivalent:
#bond.loc[mask,:] = "AAA"

print (bond)
     A Actor    B
0  AAA   AAA  AAA
1    f     b    a
2    g     c    a
3  AAA   AAA  AAA
4    i     e    b
5    j     f    b

# set only the Actor column (starting again from the original bond)
bond.loc[mask, "Actor"] = "AAA"
print (bond)
   A Actor  B
0  e   AAA  a
1  f     b  a
2  g     c  a
3  h   AAA  b
4  i     e  b
5  j     f  b

Consider the following single column DataFrame:

df = pd.DataFrame({'Actor': ['Sean Connery', 'Sean Connery', 
                             'Sean Something', 'Sean Something Else']})

df
Out: 
                 Actor
0         Sean Connery
1         Sean Connery
2       Sean Something
3  Sean Something Else

And this is the mask that you want to use for slicing:

mask = df['Actor'] == 'Sean Connery'

Now, if I use df[mask]['Actor'] = 'Sir Sean Connery', this is what gets executed:

df.__getitem__(mask).__setitem__('Actor', 'Sir Sean Connery')
 __main__:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead 

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

And for this case it will not modify the original DataFrame:

df
Out: 
                 Actor
0         Sean Connery
1         Sean Connery
2       Sean Something
3  Sean Something Else

It did modify a DataFrame, though: the one returned by the __getitem__ method. Since that object was not assigned to anything, it is lost.
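To make the "lost" intermediate object visible, here is a minimal sketch that keeps it in a variable. The name tmp is only an illustration and is not part of the original question. Warnings are silenced because, depending on the pandas version, this assignment may emit SettingWithCopyWarning.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'Actor': ['Sean Connery', 'Sean Connery',
                             'Sean Something', 'Sean Something Else']})
mask = df['Actor'] == 'Sean Connery'

with warnings.catch_warnings():
    # Older pandas versions warn here; silence that for the demonstration.
    warnings.simplefilter('ignore')
    tmp = df[mask]                      # what df.__getitem__(mask) returns
    tmp['Actor'] = 'Sir Sean Connery'   # the __setitem__ call lands on tmp

print(tmp)  # the selected rows, now modified
print(df)   # the original frame, unchanged
```

In the chained form df[mask]['Actor'] = ..., the object called tmp here never gets a name, so the change disappears with it.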

In your second example (df[mask] = 'Sir Sean Connery'), by contrast, the code executed is:

df.__setitem__(mask, 'Sir Sean Connery')

Because of the mask, you might think it calls __getitem__ too, but it does not. It calls __setitem__ directly on the DataFrame and passes the mask to it. pandas guarantees that __setitem__ operates on the original object. With __getitem__, the result may be a copy or a view, and it is hard to know which.
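The difference in dispatch comes from Python itself, not from pandas. The following sketch uses a hypothetical Recorder class (a stand-in, not a pandas object) that only logs which special methods Python calls for each form of the assignment.

```python
class Recorder:
    """Hypothetical stand-in for a DataFrame; it only logs indexing calls."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def __getitem__(self, key):
        self.log.append(f"{self.name}.__getitem__({key!r})")
        # Return a new object, like the frame that df[mask] produces.
        return Recorder("temporary", self.log)

    def __setitem__(self, key, value):
        self.log.append(f"{self.name}.__setitem__({key!r}, {value!r})")


log = []
obj = Recorder("original", log)

obj["mask"]["Actor"] = "Sir Sean Connery"  # chained: get first, then set on the result
obj["mask"] = "Sir Sean Connery"           # direct: a single set on the original

for entry in log:
    print(entry)
```

The chained form logs a __getitem__ on the original followed by a __setitem__ on the temporary object, while the direct form logs a single __setitem__ on the original.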

Now you'll see that the original df is modified:

df
Out: 
                 Actor
0     Sir Sean Connery
1     Sir Sean Connery
2       Sean Something
3  Sean Something Else

There is one catch though. It worked because we only had one column. If we had another column, say 'Year', it would set the corresponding Year values to 'Sir Sean Connery' too. In order to avoid that, we use .loc as jezrael pointed out. It also calls the __setitem__ method and allows specifying which columns will change.

df = pd.DataFrame({'Actor': ['Sean Connery', 'Sean Connery', 
                             'Sean Something', 'Sean Something Else'],
                  'Year': [1990, 1990, 1990, 1990]})


df.loc.__setitem__((mask, 'Actor'), 'Sir Sean Connery')

df
Out: 
                 Actor  Year
0     Sir Sean Connery  1990
1     Sir Sean Connery  1990
2       Sean Something  1990
3  Sean Something Else  1990

As a result, the best practice for setting values based on a mask and column name(s) is to use .loc:

df.loc[mask, 'Actor'] = 'Sir Sean Connery'

This way you don't have to worry if you are operating on a copy.
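As a side-by-side check of the catch mentioned earlier, this sketch compares df[mask] = value with df.loc[mask, 'Actor'] = value on a two-column frame. The 'Film' column, its placeholder values, and the helper make_df are assumptions made for the illustration. A text column is used instead of 'Year' so that the whole-row assignment does not depend on how a given pandas version handles writing a string into an integer column.

```python
import pandas as pd


def make_df():
    # Hypothetical data: 'Film' holds placeholder strings, not real titles.
    return pd.DataFrame({
        'Actor': ['Sean Connery', 'Sean Connery',
                  'Sean Something', 'Sean Something Else'],
        'Film': ['Film A', 'Film B', 'Film C', 'Film D'],
    })


# Whole-row assignment: every column of the masked rows is overwritten.
whole_rows = make_df()
mask = whole_rows['Actor'] == 'Sean Connery'
whole_rows[mask] = 'Sir Sean Connery'

# Targeted assignment: only 'Actor' changes in the masked rows.
one_column = make_df()
mask = one_column['Actor'] == 'Sean Connery'
one_column.loc[mask, 'Actor'] = 'Sir Sean Connery'

print(whole_rows)
print(one_column)
```

In the first frame the 'Film' values of the matching rows are replaced as well. In the second frame they are left alone.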
