Updating values of pandas dataframe on condition

I'm trying to update a pandas DataFrame based on a logical condition, but the following statement fails with the warning shown below it:

df[df.b <= 0]['b'] = 0

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

How do I get this working?

Data:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.random.randn(4), 'b': np.random.randn(4)})

          a         b
0  1.462028 -1.337630
1  0.206193 -1.060710
2 -0.464847 -1.881426
3  0.290627  0.650805

I am learning pandas. In R, the syntax would be:

df[df$b <= 0]$b <- 0

Use .loc:

df.loc[df.b <= 0, 'b'] = 0

df[df.b <= 0]['b'] = 0 is chained indexing: two separate indexing operations, one after the other. Depending on the operation, pandas may return either a view on the original DataFrame or a copy, and it does not guarantee which.
With a boolean filter such as df[df.b <= 0] the result is a copy, so the assignment changes that temporary copy and the source DataFrame is left untouched. This is what the warning refers to:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

To avoid this, use .loc, which selects the rows and the column and assigns the value in a single operation on df itself.
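As a minimal sketch of the .loc approach, the snippet below uses fixed values (copied from the sample data in the question, rather than random ones) so the result is reproducible. The variable name mask is only illustrative.

```python
import pandas as pd

# Fixed values taken from the question's sample data so the result is reproducible.
df = pd.DataFrame({
    "a": [1.462028, 0.206193, -0.464847, 0.290627],
    "b": [-1.337630, -1.060710, -1.881426, 0.650805],
})

# Boolean mask of the rows to update (the name `mask` is just for illustration).
mask = df["b"] <= 0

# A single .loc call: the mask picks the rows, "b" picks the column,
# and the assignment is applied to df itself.
df.loc[mask, "b"] = 0

print(df)
```

Only column b changes in the rows where it was non-positive; column a keeps its original values.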

For more information, see the pandas documentation on DataFrame indexing.

Try this:

>>> df.ix[df['b']<=0] = 0
>>> df
      a         b
0  0.000000  0.000000
1  0.000000  0.000000
2  0.212535  0.491969
3  0.000000  0.000000

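Note: ix was deprecated in v0.20 and removed in pandas 1.0. Use loc or iloc instead. Also note that, with no column specified, the assignment above sets every column of the matching rows to 0, not just b.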
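A rough sketch of the loc equivalent, assuming a small hand-written frame (the values and the names rows and col are made up for illustration). It contrasts the row-wide assignment from this answer with a column-specific one.

```python
import pandas as pd

# Hypothetical sample values, chosen only to make the effect visible.
df = pd.DataFrame({"a": [1.5, 0.2, -0.4, 0.3], "b": [-1.3, -1.0, 0.5, -1.9]})

# Row-wide: with no column given, every column of the matching rows becomes 0.
rows = df.copy()
rows.loc[rows["b"] <= 0] = 0

# Column-specific: only "b" is changed in the matching rows.
col = df.copy()
col.loc[col["b"] <= 0, "b"] = 0

print(rows)
print(col)
```

The first form reproduces what the ix call did; the second is usually what is wanted when only b should be updated.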

Follow the pattern below to update values:

food_reviews_df.loc[food_reviews_df.Score <= 3, 'Score'] = 0
food_reviews_df.loc[food_reviews_df.Score >= 4, 'Score'] = 1
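To illustrate the pattern above, here is a small sketch on a made-up food_reviews_df with scores 1 to 5 (the data is an assumption; only the column name Score comes from the answer). It also shows np.where as a single-pass alternative, which is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: the real food_reviews_df is not shown in the answer.
food_reviews_df = pd.DataFrame({"Score": [1, 2, 3, 4, 5]})

# Two-step pattern from the answer. The order matters: if the ">= 4" line
# ran first, the new 1s would then match "<= 3" and be set to 0.
two_step = food_reviews_df.copy()
two_step.loc[two_step.Score <= 3, "Score"] = 0
two_step.loc[two_step.Score >= 4, "Score"] = 1

# Single-pass alternative: each row is evaluated once against the original value.
one_pass = food_reviews_df.copy()
one_pass["Score"] = np.where(one_pass["Score"] >= 4, 1, 0)

print(two_step["Score"].tolist())
print(one_pass["Score"].tolist())
```

Both versions give the same result here; the np.where form simply avoids depending on the order of the updates.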
