Python Pandas indexing

Question

Sorry if this is a simple question; I've tried to look for a solution but can't find anything.

My code goes like this:

Given zip1, create a boolean index that selects the observations (other zip codes) for which some calculation has not been done yet (marked by the value 666):
```
I = (df['zip1'] == zip1) & (df['Distances'] == 666)
```
Perform some calculation:
```
distances = calc(zip1, df['zip2'][I])
```

So far so good: I've checked the distances variable, and it holds the correct values in a correctly sized array.

Put the distances variable in the right place:
```
df['Distances'][I] = distances
```

But this last step updates df['Distances'] to nonsense values for ALL observations with df['zip1'] == zip1, instead of only the ones selected by I.

I've checked the boolean array I before the df['Distances'][I] = distances command and it looks fine. Any ideas would be greatly appreciated.

Answer 1

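What you are attempting is called chained assignment, and it does not work the way you expect: the first indexing step, df['Distances'], may return a copy rather than a view, so the second step can end up assigning to a temporary object instead of the original DataFrame. Hence the wrong results you see.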

There is more information about it here, and in related issues: this and this.
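To make the copy problem easy to reproduce, here is a minimal sketch. It uses the related chained form df[mask]['Distances'] (mask first, then column) rather than the exact expression from the question, because selecting rows with a boolean mask always builds a new object, which makes the outcome predictable. The sample data and the name mask are made up for illustration.

```python
import warnings

import pandas as pd

# Made-up sample data; 666 marks distances that are not computed yet.
df = pd.DataFrame({
    "zip1": [10001, 10001, 20002],
    "Distances": [666.0, 5.0, 666.0],
})
mask = (df["zip1"] == 10001) & (df["Distances"] == 666)

# Chained form: df[mask] builds a new DataFrame (a copy), so the assignment
# lands on that temporary object and df itself is left unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas warns about this pattern
    df[mask]["Distances"] = 1.5

print(df["Distances"].tolist())  # still [666.0, 5.0, 666.0]
```

The warning is silenced here only to keep the output short; in normal use it is a helpful hint that an assignment like this is not reaching the original DataFrame.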

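So you should do the selection and the assignment in a single indexing operation with .loc (.ix also worked in old pandas versions, but it was deprecated in pandas 0.20 and removed in 1.0), like so:

df.loc[I, 'Distances'] = distances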
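Putting the whole workflow from the question together with .loc, a minimal sketch could look like this. The calc function below is a hypothetical stand-in (the question does not show the real one), and the zip codes and distances are made-up sample data.

```python
import pandas as pd


def calc(zip1, zip2_values):
    # Hypothetical stand-in for the question's calc(); not a real distance.
    return [abs(zip1 - z) / 100.0 for z in zip2_values]


# Made-up sample data; 666 marks distances that are not computed yet.
df = pd.DataFrame({
    "zip1": [10001, 10001, 10001, 20002],
    "zip2": [10101, 10201, 10301, 20102],
    "Distances": [666.0, 4.5, 666.0, 666.0],
})

zip1 = 10001
I = (df["zip1"] == zip1) & (df["Distances"] == 666)

# Select rows and column in one .loc call, both for reading and for writing.
distances = calc(zip1, df.loc[I, "zip2"])
df.loc[I, "Distances"] = distances

print(df["Distances"].tolist())  # [1.0, 4.5, 3.0, 666.0]
```

Only the two rows selected by I are updated; the row that already had a distance and the row with a different zip1 keep their values.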

solution1
0 ACCEPTED 2013-10-30 20:46:54
