Create one dataframe row with the positive numbers and another with the negative ones

I have the following dataframe called Utilidad

          Argentina  Bolivia  Chile  España  Uruguay
2004              3        6      1       3        2
2005              5        1      4       1        5

And I calculate the difference between 2004 and 2005 using

Utilidad.ix['resta']=Utilidad.ix[2005]-Utilidad.ix[2004]

Now I'm trying to create two additional rows: one holding the difference where it is positive and the other holding it where it is negative, something like this:

          Argentina  Bolivia  Chile  España  Uruguay
2004              3        6      1       3        2
2005              5        1      4       1        5
resta             2       -5      3      -2        3
positive          2        0      3       0        3
negative          0       -5      0      -2        0

The only thing I have managed to do is to add an additional column which tells me whether "resta" is positive or not, using

Utilidad.ix['boleano'][Utilidad.ix['resta']>0]

Can someone help me create these two additional rows?

Thanks

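You can use numpy.where, which keeps the value where the condition holds and substitutes 0 elsewhere:

import numpy as np

df.loc['positive'] = np.where(df.loc['resta'] > 0, df.loc['resta'], 0)
df.loc['negative'] = np.where(df.loc['resta'] < 0, df.loc['resta'], 0)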
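
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the numpy.where approach. It assumes a current pandas release (label-based access through `.loc`) and rebuilds the frame from the question under the name `df`; the variable `resta` is just a local shorthand introduced for the example.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {"Argentina": [3, 5], "Bolivia": [6, 1], "Chile": [1, 4],
     "España": [3, 1], "Uruguay": [2, 5]},
    index=[2004, 2005],
)

# Difference between the two years, stored as a new row.
df.loc["resta"] = df.loc[2005] - df.loc[2004]

# Keep the value where the condition holds, otherwise use 0.
resta = df.loc["resta"]
df.loc["positive"] = np.where(resta > 0, resta, 0)
df.loc["negative"] = np.where(resta < 0, resta, 0)

print(df)
```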

numpy.clip is handy here, or you can compute the two rows arithmetically from the row and its absolute value.

In [35]:
Utilidad.loc['positive'] = np.clip(Utilidad.loc['resta'], 0, np.inf)
Utilidad.loc['negative'] = np.clip(Utilidad.loc['resta'], -np.inf, 0)
# or
Utilidad.loc['positive'] = (Utilidad.loc['resta'] + Utilidad.loc['resta'].abs()) / 2
Utilidad.loc['negative'] = (Utilidad.loc['resta'] - Utilidad.loc['resta'].abs()) / 2
print(Utilidad)
          Argentina  Bolivia  Chile  España  Uruguay
id                                                  
2004              3        6      1       3        2
2005              5        1      4       1        5
resta             2       -5      3      -2        3
positive          2        0      3       0        3
negative          0       -5      0      -2        0

[5 rows x 5 columns]
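
As a rough sketch of the same idea on a standalone Series, pandas' own `Series.clip` takes `lower` and `upper` keywords, so no infinite bound is needed. The Series below is rebuilt by hand from the "resta" row, and the `*_alt` names are made up for the example. Note that the arithmetic variant divides by 2 and therefore returns floats.

```python
import pandas as pd

# The 'resta' row from the example, as a standalone Series.
resta = pd.Series(
    [2, -5, 3, -2, 3],
    index=["Argentina", "Bolivia", "Chile", "España", "Uruguay"],
    name="resta",
)

positive = resta.clip(lower=0)  # values below 0 become 0
negative = resta.clip(upper=0)  # values above 0 become 0

# Arithmetic identity giving the same values:
# (x + |x|) / 2 keeps the positives, (x - |x|) / 2 keeps the negatives.
positive_alt = (resta + resta.abs()) / 2
negative_alt = (resta - resta.abs()) / 2

print(positive.tolist())
print(negative.tolist())
```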

Some speed comparisons:

%timeit (Utilidad.ix['resta']-Utilidad.ix['resta'].abs())/2
1000 loops, best of 3: 627 µs per loop

In [36]: %timeit Utilidad.ix['positive'] = np.where(Utilidad.ix['resta'] > 0, Utilidad.ix['resta'], 0)
1000 loops, best of 3: 647 µs per loop

In [38]: %timeit Utilidad.ix['positive']=np.clip(Utilidad.ix['resta'], 0, 100)
100 loops, best of 3: 2.6 ms per loop

In [45]: %timeit Utilidad.ix['resta'].clip_upper(0)
1000 loops, best of 3: 1.32 ms per loop
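
If you want to repeat a comparison like this outside IPython, something along these lines with the standard `timeit` module should work. This is only a sketch: it times the computation on a standalone Series rather than the full row assignment measured above, the `candidates` and `timings` names and the 200-call count are arbitrary choices for the example, and the numbers will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

resta = pd.Series([2, -5, 3, -2, 3], name="resta")

# Four ways of computing the "negative" row.
candidates = {
    "arithmetic": lambda: (resta - resta.abs()) / 2,
    "np.where": lambda: np.where(resta < 0, resta, 0),
    "np.minimum": lambda: np.minimum(resta, 0),
    "Series.clip": lambda: resta.clip(upper=0),
}

timings = {}
for name, func in candidates.items():
    # Total seconds for 200 calls, converted to microseconds per call.
    total = timeit.timeit(func, number=200)
    timings[name] = total / 200 * 1e6

for name, micros in timings.items():
    print(f"{name:12s} {micros:8.1f} µs per call")
```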

The observation to make here is that negative is the element-wise minimum of 0 and the row, and positive is the element-wise maximum:

In [11]: np.minimum(df.loc['resta'], 0)  # negative
Out[11]:
Argentina    0
Bolivia     -5
Chile        0
España      -2
Uruguay      0
Name: resta, dtype: int64

In [12]: np.maximum(df.loc['resta'], 0)  # positive
Out[12]:
Argentina    2
Bolivia      0
Chile        3
España       0
Uruguay      3
Name: resta, dtype: int64

Note: If you are concerned about speed then it would make sense to transpose the DataFrame, since appending columns is much cheaper than appending rows.

You can append a row using loc:

df.loc['negative'] = np.minimum(df.loc['resta'], 0)
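
To illustrate the note about transposing, here is a minimal sketch that does the work column-wise and only transposes back at the end. It assumes the frame from the question rebuilt as `df`; `t` and `result` are names chosen for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Argentina": [3, 5], "Bolivia": [6, 1], "Chile": [1, 4],
     "España": [3, 1], "Uruguay": [2, 5]},
    index=[2004, 2005],
)

# Transpose so that countries are rows and each derived quantity is a column.
t = df.T.copy()
t["resta"] = t[2005] - t[2004]
t["positive"] = np.maximum(t["resta"], 0)
t["negative"] = np.minimum(t["resta"], 0)

# Transpose back only if the original row layout is needed.
result = t.T
print(result)
```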
