Create a DataFrame row with the positive numbers and another with the negative ones
I have the following dataframe called Utilidad:
      Argentina  Bolivia  Chile  España  Uruguay
2004          3        6      1       3        2
2005          5        1      4       1        5
I calculate the difference between 2004 and 2005 using:
Utilidad.ix['resta']=Utilidad.ix[2005]-Utilidad.ix[2004]
Now I'm trying to create two additional rows: one holding the difference where it is positive (and 0 elsewhere), and the other holding it where it is negative. Something like this:
          Argentina  Bolivia  Chile  España  Uruguay
2004              3        6      1       3        2
2005              5        1      4       1        5
resta             2       -5      3      -2        3
positive          2        0      3       0        3
negative          0       -5      0      -2        0
The only thing I have managed to do is add an additional column that tells me whether "resta" is positive or not, using:
Utilidad.ix['boleano'][Utilidad.ix['resta']>0]
Can someone help me create these two additional rows?
Thanks
You can use numpy.where:
df.ix['positive'] = np.where(df.ix['resta'] > 0, df.ix['resta'], 0)
df.ix['negative'] = np.where(df.ix['resta'] < 0, df.ix['resta'], 0)
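The `.ix` indexer used above was deprecated in pandas 0.20 and removed in pandas 1.0, so on a current installation the same idea can be sketched with `.loc`. The snippet below is a minimal, self-contained version; the name `df` and the way the frame is rebuilt by hand from the question's table are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the question's data (assumed layout: years as index, countries as columns)
df = pd.DataFrame(
    {"Argentina": [3, 5], "Bolivia": [6, 1], "Chile": [1, 4],
     "España": [3, 1], "Uruguay": [2, 5]},
    index=[2004, 2005],
)

# Difference between the two years, stored as a new row
df.loc["resta"] = df.loc[2005] - df.loc[2004]

# np.where keeps the value where the condition holds and puts 0 elsewhere
resta = df.loc["resta"]
df.loc["positive"] = np.where(resta > 0, resta, 0)
df.loc["negative"] = np.where(resta < 0, resta, 0)

print(df)
```

Assigning to a label that does not exist yet through `.loc` appends the row, which is what makes the three assignments work.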
numpy.clip will be handy here, or you can just calculate it directly:
In [35]:
Utilidad.ix['positive']=np.clip(Utilidad.ix['resta'], 0, np.inf)
Utilidad.ix['negative']=np.clip(Utilidad.ix['resta'], -np.inf, 0)
#or
Utilidad.ix['positive']=(Utilidad.ix['resta']+Utilidad.ix['resta'].abs())/2
Utilidad.ix['negative']=(Utilidad.ix['resta']-Utilidad.ix['resta'].abs())/2
print(Utilidad)
          Argentina  Bolivia  Chile  España  Uruguay
id
2004              3        6      1       3        2
2005              5        1      4       1        5
resta             2       -5      3      -2        3
positive          2        0      3       0        3
negative          0       -5      0      -2        0
[5 rows x 5 columns]
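As a hedged sketch of the same two ideas on a recent pandas, the clipping can also be written with the `Series.clip` method, which avoids passing infinite bounds. Note that on Python 3 the `/ 2` in the arithmetic variant is true division and yields floats; the floor division used below is an assumption made here to keep integers, and it is exact because `resta ± |resta|` is always even. The `resta` Series is rebuilt by hand from the question's numbers.

```python
import pandas as pd

# The 'resta' row from the question, rebuilt by hand as a Series
resta = pd.Series(
    [2, -5, 3, -2, 3],
    index=["Argentina", "Bolivia", "Chile", "España", "Uruguay"],
    name="resta",
)

# Clipping: values below 0 are raised to 0 (positive part),
# values above 0 are lowered to 0 (negative part)
positive = resta.clip(lower=0)
negative = resta.clip(upper=0)

# Arithmetic alternative: (x + |x|) / 2 and (x - |x|) / 2,
# written with // so the integer dtype is preserved
positive_arith = (resta + resta.abs()) // 2
negative_arith = (resta - resta.abs()) // 2

print(positive.tolist(), negative.tolist())
```

Either pair of results can then be assigned back as rows, for example with `Utilidad.loc['positive'] = positive`.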
Some speed comparisons:
%timeit (Utilidad.ix['resta']-Utilidad.ix['resta'].abs())/2
1000 loops, best of 3: 627 µs per loop
In [36]:
%timeit Utilidad.ix['positive'] = np.where(Utilidad.ix['resta'] > 0, Utilidad.ix['resta'], 0)
1000 loops, best of 3: 647 µs per loop
In [38]:
%timeit Utilidad.ix['positive']=np.clip(Utilidad.ix['resta'], 0, 100)
100 loops, best of 3: 2.6 ms per loop
In [45]:
%timeit Utilidad.ix['resta'].clip_upper(0)
1000 loops, best of 3: 1.32 ms per loop
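The timings above come from one machine and an old pandas version, and with only five values they mostly measure call overhead. To repeat the comparison locally, something like the following sketch with the standard `timeit` module could be used. The `candidates` dictionary and its labels are hypothetical names introduced here, and `Series.clip(upper=0)` stands in for `clip_upper`, which was removed in pandas 1.0.

```python
import timeit

import numpy as np
import pandas as pd

# The 'resta' row from the question, rebuilt by hand
resta = pd.Series([2, -5, 3, -2, 3], name="resta")

# Each candidate computes the "negative" row in a different way
candidates = {
    "abs arithmetic": lambda: (resta - resta.abs()) / 2,
    "np.where": lambda: np.where(resta < 0, resta, 0),
    "Series.clip": lambda: resta.clip(upper=0),
    "np.minimum": lambda: np.minimum(resta, 0),
}

timings = {}
for name, func in candidates.items():
    # Best of 3 repeats of 1000 calls, converted to microseconds per call
    best = min(timeit.repeat(func, number=1000, repeat=3)) / 1000
    timings[name] = best * 1e6
    print(f"{name:15s} {timings[name]:8.1f} µs per call")
```

The absolute numbers will differ between machines and library versions, so only the relative ordering on your own setup is meaningful.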
The observation to make here is that negative is the element-wise minimum of 0 and the row:
In [11]: np.minimum(df.loc['resta'], 0) # negative
Out[11]:
Argentina    0
Bolivia     -5
Chile        0
España      -2
Uruguay      0
Name: resta, dtype: int64
In [12]: np.maximum(df.loc['resta'], 0) # positive
Out[12]:
Argentina    2
Bolivia      0
Chile        3
España       0
Uruguay      3
Name: resta, dtype: int64
Note: If you are concerned about speed, it would make sense to transpose the DataFrame, since appending columns is much cheaper than appending rows.
You can append a row using loc:
df.loc['negative'] = np.minimum(df.loc['resta'], 0)
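The transposition suggested in the note could look like the sketch below, where countries become rows and each derived quantity is added as a column. The variable names `t` and `result` and the hand-built frame are assumptions for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Rebuild the question's data (years as index, countries as columns)
df = pd.DataFrame(
    {"Argentina": [3, 5], "Bolivia": [6, 1], "Chile": [1, 4],
     "España": [3, 1], "Uruguay": [2, 5]},
    index=[2004, 2005],
)

# Transpose so that countries are rows and each new quantity is a column
t = df.T
t["resta"] = t[2005] - t[2004]
t["positive"] = np.maximum(t["resta"], 0)  # element-wise max with 0
t["negative"] = np.minimum(t["resta"], 0)  # element-wise min with 0

# Transpose back only if the original orientation is needed
result = t.T
print(result)
```

Keeping the data in the transposed layout also keeps each column to a single dtype, which tends to suit pandas better than rows of mixed meaning.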