Numbering elements in a pandas dataframe
I am trying to build a small stock trading report in pandas. It's getting a little complicated because of subsequent buys and sells. Assume I have my buys and sells in a dataframe:
import pandas as pd
data = pd.read_csv("ticker1.csv", delimiter=";")
data['cumsum'] = data['qty'].cumsum()
data
Date qty price cumsum
0 2018-01-20 80 20.70 80
1 2018-02-14 90 20.82 170
2 2018-02-19 -100 20.62 70
3 2018-02-27 -70 20.55 0
4 2018-03-13 30 19.80 30
5 2018-03-14 10 19.55 40
6 2018-03-30 -20 20.92 20
7 2018-04-01 -10 20.95 10
8 2018-04-10 -10 21.03 0
9 2018-05-04 25 19.77 25
10 2018-05-31 -10 20.22 15
So there is a "completed" cycle of buying and selling whenever cumsum == 0 (there is no short selling). In this example, an open position of 15 remains at the end. In order to analyze the trades, I'd like to group them like this:
Date qty price cumsum group
0 2018-01-20 80 20.70 80 1
1 2018-02-14 90 20.82 170 1
2 2018-02-19 -100 20.62 70 1
3 2018-02-27 -70 20.55 0 1
4 2018-03-13 30 19.80 30 2
5 2018-03-14 10 19.55 40 2
6 2018-03-30 -20 20.92 20 2
7 2018-04-01 -10 20.95 10 2
8 2018-04-10 -10 21.03 0 2
9 2018-05-04 25 19.77 25 3
10 2018-05-31 -10 20.22 15 3
I am trying to group the transactions up to and including the next row where cumsum == 0. Then I could loop over the groups for further analysis (e.g. whether it was a winning or losing trade, the number of days between the first buy and the last sale, etc.), and I could also see that in this case there is currently an open position (the last value of cumsum != 0).
Could someone please give me a hint on how to implement the grouping?
Thanks
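Coincidentally, one solution is to apply Series.cumsum() to a flag derived from the column named cumsum. Shifting that column down by one row and comparing it with 0 marks the first row after each completed cycle; the running total of those marks, plus 1, is the group number (df here is the dataframe called data in the question):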
df['group'] = (df['cumsum'].shift() == 0).astype(int).cumsum() + 1
df
Date qty price cumsum group
0 2018-01-20 80 20.70 80 1
1 2018-02-14 90 20.82 170 1
2 2018-02-19 -100 20.62 70 1
3 2018-02-27 -70 20.55 0 1
4 2018-03-13 30 19.80 30 2
5 2018-03-14 10 19.55 40 2
6 2018-03-30 -20 20.92 20 2
7 2018-04-01 -10 20.95 10 2
8 2018-04-10 -10 21.03 0 2
9 2018-05-04 25 19.77 25 3
10 2018-05-31 -10 20.22 15 3
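For anyone who wants to try this without the CSV file, here is a minimal self-contained sketch. It rebuilds the sample data from the question inline, applies the same shift-and-cumsum idea, and then summarizes each cycle with groupby, which may be a convenient alternative to looping. The summary column names (first_trade, last_trade, open_qty, days_held, closed, cash_flow) are illustrative assumptions and do not come from the original post.

```python
import pandas as pd

# Sample trades copied from the question (positive qty = buy, negative = sell).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2018-01-20", "2018-02-14", "2018-02-19", "2018-02-27",
        "2018-03-13", "2018-03-14", "2018-03-30", "2018-04-01",
        "2018-04-10", "2018-05-04", "2018-05-31",
    ]),
    "qty": [80, 90, -100, -70, 30, 10, -20, -10, -10, 25, -10],
    "price": [20.70, 20.82, 20.62, 20.55, 19.80, 19.55,
              20.92, 20.95, 21.03, 19.77, 20.22],
})
df["cumsum"] = df["qty"].cumsum()

# A new cycle starts on the row right after the position returned to zero.
# shift() leaves NaN in the first row, and NaN == 0 is False, so row 0 stays in group 1.
starts_new_cycle = df["cumsum"].shift() == 0
df["group"] = starts_new_cycle.cumsum() + 1

# Per-cycle summary; the column names are illustrative only.
summary = df.groupby("group").agg(
    first_trade=("Date", "min"),
    last_trade=("Date", "max"),
    open_qty=("qty", "sum"),
)
summary["days_held"] = (summary["last_trade"] - summary["first_trade"]).dt.days
summary["closed"] = summary["open_qty"] == 0

# Net cash flow per cycle: sells bring cash in, buys take it out.
summary["cash_flow"] = (-df["qty"] * df["price"]).groupby(df["group"]).sum()

print(summary)
```

For a closed cycle, cash_flow is the realized profit or loss before fees. For a cycle that is still open (closed is False), it only reflects the cash spent and received so far, so it should not be read as a result.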