
Numbering elements in a pandas dataframe

I am trying to build a small stock trading report in pandas. It's getting a little complicated because of subsequent buys and sells. Assuming I have my buys and sells in a dataframe:

         import pandas as pd
         data = pd.read_csv("ticker1.csv", delimiter=";")
         # running position: the column is named 'qty' in the output below
         data['cumsum'] = data['qty'].cumsum(axis=0)

         data
            Date              qty price      cumsum
         0  2018-01-20        80  20.70      80      
         1  2018-02-14        90  20.82     170      
         2  2018-02-19      -100  20.62      70     
         3  2018-02-27       -70  20.55       0     
         4  2018-03-13        30  19.80      30      
         5  2018-03-14        10  19.55      40      
         6  2018-03-30       -20  20.92      20      
         7  2018-04-01       -10  20.95      10      
         8  2018-04-10       -10  21.03       0      
         9  2018-05-04        25  19.77      25     
         10 2018-05-31       -10  20.22      15      

So there can be "completed" cycles of buying and selling whenever cumsum = 0 (no short-selling). In this example, there would be an open position of 15 at the end. In order to analyze the trades, I'd like to group them like this:

            Date              qty price      cumsum   group 
         0  2018-01-20        80  20.70      80       1
         1  2018-02-14        90  20.82     170       1
         2  2018-02-19      -100  20.62      70       1
         3  2018-02-27       -70  20.55       0       1
         4  2018-03-13        30  19.80      30       2
         5  2018-03-14        10  19.55      40       2
         6  2018-03-30       -20  20.92      20       2
         7  2018-04-01       -10  20.95      10       2
         8  2018-04-10       -10  21.03       0       2
         9  2018-05-04        25  19.77      25       3
         10 2018-05-31       -10  20.22      15       3

I am trying to group the transactions until the next time cumsum = 0. Then I could loop over the groupings for further analysis (e.g. to see whether it was a winning or losing trade, the number of days between first buy and last sale, etc.), and I would be able to see that in this case there is an open position at the moment (if the last value of cumsum != 0).

Could someone please give me a hint on how I could implement the grouping?

Thanks

Coincidentally, one solution is to apply Series.cumsum() again, this time to a flag derived from the column named cumsum: mark every row that directly follows a zero in cumsum, then take the running sum of those marks:

         df['group'] = (df['cumsum'].shift() == 0).astype(int).cumsum() + 1
         df

                   Date  qty  price  cumsum  group
         0   2018-01-20   80  20.70      80      1
         1   2018-02-14   90  20.82     170      1
         2   2018-02-19 -100  20.62      70      1
         3   2018-02-27  -70  20.55       0      1
         4   2018-03-13   30  19.80      30      2
         5   2018-03-14   10  19.55      40      2
         6   2018-03-30  -20  20.92      20      2
         7   2018-04-01  -10  20.95      10      2
         8   2018-04-10  -10  21.03       0      2
         9   2018-05-04   25  19.77      25      3
         10  2018-05-31  -10  20.22      15      3
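
For the follow-up analysis mentioned in the question, a groupby over the new column may be enough, with no explicit loop. The snippet below is only a sketch under a few assumptions: the data is typed in by hand instead of read from ticker1.csv, and the names `cash`, `summary`, `net_cash`, `open_qty`, `days` and `closed` are made up for illustration. Note that `net_cash` is a realized result only for closed cycles; for a group that is still open it is just the cash spent so far.

```python
import pandas as pd

# Sample trades copied from the question (the original reads them from ticker1.csv).
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2018-01-20", "2018-02-14", "2018-02-19", "2018-02-27",
        "2018-03-13", "2018-03-14", "2018-03-30", "2018-04-01",
        "2018-04-10", "2018-05-04", "2018-05-31",
    ]),
    "qty": [80, 90, -100, -70, 30, 10, -20, -10, -10, 25, -10],
    "price": [20.70, 20.82, 20.62, 20.55, 19.80, 19.55,
              20.92, 20.95, 21.03, 19.77, 20.22],
})
df["cumsum"] = df["qty"].cumsum()

# A new cycle starts on the row right after the position went back to zero.
# shift() puts NaN in the first row, and NaN == 0 is False, so row 0 stays in group 1.
starts_new_cycle = df["cumsum"].shift() == 0
df["group"] = starts_new_cycle.astype(int).cumsum() + 1

# Hypothetical helper column: cash flow per trade (buys negative, sells positive).
df["cash"] = -df["qty"] * df["price"]

# One row per cycle: first/last trade date, remaining position, net cash flow.
summary = df.groupby("group").agg(
    first_date=("Date", "first"),
    last_date=("Date", "last"),
    open_qty=("cumsum", "last"),
    net_cash=("cash", "sum"),
)
summary["days"] = (summary["last_date"] - summary["first_date"]).dt.days
summary["closed"] = summary["open_qty"] == 0

print(summary)
```

With the sample data, the last group should come out with `closed` equal to False and an `open_qty` of 15, which matches the open position described in the question.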
