[英]Count how many blocks in dataframe
我想计算这个包含简单移动平均数据的数据框中有多少下降和上升趋势:
Date SMA_50 SMA_200 Trend Trend CumCount T
2019-09-24 35.559013 38.942979 Down 1
2019-09-25 35.427135 38.904934 Down 2
2019-09-26 35.295066 38.864042 Down 3
2019-09-27 35.165275 38.827087 Down 4
2019-09-30 35.035484 38.788046 Down 5
2020-08-31 34.697343 34.689469 Up 1
2020-09-01 34.768881 34.691034 Up 2
2020-09-02 34.852941 34.690655 Up 3
2020-09-03 34.932827 34.686765 Up 4
2020-09-04 35.009488 34.680598 Up 5
2020-09-08 35.083302 34.675285 Up 6
2020-09-09 35.150474 34.667884 Up 7
2019-10-01 34.895256 38.744687 Down 1
2019-10-02 34.736053 38.701518 Down 2
2019-10-03 34.594877 38.665180 Down 3
2019-10-04 34.466983 38.634488 Down 4
2019-10-07 34.329222 38.605361 Down 5
我希望结果是:
Date SMA_50 SMA_200 Trend Trend CumCount T
2019-09-24 35.559013 38.942979 Down 1 1
2019-09-25 35.427135 38.904934 Down 2 1
2019-09-26 35.295066 38.864042 Down 3 1
2019-09-27 35.165275 38.827087 Down 4 1
2019-09-30 35.035484 38.788046 Down 5 1
2020-08-31 34.697343 34.689469 Up 1 1
2020-09-01 34.768881 34.691034 Up 2 1
2020-09-02 34.852941 34.690655 Up 3 1
2020-09-03 34.932827 34.686765 Up 4 1
2020-09-04 35.009488 34.680598 Up 5 1
2020-09-08 35.083302 34.675285 Up 6 1
2020-09-09 35.150474 34.667884 Up 7 1
2019-10-01 34.895256 38.744687 Down 1 2
2019-10-02 34.736053 38.701518 Down 2 2
2019-10-03 34.594877 38.665180 Down 3 2
2019-10-04 34.466983 38.634488 Down 4 2
2019-10-07 34.329222 38.605361 Down 5 2
根据趋势列,我如何计算数据框中出现多少次上升和下降趋势?
您可以确定趋势何时回到第一个值:
# equal to first Trend?
s = df['Trend'].eq(df['Trend'].iloc[0])
# when is the first value coming back for the first time after a change?
# perform cumsum to get T
df['T'] = (s&~s.shift(fill_value=False)).cumsum()
输出:
Date SMA_50 SMA_200 Trend Trend CumCount T
0 2019-09-24 35.559013 38.942979 Down 1 1
1 2019-09-25 35.427135 38.904934 Down 2 1
2 2019-09-26 35.295066 38.864042 Down 3 1
3 2019-09-27 35.165275 38.827087 Down 4 1
4 2019-09-30 35.035484 38.788046 Down 5 1
5 2020-08-31 34.697343 34.689469 Up 1 1
6 2020-09-01 34.768881 34.691034 Up 2 1
7 2020-09-02 34.852941 34.690655 Up 3 1
8 2020-09-03 34.932827 34.686765 Up 4 1
9 2020-09-04 35.009488 34.680598 Up 5 1
10 2020-09-08 35.083302 34.675285 Up 6 1
11 2020-09-09 35.150474 34.667884 Up 7 1
12 2019-10-01 34.895256 38.744687 Down 1 2
13 2019-10-02 34.736053 38.701518 Down 2 2
14 2019-10-03 34.594877 38.665180 Down 3 2
15 2019-10-04 34.466983 38.634488 Down 4 2
16 2019-10-07 34.329222 38.605361 Down 5 2
# I ignored the "Trend CumCount" column in testing,
# but it shouldn't change anything.
df.loc[df.Trend.ne(df.Trend.shift(-1)), 'T'] = 1
df['T'] = df.groupby('Trend')['T'].cumsum().bfill()
print(df)
输出:
Date SMA_50 SMA_200 Trend T
0 2019-09-24 35.559013 38.942979 Down 1.0
1 2019-09-25 35.427135 38.904934 Down 1.0
2 2019-09-26 35.295066 38.864042 Down 1.0
3 2019-09-27 35.165275 38.827087 Down 1.0
4 2019-09-30 35.035484 38.788046 Down 1.0
5 2020-08-31 34.697343 34.689469 Up 1.0
6 2020-09-01 34.768881 34.691034 Up 1.0
7 2020-09-02 34.852941 34.690655 Up 1.0
8 2020-09-03 34.932827 34.686765 Up 1.0
9 2020-09-04 35.009488 34.680598 Up 1.0
10 2020-09-08 35.083302 34.675285 Up 1.0
11 2020-09-09 35.150474 34.667884 Up 1.0
12 2019-10-01 34.895256 38.744687 Down 2.0
13 2019-10-02 34.736053 38.701518 Down 2.0
14 2019-10-03 34.594877 38.665180 Down 2.0
15 2019-10-04 34.466983 38.634488 Down 2.0
16 2019-10-07 34.329222 38.605361 Down 2.0
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.