Python - how can I find the angle of a line at a given point?

I'm working with simple OHLC time series data. Here is a sample:

2021-02-26 08:00:00  51491.322786
2021-02-26 12:00:00  51373.462137
2021-02-26 16:00:00  51244.591670
2021-02-26 20:00:00  51061.134204
2021-02-27 00:00:00  50985.592434
2021-02-27 04:00:00  50923.287370
2021-02-27 08:00:00  50842.103282
2021-02-27 12:00:00  50695.160604
2021-02-27 16:00:00  50608.462150
2021-02-27 20:00:00  50455.235146
2021-02-28 00:00:00  50177.377531
2021-02-28 04:00:00  49936.652091
2021-02-28 08:00:00  49860.396537
2021-02-28 12:00:00  49651.901082
2021-02-28 16:00:00  49625.153441
2021-02-28 20:00:00  49570.275193
2021-03-01 00:00:00  49531.874272
2021-03-01 04:00:00  49510.381676
2021-03-01 08:00:00  49486.289712
2021-03-01 12:00:00  49481.496645
2021-03-01 16:00:00  49469.806692
2021-03-01 20:00:00  49471.958606
2021-03-02 00:00:00  49462.095568
2021-03-02 04:00:00  49453.473575
2021-03-02 08:00:00  49438.986536
2021-03-02 12:00:00  49409.492007
2021-03-02 16:00:00  49356.563396
2021-03-02 20:00:00  49331.037118
2021-03-03 00:00:00  49297.823947
2021-03-03 04:00:00  49322.049974
2021-03-03 08:00:00  49461.314013
2021-03-03 12:00:00  49515.137712
2021-03-03 16:00:00  49571.990877
2021-03-03 20:00:00  49592.320461
2021-03-04 00:00:00  49592.249409
2021-03-04 04:00:00  49593.938380
2021-03-04 08:00:00  49593.055971
2021-03-04 12:00:00  49592.025698
2021-03-04 16:00:00  49585.661437
2021-03-04 20:00:00  49578.693824
2021-03-05 00:00:00  49543.067346
2021-03-05 04:00:00  49540.706794
2021-03-05 08:00:00  49513.586831
2021-03-05 12:00:00  49494.990328
2021-03-05 16:00:00  49493.807248
2021-03-05 20:00:00  49461.133698
2021-03-06 00:00:00  49432.770930
2021-03-06 04:00:00  49412.087821
2021-03-06 08:00:00  49368.106499
2021-03-06 12:00:00  49290.581114
2021-03-06 16:00:00  49272.222740
2021-03-06 20:00:00  49269.814982
2021-03-07 00:00:00  49270.328825
2021-03-07 04:00:00  49293.664209
2021-03-07 08:00:00  49339.999430
2021-03-07 12:00:00  49404.798067
2021-03-07 16:00:00  49450.447631
2021-03-07 20:00:00  49528.402294
2021-03-08 00:00:00  49571.353158
2021-03-08 04:00:00  49572.687451
2021-03-08 08:00:00  49597.518988
2021-03-08 12:00:00  49648.407014
2021-03-08 16:00:00  49708.063384
2021-03-08 20:00:00  49862.237773
2021-03-09 00:00:00  50200.833030
2021-03-09 04:00:00  50446.201489
2021-03-09 08:00:00  50727.063301
2021-03-09 12:00:00  50952.697141
2021-03-09 16:00:00  51152.798741
2021-03-09 20:00:00  51392.873289
2021-03-10 00:00:00  51472.273233
2021-03-10 04:00:00  51601.351944
2021-03-10 08:00:00  51759.387477
2021-03-10 12:00:00  52053.982892
2021-03-10 16:00:00  52437.071119
2021-03-10 20:00:00  52648.225156

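I'm trying to find a way to measure how inclined or steep the line is at each point. Basically, I just need to know whether the line is going up, down, or sideways, and by how much, so ideally I would get some kind of coefficient or number that tells me how steep the line is.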

To do that, I had the idea of computing the slope, so I tried the following code, which I got from here:

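from numpy import arctan as npAtan, pi as npPi
# verify_series and get_offset are helpers from the library this snippet
# was taken from; they are not defined here.

def slope(close, length=None, as_angle=None, to_degrees=None, vertical=None, offset=None, **kwargs):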
    """Indicator: Slope"""
    # Validate arguments
    length = int(length) if length and length > 0 else 1
    # bool(...) instead of isinstance(..., bool): the isinstance test
    # turned an explicit False into True
    as_angle = bool(as_angle)
    to_degrees = bool(to_degrees)
    close = verify_series(close, length)
    offset = get_offset(offset)

    if close is None: return

    # Calculate Result
    slope = close.diff(length) / length
    if as_angle:
        slope = slope.apply(npAtan)
        if to_degrees:
            slope *= 180 / npPi

    # Offset
    if offset != 0:
        slope = slope.shift(offset)

    # Handle fills
    if "fillna" in kwargs:
        slope.fillna(kwargs["fillna"], inplace=True)
    if "fill_method" in kwargs:
        slope.fillna(method=kwargs["fill_method"], inplace=True)

    # Name and Categorize it
    slope.name = f"SLOPE_{length}" if not as_angle else f"ANGLE{'d' if to_degrees else 'r'}_{length}"
    slope.category = "momentum"

    return slope 

Here is a sample of the output:

2021-02-26 08:00:00  51491.322786 -110.850644
2021-02-26 12:00:00  51373.462137 -117.860648
2021-02-26 16:00:00  51244.591670 -128.870468
2021-02-26 20:00:00  51061.134204 -183.457466
2021-02-27 00:00:00  50985.592434  -75.541770
2021-02-27 04:00:00  50923.287370  -62.305064
2021-02-27 08:00:00  50842.103282  -81.184088
2021-02-27 12:00:00  50695.160604 -146.942678
2021-02-27 16:00:00  50608.462150  -86.698454
2021-02-27 20:00:00  50455.235146 -153.227004
2021-02-28 00:00:00  50177.377531 -277.857615
2021-02-28 04:00:00  49936.652091 -240.725440
2021-02-28 08:00:00  49860.396537  -76.255553
2021-02-28 12:00:00  49651.901082 -208.495455
2021-02-28 16:00:00  49625.153441  -26.747641
2021-02-28 20:00:00  49570.275193  -54.878249
2021-03-01 00:00:00  49531.874272  -38.400921
2021-03-01 04:00:00  49510.381676  -21.492596
2021-03-01 08:00:00  49486.289712  -24.091964
2021-03-01 12:00:00  49481.496645   -4.793067
2021-03-01 16:00:00  49469.806692  -11.689953
2021-03-01 20:00:00  49471.958606    2.151914
2021-03-02 00:00:00  49462.095568   -9.863038
2021-03-02 04:00:00  49453.473575   -8.621994
2021-03-02 08:00:00  49438.986536  -14.487039
2021-03-02 12:00:00  49409.492007  -29.494528
2021-03-02 16:00:00  49356.563396  -52.928611
2021-03-02 20:00:00  49331.037118  -25.526278
2021-03-03 00:00:00  49297.823947  -33.213171
2021-03-03 04:00:00  49322.049974   24.226027
2021-03-03 08:00:00  49461.314013  139.264040
2021-03-03 12:00:00  49515.137712   53.823699
2021-03-03 16:00:00  49571.990877   56.853165
2021-03-03 20:00:00  49592.320461   20.329584
2021-03-04 00:00:00  49592.249409   -0.071052
2021-03-04 04:00:00  49593.938380    1.688971
2021-03-04 08:00:00  49593.055971   -0.882409
2021-03-04 12:00:00  49592.025698   -1.030273
2021-03-04 16:00:00  49585.661437   -6.364260
2021-03-04 20:00:00  49578.693824   -6.967614
2021-03-05 00:00:00  49543.067346  -35.626478
2021-03-05 04:00:00  49540.706794   -2.360551
2021-03-05 08:00:00  49513.586831  -27.119963
2021-03-05 12:00:00  49494.990328  -18.596504
2021-03-05 16:00:00  49493.807248   -1.183080
2021-03-05 20:00:00  49461.133698  -32.673550
2021-03-06 00:00:00  49432.770930  -28.362769
2021-03-06 04:00:00  49412.087821  -20.683109
2021-03-06 08:00:00  49368.106499  -43.981322
2021-03-06 12:00:00  49290.581114  -77.525385
2021-03-06 16:00:00  49272.222740  -18.358373
2021-03-06 20:00:00  49269.814982   -2.407758
2021-03-07 00:00:00  49270.328825    0.513843
2021-03-07 04:00:00  49293.664209   23.335384
2021-03-07 08:00:00  49339.999430   46.335221
2021-03-07 12:00:00  49404.798067   64.798637
2021-03-07 16:00:00  49450.447631   45.649564
2021-03-07 20:00:00  49528.402294   77.954663
2021-03-08 00:00:00  49571.353158   42.950863
2021-03-08 04:00:00  49572.687451    1.334294
2021-03-08 08:00:00  49597.518988   24.831537
2021-03-08 12:00:00  49648.407014   50.888026
2021-03-08 16:00:00  49708.063384   59.656369
2021-03-08 20:00:00  49862.237773  154.174389
2021-03-09 00:00:00  50200.833030  338.595257
2021-03-09 04:00:00  50446.201489  245.368460
2021-03-09 08:00:00  50727.063301  280.861811
2021-03-09 12:00:00  50952.697141  225.633840
2021-03-09 16:00:00  51152.798741  200.101599
2021-03-09 20:00:00  51392.873289  240.074549
2021-03-10 00:00:00  51472.273233   79.399943
2021-03-10 04:00:00  51601.351944  129.078712
2021-03-10 08:00:00  51759.387477  158.035533
2021-03-10 12:00:00  52053.982892  294.595415
2021-03-10 16:00:00  52437.071119  383.088226
2021-03-10 20:00:00  52648.225156  211.154038

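This works, but the problem is that the slope depends heavily on the magnitude of the data I feed it: lower prices give lower slope values and higher prices give higher ones. Since I'm doing some analysis, I need something more "general" that tells me how inclined the line I'm plotting is, regardless of the magnitude of the data I'm working with. Is that possible? Any kind of advice is appreciated.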

I'm not sure what you are trying to achieve, but you can find the slope and angle for a series of points in the following way.

Suppose your dataframe is given by:

                Date       measure
0   2021-02-26 08:00  51491.322786
1   2021-02-26 12:00  51373.462137
2   2021-02-26 16:00  51244.591670
3   2021-02-26 20:00  51061.134204
4   2021-02-27 00:00  50985.592434
..               ...           ...
71  2021-03-10 04:00  51601.351944
72  2021-03-10 08:00  51759.387477
73  2021-03-10 12:00  52053.982892
74  2021-03-10 16:00  52437.071119
75  2021-03-10 20:00  52648.225156

which is exactly what you posted. Then you can define the function slope_and_angle as

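import numpy as np

def slope_and_angle(df):
    dy = df['measure'].diff()   # change in value between consecutive rows
    # Constant horizontal run. The original loop overwrote both columns on
    # every pass, so only its last pass counted: y[n-2] - y[1] = n - 3.
    dx = len(df) - 3
    df['slope'] = dx / dy       # note: run over rise, as in the original
    df['angle'] = np.rad2deg(np.arctan2(dx, dy))
    return df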

which returns:

                Date       measure     slope       angle
0   2021-02-26 08:00  51491.322786       NaN         NaN
1   2021-02-26 12:00  51373.462137 -0.619376  148.226940
2   2021-02-26 16:00  51244.591670 -0.566460  150.470170
3   2021-02-26 20:00  51061.134204 -0.397912  158.301777
4   2021-02-27 00:00  50985.592434 -0.966353  135.980320
..               ...           ...       ...         ...
71  2021-03-10 04:00  51601.351944  0.565546   29.490174
72  2021-03-10 08:00  51759.387477  0.461921   24.793227
73  2021-03-10 12:00  52053.982892  0.247797   13.917410
74  2021-03-10 16:00  52437.071119  0.190557   10.788745
75  2021-03-10 20:00  52648.225156  0.345719   19.071249

What you show in your example output is just df['measure'].diff().
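To see why, note that with length=1 the core line of the question's slope() function is close.diff(1) / 1. A minimal check, assuming a small hand-made Series built from the first five prices in the question:

```python
import pandas as pd

# First five prices from the question's sample data
close = pd.Series([51491.322786, 51373.462137, 51244.591670,
                   51061.134204, 50985.592434])

length = 1
slope = close.diff(length) / length   # core line of slope() in the question
plain_diff = close.diff()             # plain difference between neighbours

# Series.equals treats NaNs in the same position as equal
same = slope.equals(plain_diff)
print(same)
```

With a larger length, the same line gives the average change per step over that window, which is still expressed in price units and so still depends on the price level.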

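There are things you can do, but there is no universal approach that will always work well. You need to understand your data and choose something that suits your situation.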

For example, take 100 random numbers between 0 and 1, sampled every 4 hours:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'timestamp': pd.date_range("2021-01-01", periods=100, freq="4H"),
    'value': np.random.random(100)
})
# df:
#       timestamp               value
# 0     2021-01-01 00:00:00     0.780008
# 1     2021-01-01 04:00:00     0.689576
# 2     2021-01-01 08:00:00     0.700937
# 3     2021-01-01 12:00:00     0.756724
# 4     2021-01-01 16:00:00     0.928890
# etc

We can easily compute the gradient:

differences = df.diff()
# total_seconds() rather than .seconds, which drops whole days from the gap
gradient = 3600 * differences.value / differences.timestamp.dt.total_seconds()
# gradient: max value 0.1979, min value -0.2432
# 0         NaN
# 1    0.033912
# 2    0.045422
# 3   -0.001827
# 4   -0.225796

The gradient is the change in value per hour, ignoring any wrinkles such as missing values, duplicated time points, and so on.
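One of those wrinkles can be sketched as follows. This assumes a hypothetical frame (df_gap, not part of the original example) in which some samples are missing, so the gaps are 4, 8 and 28 hours. Dividing by the actual elapsed time keeps the per-hour gradient meaningful across the gaps. Note that `dt.seconds` returns only the seconds component of a timedelta and ignores whole days, whereas `dt.total_seconds()` covers the full gap.

```python
import pandas as pd

# Hypothetical data with uneven spacing: gaps of 4 h, 8 h and 28 h
df_gap = pd.DataFrame({
    'timestamp': pd.to_datetime(['2021-01-01 00:00', '2021-01-01 04:00',
                                 '2021-01-01 12:00', '2021-01-02 16:00']),
    'value': [1.0, 3.0, 7.0, 35.0],
})

elapsed = df_gap['timestamp'].diff()
hours = elapsed.dt.total_seconds() / 3600        # full elapsed time in hours
gradient_per_hour = df_gap['value'].diff() / hours

# For comparison: .dt.seconds sees the 28 h gap as only 4 h (14400 s)
seconds_component = elapsed.dt.seconds

print(gradient_per_hour.tolist())
```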

Now, as you observed, the gradient grows if the magnitude of the numbers grows. For example, if I scale value up by a factor of 100:

df['value100'] = 100 * df.value
differences = df.diff()
gradient = 3600 * differences.value100 / differences.timestamp.dt.total_seconds()
print(gradient.max(), gradient.min())
# gradient: max value 19.79, min value -24.32
# 0          NaN
# 1     3.391221
# 2     4.542248
# 3    -0.182714
# 4   -22.579588

Here we see that the gradient is also 100 times larger, as expected.

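This suggests we could simply divide by some number, but then the question becomes which number to use. This is where understanding your data matters.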

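One approach is to use the range of the data. This is similar to what you see when you plot a chart with matplotlib: the y scale is fitted to the maximum and minimum values. For example: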

sf = df.value.max() - df.value.min()
sf100 = df.value100.max() - df.value100.min()
differences = df.diff()

gradient = differences.value / sf
gradient100 = differences.value100 / sf100

# gradient, gradient100
# nan,      nan
# 0.1379,   0.1379
# 0.1847,   0.1847
# -0.0074,  -0.0074
# -0.9184,  -0.9184

As you can see, the two gradients now match each other. This approach works well when there is a simple linear scaling.
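The same idea can be wrapped in a small helper. This is a minimal sketch, with range_scaled_diff as a hypothetical name and a seeded generator standing in for the random data above. It also shows that a constant offset makes no difference, since both the differences and the range ignore it.

```python
import numpy as np
import pandas as pd

def range_scaled_diff(s: pd.Series) -> pd.Series:
    """Step-to-step change divided by the series' full range (max - min)."""
    return s.diff() / (s.max() - s.min())

rng = np.random.default_rng(0)          # seeded so the example is repeatable
value = pd.Series(rng.random(100))

g1 = range_scaled_diff(value)
g2 = range_scaled_diff(1000 * value + 50)   # rescaled and shifted copy

# Skip row 0, which is NaN in both results
matches = np.allclose(g1.iloc[1:], g2.iloc[1:])
print(matches)
```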

However, consider another case: one where the range is inflated by an outlier.

df['value_outlier'] = df.value
df.loc[50, 'value_outlier'] = 100  # Just set the 50th value to 100

sf = df.value.max() - df.value.min()
sf_outlier = df.value_outlier.max() - df.value_outlier.min()

differences = df.diff()
gradient = differences.value / sf
gradient_outlier = differences.value_outlier / sf_outlier

# gradient, gradient_outlier
# nan,      nan
# 0.1379,   0.0014
# 0.1847,   0.0018
# -0.0074,  -0.0001
# -0.9184,  -0.0090
# 0.5792,   0.0057

This doesn't look so good. The reason is that we have inflated the range of value_outlier without changing the actual spread between most of the points.

You can work around this. One way is to use the interquartile range as the scale factor:

sf = df.value.quantile(0.75) - df.value.quantile(0.25)
sf100 = df.value100.quantile(0.75) - df.value100.quantile(0.25)
sf_outlier = df.value_outlier.quantile(0.75) - df.value_outlier.quantile(0.25)

differences = df.diff()
gradient = differences.value / sf
gradient100 = differences.value100 / sf100
gradient_outlier = differences.value_outlier / sf_outlier

for a, b, c in zip(gradient, gradient100, gradient_outlier):
    print(f'{a:.4f}, {b:.4f}, {c:.4f}')

# gradient, gradient100, gradient_outlier
# nan,      nan,         nan
# 0.2953,   0.2953,      0.3202
# 0.3956,   0.3956,      0.4289
# -0.0159,  -0.0159,     -0.0173
# -1.9665,  -1.9665,     -2.1320
# 1.2403,   1.2403,      1.3447

The values will never match perfectly, but they should be roughly the same. And, of course, there will be a big difference where the outlier is.
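As a rough check of "roughly the same", the sketch below compares the IQR-scaled gradients with and without a single outlier. The helper name iqr_scaled_diff and the seeded data are assumptions for illustration. Away from the outlier, the two results differ only by the ratio of the two IQRs, which should stay close to 1.

```python
import numpy as np
import pandas as pd

def iqr_scaled_diff(s: pd.Series) -> pd.Series:
    """Step-to-step change divided by the interquartile range (Q3 - Q1)."""
    iqr = s.quantile(0.75) - s.quantile(0.25)
    return s.diff() / iqr

rng = np.random.default_rng(0)          # seeded so the example is repeatable
value = pd.Series(rng.random(100))
with_outlier = value.copy()
with_outlier.iloc[50] = 100.0           # one extreme point

g = iqr_scaled_diff(value)
g_out = iqr_scaled_diff(with_outlier)

# Ignore row 0 (NaN) and rows 50 and 51 (the jump to the outlier and back)
mask = ~g.index.isin([0, 50, 51])
ratio = (g_out[mask] / g[mask]).median()
print(round(ratio, 3))
```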

So the key message is that you can do something, but you need to make sure it suits your data.
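If an angle is wanted, as in the question, one possible option (an assumption here, not something the answers above prescribe) is to pass the normalised change through arctan. The sketch below uses the IQR as the scale factor and a hypothetical helper normalised_angle; the prices are rounded values from the rising stretch of the question's data. The result lies between -90 and 90 degrees, its sign gives the direction, and it does not change when the prices are rescaled. The angle still depends on the chosen scale factor, so it is only comparable between series normalised the same way.

```python
import numpy as np
import pandas as pd

def normalised_angle(close: pd.Series) -> pd.Series:
    """Angle in degrees of the IQR-normalised step-to-step change."""
    iqr = close.quantile(0.75) - close.quantile(0.25)
    return np.degrees(np.arctan(close.diff() / iqr))

# Rounded prices from a rising stretch of the question's data
prices = pd.Series([49270.33, 49293.66, 49340.00,
                    49404.80, 49450.45, 49528.40])

a = normalised_angle(prices)
b = normalised_angle(prices / 1000)     # same shape, prices 1000x smaller

scale_free = np.allclose(a.iloc[1:], b.iloc[1:])
print(scale_free)
```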

