How do I find the slope (m) for a line given a point (x,y) on the line and the line's angle from the y axis in python?
Python - how can i find the angle of a line at a given point?
I'm working with simple OHLC time series data. Here is a sample:
2021-02-26 08:00:00 51491.322786
2021-02-26 12:00:00 51373.462137
2021-02-26 16:00:00 51244.591670
2021-02-26 20:00:00 51061.134204
2021-02-27 00:00:00 50985.592434
2021-02-27 04:00:00 50923.287370
2021-02-27 08:00:00 50842.103282
2021-02-27 12:00:00 50695.160604
2021-02-27 16:00:00 50608.462150
2021-02-27 20:00:00 50455.235146
2021-02-28 00:00:00 50177.377531
2021-02-28 04:00:00 49936.652091
2021-02-28 08:00:00 49860.396537
2021-02-28 12:00:00 49651.901082
2021-02-28 16:00:00 49625.153441
2021-02-28 20:00:00 49570.275193
2021-03-01 00:00:00 49531.874272
2021-03-01 04:00:00 49510.381676
2021-03-01 08:00:00 49486.289712
2021-03-01 12:00:00 49481.496645
2021-03-01 16:00:00 49469.806692
2021-03-01 20:00:00 49471.958606
2021-03-02 00:00:00 49462.095568
2021-03-02 04:00:00 49453.473575
2021-03-02 08:00:00 49438.986536
2021-03-02 12:00:00 49409.492007
2021-03-02 16:00:00 49356.563396
2021-03-02 20:00:00 49331.037118
2021-03-03 00:00:00 49297.823947
2021-03-03 04:00:00 49322.049974
2021-03-03 08:00:00 49461.314013
2021-03-03 12:00:00 49515.137712
2021-03-03 16:00:00 49571.990877
2021-03-03 20:00:00 49592.320461
2021-03-04 00:00:00 49592.249409
2021-03-04 04:00:00 49593.938380
2021-03-04 08:00:00 49593.055971
2021-03-04 12:00:00 49592.025698
2021-03-04 16:00:00 49585.661437
2021-03-04 20:00:00 49578.693824
2021-03-05 00:00:00 49543.067346
2021-03-05 04:00:00 49540.706794
2021-03-05 08:00:00 49513.586831
2021-03-05 12:00:00 49494.990328
2021-03-05 16:00:00 49493.807248
2021-03-05 20:00:00 49461.133698
2021-03-06 00:00:00 49432.770930
2021-03-06 04:00:00 49412.087821
2021-03-06 08:00:00 49368.106499
2021-03-06 12:00:00 49290.581114
2021-03-06 16:00:00 49272.222740
2021-03-06 20:00:00 49269.814982
2021-03-07 00:00:00 49270.328825
2021-03-07 04:00:00 49293.664209
2021-03-07 08:00:00 49339.999430
2021-03-07 12:00:00 49404.798067
2021-03-07 16:00:00 49450.447631
2021-03-07 20:00:00 49528.402294
2021-03-08 00:00:00 49571.353158
2021-03-08 04:00:00 49572.687451
2021-03-08 08:00:00 49597.518988
2021-03-08 12:00:00 49648.407014
2021-03-08 16:00:00 49708.063384
2021-03-08 20:00:00 49862.237773
2021-03-09 00:00:00 50200.833030
2021-03-09 04:00:00 50446.201489
2021-03-09 08:00:00 50727.063301
2021-03-09 12:00:00 50952.697141
2021-03-09 16:00:00 51152.798741
2021-03-09 20:00:00 51392.873289
2021-03-10 00:00:00 51472.273233
2021-03-10 04:00:00 51601.351944
2021-03-10 08:00:00 51759.387477
2021-03-10 12:00:00 52053.982892
2021-03-10 16:00:00 52437.071119
2021-03-10 20:00:00 52648.225156
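I'm trying to find a way to measure how inclined or steep the line is at each point. Basically I just need to know whether the line is going up, down, or sideways, and by how much it is rising, so ideally I'd get some kind of coefficient or number that tells me how steep the line is.
To do this, I had the idea of calculating the slope, so I tried the following code, which I got from here: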
# Assumed imports: these helper names match the pandas_ta library,
# which this snippet appears to come from
from numpy import arctan as npAtan
from numpy import pi as npPi
from pandas_ta.utils import get_offset, verify_series


def slope(close, length=None, as_angle=None, to_degrees=None, vertical=None, offset=None, **kwargs):
    """Indicator: Slope"""
    # Validate arguments
    length = int(length) if length and length > 0 else 1
    # Note: isinstance(x, bool) is True for both True and False,
    # so passing False also switches these flags on
    as_angle = True if isinstance(as_angle, bool) else False
    to_degrees = True if isinstance(to_degrees, bool) else False
    close = verify_series(close, length)
    offset = get_offset(offset)

    if close is None: return

    # Calculate Result
    slope = close.diff(length) / length
    if as_angle:
        slope = slope.apply(npAtan)
        if to_degrees:
            slope *= 180 / npPi

    # Offset
    if offset != 0:
        slope = slope.shift(offset)

    # Handle fills
    if "fillna" in kwargs:
        slope.fillna(kwargs["fillna"], inplace=True)
    if "fill_method" in kwargs:
        slope.fillna(method=kwargs["fill_method"], inplace=True)

    # Name and Categorize it
    slope.name = f"SLOPE_{length}" if not as_angle else f"ANGLE{'d' if to_degrees else 'r'}_{length}"
    slope.category = "momentum"

    return slope
Here is a sample of the output:
2021-02-26 08:00:00 51491.322786 -110.850644
2021-02-26 12:00:00 51373.462137 -117.860648
2021-02-26 16:00:00 51244.591670 -128.870468
2021-02-26 20:00:00 51061.134204 -183.457466
2021-02-27 00:00:00 50985.592434 -75.541770
2021-02-27 04:00:00 50923.287370 -62.305064
2021-02-27 08:00:00 50842.103282 -81.184088
2021-02-27 12:00:00 50695.160604 -146.942678
2021-02-27 16:00:00 50608.462150 -86.698454
2021-02-27 20:00:00 50455.235146 -153.227004
2021-02-28 00:00:00 50177.377531 -277.857615
2021-02-28 04:00:00 49936.652091 -240.725440
2021-02-28 08:00:00 49860.396537 -76.255553
2021-02-28 12:00:00 49651.901082 -208.495455
2021-02-28 16:00:00 49625.153441 -26.747641
2021-02-28 20:00:00 49570.275193 -54.878249
2021-03-01 00:00:00 49531.874272 -38.400921
2021-03-01 04:00:00 49510.381676 -21.492596
2021-03-01 08:00:00 49486.289712 -24.091964
2021-03-01 12:00:00 49481.496645 -4.793067
2021-03-01 16:00:00 49469.806692 -11.689953
2021-03-01 20:00:00 49471.958606 2.151914
2021-03-02 00:00:00 49462.095568 -9.863038
2021-03-02 04:00:00 49453.473575 -8.621994
2021-03-02 08:00:00 49438.986536 -14.487039
2021-03-02 12:00:00 49409.492007 -29.494528
2021-03-02 16:00:00 49356.563396 -52.928611
2021-03-02 20:00:00 49331.037118 -25.526278
2021-03-03 00:00:00 49297.823947 -33.213171
2021-03-03 04:00:00 49322.049974 24.226027
2021-03-03 08:00:00 49461.314013 139.264040
2021-03-03 12:00:00 49515.137712 53.823699
2021-03-03 16:00:00 49571.990877 56.853165
2021-03-03 20:00:00 49592.320461 20.329584
2021-03-04 00:00:00 49592.249409 -0.071052
2021-03-04 04:00:00 49593.938380 1.688971
2021-03-04 08:00:00 49593.055971 -0.882409
2021-03-04 12:00:00 49592.025698 -1.030273
2021-03-04 16:00:00 49585.661437 -6.364260
2021-03-04 20:00:00 49578.693824 -6.967614
2021-03-05 00:00:00 49543.067346 -35.626478
2021-03-05 04:00:00 49540.706794 -2.360551
2021-03-05 08:00:00 49513.586831 -27.119963
2021-03-05 12:00:00 49494.990328 -18.596504
2021-03-05 16:00:00 49493.807248 -1.183080
2021-03-05 20:00:00 49461.133698 -32.673550
2021-03-06 00:00:00 49432.770930 -28.362769
2021-03-06 04:00:00 49412.087821 -20.683109
2021-03-06 08:00:00 49368.106499 -43.981322
2021-03-06 12:00:00 49290.581114 -77.525385
2021-03-06 16:00:00 49272.222740 -18.358373
2021-03-06 20:00:00 49269.814982 -2.407758
2021-03-07 00:00:00 49270.328825 0.513843
2021-03-07 04:00:00 49293.664209 23.335384
2021-03-07 08:00:00 49339.999430 46.335221
2021-03-07 12:00:00 49404.798067 64.798637
2021-03-07 16:00:00 49450.447631 45.649564
2021-03-07 20:00:00 49528.402294 77.954663
2021-03-08 00:00:00 49571.353158 42.950863
2021-03-08 04:00:00 49572.687451 1.334294
2021-03-08 08:00:00 49597.518988 24.831537
2021-03-08 12:00:00 49648.407014 50.888026
2021-03-08 16:00:00 49708.063384 59.656369
2021-03-08 20:00:00 49862.237773 154.174389
2021-03-09 00:00:00 50200.833030 338.595257
2021-03-09 04:00:00 50446.201489 245.368460
2021-03-09 08:00:00 50727.063301 280.861811
2021-03-09 12:00:00 50952.697141 225.633840
2021-03-09 16:00:00 51152.798741 200.101599
2021-03-09 20:00:00 51392.873289 240.074549
2021-03-10 00:00:00 51472.273233 79.399943
2021-03-10 04:00:00 51601.351944 129.078712
2021-03-10 08:00:00 51759.387477 158.035533
2021-03-10 12:00:00 52053.982892 294.595415
2021-03-10 16:00:00 52437.071119 383.088226
2021-03-10 20:00:00 52648.225156 211.154038
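This works, but the problem is that the slope depends heavily on the magnitude of the data I feed in: lower prices give lower slope values and higher prices give higher ones. Since I'm doing some analysis, I need something more "general" that tells me how steep the line I'm plotting is, regardless of the magnitude of the data I'm using. Is that possible? Any kind of advice is appreciated.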
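I'm not sure what you are trying to achieve, but you can find the slope and angle for a series of points in the following way.
Suppose your dataframe is given by: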
Date measure
0 2021-02-26 08:00 51491.322786
1 2021-02-26 12:00 51373.462137
2 2021-02-26 16:00 51244.591670
3 2021-02-26 20:00 51061.134204
4 2021-02-27 00:00 50985.592434
.. ... ...
71 2021-03-10 04:00 51601.351944
72 2021-03-10 08:00 51759.387477
73 2021-03-10 12:00 52053.982892
74 2021-03-10 16:00 52437.071119
75 2021-03-10 20:00 52648.225156
which is exactly what you posted. You can then define the function slope_and_angle as
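import numpy as np

y = range(len(df['measure']))  # row positions 0..len(df)-1, used as the x axis

def slope_and_angle(df):
    # Both columns are overwritten on every pass, so only the last
    # iteration survives: the numerator is the constant y[i-1] - y[1]
    # (73 for the 76 rows above), divided by the step-to-step change.
    # Note this is run over rise, not the usual rise over run.
    for i in y:
        df['slope'] = (y[i-1] - y[1]) / (df['measure'].diff())
        df['angle'] = np.rad2deg(np.arctan2(y[i-1] - y[1], df['measure'].diff()))
    return df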
which returns:
Date measure slope angle
0 2021-02-26 08:00 51491.322786 NaN NaN
1 2021-02-26 12:00 51373.462137 -0.619376 148.226940
2 2021-02-26 16:00 51244.591670 -0.566460 150.470170
3 2021-02-26 20:00 51061.134204 -0.397912 158.301777
4 2021-02-27 00:00 50985.592434 -0.966353 135.980320
.. ... ... ... ...
71 2021-03-10 04:00 51601.351944 0.565546 29.490174
72 2021-03-10 08:00 51759.387477 0.461921 24.793227
73 2021-03-10 12:00 52053.982892 0.247797 13.917410
74 2021-03-10 16:00 52437.071119 0.190557 10.788745
75 2021-03-10 20:00 52648.225156 0.345719 19.071249
What you are returning in your sample output is just df['measure'].diff().
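To illustrate that last point, here is a minimal sketch (my own, not taken from the library code in the question) using the first five values of the posted series. It assumes a run of one row per step, so rise over run reduces to `diff()`. The `angle_deg` column is only an illustration of `np.arctan` applied to that slope, and its value depends on the units chosen for both axes.

```python
import numpy as np
import pandas as pd

# First five values from the question's sample
measure = pd.Series([51491.322786, 51373.462137, 51244.591670,
                     51061.134204, 50985.592434])

length = 1                                 # look-back in rows (assumed)
slope = measure.diff(length) / length      # rise over run, run = 1 row
angle_deg = np.degrees(np.arctan(slope))   # angle of that slope in degrees

# With length=1 the slope is identical to a plain diff()
print(slope.equals(measure.diff()))
print(pd.DataFrame({"measure": measure, "slope": slope, "angle_deg": angle_deg}))
```

Because the raw differences are in the hundreds, every angle lands very close to plus or minus 90 degrees, which is another way of seeing why an unscaled slope is hard to interpret here.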
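There are things you can do, but there is no universal approach that will always work well. You need to understand your data and choose something that suits your situation.
For example, take 100 random numbers between 0 and 1, sampled every 4 hours: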
import numpy as np
import pandas as pd
df = pd.DataFrame({
    'timestamp': pd.date_range("2021-01-01", periods=100, freq="4H"),
    'value': np.random.random(100)
})
# df:
# timestamp value
# 0 2021-01-01 00:00:00 0.780008
# 1 2021-01-01 04:00:00 0.689576
# 2 2021-01-01 08:00:00 0.700937
# 3 2021-01-01 12:00:00 0.756724
# 4 2021-01-01 16:00:00 0.928890
# etc
We can calculate the gradient quite easily:
differences = df.diff()
gradient = 3600 * differences.value / differences.timestamp.dt.seconds
# gradient: max value 0.1979, min value -0.2432
# 0 NaN
# 1 0.033912
# 2 0.045422
# 3 -0.001827
# 4 -0.225796
The gradient is the change in value per hour, ignoring any wrinkles such as missing values, duplicated time points, and so on.
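One of those wrinkles is uneven spacing between samples. The sketch below is one possible way to handle it, not part of the original answer: the helper name `gradient_per_hour` and the three-row `demo` frame are made up for illustration. It uses `.dt.total_seconds()`, which unlike `.dt.seconds` also counts whole days in a gap.

```python
import pandas as pd

def gradient_per_hour(frame, time_col="timestamp", value_col="value"):
    """Change in value per hour between consecutive rows."""
    # total_seconds() keeps working when a gap is one day or longer,
    # whereas .dt.seconds only returns the sub-day component
    elapsed_hours = frame[time_col].diff().dt.total_seconds() / 3600
    return frame[value_col].diff() / elapsed_hours

# Hypothetical data with uneven gaps: 4 hours, then 32 hours
demo = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-01-01 00:00",
                                 "2021-01-01 04:00",
                                 "2021-01-02 12:00"]),
    "value": [0.0, 4.0, 20.0],
})
demo_gradient = gradient_per_hour(demo)
print(demo_gradient)   # NaN, then 1.0 per hour, then 0.5 per hour
```

Duplicated timestamps would still produce a division by zero here, so they need to be dropped or aggregated first.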
Now, as you observed, if the magnitude of the numbers increases, the gradient increases too. For example, if I scale value up by a factor of 100:
df['value100'] = 100 * df.value
differences = df.diff()
gradient = 3600 * differences.value100 / differences.timestamp.dt.seconds
print(gradient.max(), gradient.min())
# gradient: max value 19.79, min value -24.32
# 0 NaN
# 1 3.391221
# 2 4.542248
# 3 -0.182714
# 4 -22.579588
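Here we see that the gradient is also 100 times larger, as expected.
This suggests that we could just divide by some number, but the question then becomes: which number? This is where understanding the data matters.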
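One approach is to use the range of the data. This is similar to what you see when you plot a chart with matplotlib: the y scale is fitted to the maximum and minimum values. For example: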
sf = df.value.max() - df.value.min()
sf100 = df.value100.max() - df.value100.min()
differences = df.diff()
gradient = differences.value / sf
gradient100 = differences.value100 / sf100
# gradient, gradient100
# nan, nan
# 0.1379, 0.1379
# 0.1847, 0.1847
# -0.0074, -0.0074
# -0.9184, -0.9184
As you can see, the two gradients now match each other. This approach works well when there is simple linear scaling.
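The same idea can be wrapped in a small helper. This is only a sketch: the function name `range_scaled_diff`, the seed, and the shifted series are my own assumptions. It checks that the scaled differences are unchanged when the series is multiplied by a constant, and also when a constant is added, since `diff()` cancels any offset.

```python
import numpy as np
import pandas as pd

def range_scaled_diff(values):
    """Step-to-step change divided by the series' max-min range."""
    scale = values.max() - values.min()
    return values.diff() / scale

rng = np.random.default_rng(0)           # seed chosen arbitrarily
base = pd.Series(rng.random(100))
scaled = 100 * base                      # same shape, 100 times larger
shifted = base + 50_000                  # same shape, moved up

g_base = range_scaled_diff(base)
# Skip row 0, which is NaN in every result
print(np.allclose(g_base[1:], range_scaled_diff(scaled)[1:]))
print(np.allclose(g_base[1:], range_scaled_diff(shifted)[1:]))
```

Note that the result depends on which window of data you take the range over, so values computed over different windows are not directly comparable.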
However, consider another case, where the range is extended by an outlier.
df['value_outlier'] = df.value
df.loc[50, 'value_outlier'] = 100 # Just set the 50th value to 100
sf = df.value.max() - df.value.min()
sf_outlier = df.value_outlier.max() - df.value_outlier.min()
differences = df.diff()
gradient = differences.value / sf
gradient_outlier = differences.value_outlier / sf_outlier
# gradient, gradient_outlier
# nan, nan
# 0.1379, 0.0014
# 0.1847, 0.0018
# -0.0074, -0.0001
# -0.9184, -0.0090
# 0.5792, 0.0057
This doesn't look so good. The reason is that we have inflated the range of value_outlier without changing the actual range spanned by most of the points.
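To put a number on that inflation, here is a small sketch under the same setup (random values between 0 and 1, one point overwritten with 100). The seed and variable names are arbitrary choices of mine.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)           # seed chosen arbitrarily
clean = pd.Series(rng.random(100))       # values between 0 and 1
with_outlier = clean.copy()
with_outlier.iloc[50] = 100              # one extreme point

range_clean = clean.max() - clean.min()
range_outlier = with_outlier.max() - with_outlier.min()

# The range grows by roughly two orders of magnitude, so every
# range-scaled gradient shrinks by the same factor
print(range_clean, range_outlier, range_outlier / range_clean)
```

A single point is enough to do this, because the range only looks at the two most extreme values.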
You can work around this. One way is to use the interquartile range as the scale factor:
sf = df.value.quantile(0.75) - df.value.quantile(0.25)
sf100 = df.value100.quantile(0.75) - df.value100.quantile(0.25)
sf_outlier = df.value_outlier.quantile(0.75) - df.value_outlier.quantile(0.25)
differences = df.diff()
gradient = differences.value / sf
gradient100 = differences.value100 / sf100
gradient_outlier = differences.value_outlier / sf_outlier
for a, b, c in zip(gradient, gradient100, gradient_outlier):
    print(f'{a:.4f}, {b:.4f}, {c:.4f}')
# gradient, gradient100, gradient_outlier
# nan, nan, nan
# 0.2953, 0.2953, 0.3202
# 0.3956, 0.3956, 0.4289
# -0.0159, -0.0159, -0.0173
# -1.9665, -1.9665, -2.1320
# 1.2403, 1.2403, 1.3447
The values will never match perfectly, but they should be roughly the same. And, of course, there will be a big difference where the outlier is.
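As a rough check of "roughly the same", the sketch below wraps the interquartile scaling in a helper and compares the results away from the outlier. The helper name `iqr_scaled_diff`, the seed, and the choice to exclude rows 50 and 51 (the two differences that touch the outlier) are assumptions for illustration.

```python
import numpy as np
import pandas as pd

def iqr_scaled_diff(values):
    """Step-to-step change divided by the interquartile range."""
    iqr = values.quantile(0.75) - values.quantile(0.25)
    return values.diff() / iqr

rng = np.random.default_rng(0)           # seed chosen arbitrarily
base = pd.Series(rng.random(100))
with_outlier = base.copy()
with_outlier.iloc[50] = 100              # one extreme point

g_base = iqr_scaled_diff(base)
g_scaled = iqr_scaled_diff(100 * base)
g_outlier = iqr_scaled_diff(with_outlier)

# Row 0 is NaN; rows 50 and 51 involve the outlier, so compare the rest
mask = ~g_base.index.isin([0, 50, 51])
ratio = (g_outlier[mask] / g_base[mask]).median()

print(np.allclose(g_base[1:], g_scaled[1:]))   # pure scaling cancels out
print(ratio)                                   # close to 1 despite the outlier
```

If the interquartile range is zero, for example in a mostly flat series, this helper divides by zero and a different scale factor is needed.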
So the key message is that you can do something, but you need to make sure it suits your data.