
Python - how can I find the angle of a line at a given point?

I'm dealing with simple OHLC time series data. Here is a sample:

2021-02-26 08:00:00  51491.322786
2021-02-26 12:00:00  51373.462137
2021-02-26 16:00:00  51244.591670
2021-02-26 20:00:00  51061.134204
2021-02-27 00:00:00  50985.592434
2021-02-27 04:00:00  50923.287370
2021-02-27 08:00:00  50842.103282
2021-02-27 12:00:00  50695.160604
2021-02-27 16:00:00  50608.462150
2021-02-27 20:00:00  50455.235146
2021-02-28 00:00:00  50177.377531
2021-02-28 04:00:00  49936.652091
2021-02-28 08:00:00  49860.396537
2021-02-28 12:00:00  49651.901082
2021-02-28 16:00:00  49625.153441
2021-02-28 20:00:00  49570.275193
2021-03-01 00:00:00  49531.874272
2021-03-01 04:00:00  49510.381676
2021-03-01 08:00:00  49486.289712
2021-03-01 12:00:00  49481.496645
2021-03-01 16:00:00  49469.806692
2021-03-01 20:00:00  49471.958606
2021-03-02 00:00:00  49462.095568
2021-03-02 04:00:00  49453.473575
2021-03-02 08:00:00  49438.986536
2021-03-02 12:00:00  49409.492007
2021-03-02 16:00:00  49356.563396
2021-03-02 20:00:00  49331.037118
2021-03-03 00:00:00  49297.823947
2021-03-03 04:00:00  49322.049974
2021-03-03 08:00:00  49461.314013
2021-03-03 12:00:00  49515.137712
2021-03-03 16:00:00  49571.990877
2021-03-03 20:00:00  49592.320461
2021-03-04 00:00:00  49592.249409
2021-03-04 04:00:00  49593.938380
2021-03-04 08:00:00  49593.055971
2021-03-04 12:00:00  49592.025698
2021-03-04 16:00:00  49585.661437
2021-03-04 20:00:00  49578.693824
2021-03-05 00:00:00  49543.067346
2021-03-05 04:00:00  49540.706794
2021-03-05 08:00:00  49513.586831
2021-03-05 12:00:00  49494.990328
2021-03-05 16:00:00  49493.807248
2021-03-05 20:00:00  49461.133698
2021-03-06 00:00:00  49432.770930
2021-03-06 04:00:00  49412.087821
2021-03-06 08:00:00  49368.106499
2021-03-06 12:00:00  49290.581114
2021-03-06 16:00:00  49272.222740
2021-03-06 20:00:00  49269.814982
2021-03-07 00:00:00  49270.328825
2021-03-07 04:00:00  49293.664209
2021-03-07 08:00:00  49339.999430
2021-03-07 12:00:00  49404.798067
2021-03-07 16:00:00  49450.447631
2021-03-07 20:00:00  49528.402294
2021-03-08 00:00:00  49571.353158
2021-03-08 04:00:00  49572.687451
2021-03-08 08:00:00  49597.518988
2021-03-08 12:00:00  49648.407014
2021-03-08 16:00:00  49708.063384
2021-03-08 20:00:00  49862.237773
2021-03-09 00:00:00  50200.833030
2021-03-09 04:00:00  50446.201489
2021-03-09 08:00:00  50727.063301
2021-03-09 12:00:00  50952.697141
2021-03-09 16:00:00  51152.798741
2021-03-09 20:00:00  51392.873289
2021-03-10 00:00:00  51472.273233
2021-03-10 04:00:00  51601.351944
2021-03-10 08:00:00  51759.387477
2021-03-10 12:00:00  52053.982892
2021-03-10 16:00:00  52437.071119
2021-03-10 20:00:00  52648.225156

I'm trying to find a way to measure how inclined or steep the line is at each point. Basically, I only need to know whether the line is going up, down or sideways, and by how much, so the ideal would be some sort of coefficient or number that tells me how steep the line is.

To do that, I had the idea of calculating the slope, so I tried the following code that I got from here:

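# Imports this snippet relies on (assumed: the code appears to come from pandas_ta,
# whose utils module provides get_offset and verify_series)
from numpy import arctan as npAtan
from numpy import pi as npPi
from pandas_ta.utils import get_offset, verify_series

def slope(close, length=None, as_angle=None, to_degrees=None, vertical=None, offset=None, **kwargs):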
    """Indicator: Slope"""
    # Validate arguments
    length = int(length) if length and length > 0 else 1
    as_angle = True if isinstance(as_angle, bool) else False
    to_degrees = True if isinstance(to_degrees, bool) else False
    close = verify_series(close, length)
    offset = get_offset(offset)

    if close is None: return

    # Calculate Result
    slope = close.diff(length) / length
    if as_angle:
        slope = slope.apply(npAtan)
        if to_degrees:
            slope *= 180 / npPi

    # Offset
    if offset != 0:
        slope = slope.shift(offset)

    # Handle fills
    if "fillna" in kwargs:
        slope.fillna(kwargs["fillna"], inplace=True)
    if "fill_method" in kwargs:
        slope.fillna(method=kwargs["fill_method"], inplace=True)

    # Name and Categorize it
    slope.name = f"SLOPE_{length}" if not as_angle else f"ANGLE{'d' if to_degrees else 'r'}_{length}"
    slope.category = "momentum"

    return slope 

Here is a sample of the output:

2021-02-26 08:00:00  51491.322786 -110.850644
2021-02-26 12:00:00  51373.462137 -117.860648
2021-02-26 16:00:00  51244.591670 -128.870468
2021-02-26 20:00:00  51061.134204 -183.457466
2021-02-27 00:00:00  50985.592434  -75.541770
2021-02-27 04:00:00  50923.287370  -62.305064
2021-02-27 08:00:00  50842.103282  -81.184088
2021-02-27 12:00:00  50695.160604 -146.942678
2021-02-27 16:00:00  50608.462150  -86.698454
2021-02-27 20:00:00  50455.235146 -153.227004
2021-02-28 00:00:00  50177.377531 -277.857615
2021-02-28 04:00:00  49936.652091 -240.725440
2021-02-28 08:00:00  49860.396537  -76.255553
2021-02-28 12:00:00  49651.901082 -208.495455
2021-02-28 16:00:00  49625.153441  -26.747641
2021-02-28 20:00:00  49570.275193  -54.878249
2021-03-01 00:00:00  49531.874272  -38.400921
2021-03-01 04:00:00  49510.381676  -21.492596
2021-03-01 08:00:00  49486.289712  -24.091964
2021-03-01 12:00:00  49481.496645   -4.793067
2021-03-01 16:00:00  49469.806692  -11.689953
2021-03-01 20:00:00  49471.958606    2.151914
2021-03-02 00:00:00  49462.095568   -9.863038
2021-03-02 04:00:00  49453.473575   -8.621994
2021-03-02 08:00:00  49438.986536  -14.487039
2021-03-02 12:00:00  49409.492007  -29.494528
2021-03-02 16:00:00  49356.563396  -52.928611
2021-03-02 20:00:00  49331.037118  -25.526278
2021-03-03 00:00:00  49297.823947  -33.213171
2021-03-03 04:00:00  49322.049974   24.226027
2021-03-03 08:00:00  49461.314013  139.264040
2021-03-03 12:00:00  49515.137712   53.823699
2021-03-03 16:00:00  49571.990877   56.853165
2021-03-03 20:00:00  49592.320461   20.329584
2021-03-04 00:00:00  49592.249409   -0.071052
2021-03-04 04:00:00  49593.938380    1.688971
2021-03-04 08:00:00  49593.055971   -0.882409
2021-03-04 12:00:00  49592.025698   -1.030273
2021-03-04 16:00:00  49585.661437   -6.364260
2021-03-04 20:00:00  49578.693824   -6.967614
2021-03-05 00:00:00  49543.067346  -35.626478
2021-03-05 04:00:00  49540.706794   -2.360551
2021-03-05 08:00:00  49513.586831  -27.119963
2021-03-05 12:00:00  49494.990328  -18.596504
2021-03-05 16:00:00  49493.807248   -1.183080
2021-03-05 20:00:00  49461.133698  -32.673550
2021-03-06 00:00:00  49432.770930  -28.362769
2021-03-06 04:00:00  49412.087821  -20.683109
2021-03-06 08:00:00  49368.106499  -43.981322
2021-03-06 12:00:00  49290.581114  -77.525385
2021-03-06 16:00:00  49272.222740  -18.358373
2021-03-06 20:00:00  49269.814982   -2.407758
2021-03-07 00:00:00  49270.328825    0.513843
2021-03-07 04:00:00  49293.664209   23.335384
2021-03-07 08:00:00  49339.999430   46.335221
2021-03-07 12:00:00  49404.798067   64.798637
2021-03-07 16:00:00  49450.447631   45.649564
2021-03-07 20:00:00  49528.402294   77.954663
2021-03-08 00:00:00  49571.353158   42.950863
2021-03-08 04:00:00  49572.687451    1.334294
2021-03-08 08:00:00  49597.518988   24.831537
2021-03-08 12:00:00  49648.407014   50.888026
2021-03-08 16:00:00  49708.063384   59.656369
2021-03-08 20:00:00  49862.237773  154.174389
2021-03-09 00:00:00  50200.833030  338.595257
2021-03-09 04:00:00  50446.201489  245.368460
2021-03-09 08:00:00  50727.063301  280.861811
2021-03-09 12:00:00  50952.697141  225.633840
2021-03-09 16:00:00  51152.798741  200.101599
2021-03-09 20:00:00  51392.873289  240.074549
2021-03-10 00:00:00  51472.273233   79.399943
2021-03-10 04:00:00  51601.351944  129.078712
2021-03-10 08:00:00  51759.387477  158.035533
2021-03-10 12:00:00  52053.982892  294.595415
2021-03-10 16:00:00  52437.071119  383.088226
2021-03-10 20:00:00  52648.225156  211.154038

This works, but the problem is that the slope depends heavily on the magnitude of the data I'm providing: lower prices give much lower slope values, and higher prices give higher ones. Since I'm performing some sort of analysis, I need something more "universal" that gives me the inclination of the line I'm plotting without depending on the magnitude of the data I'm using. Is it possible? Any kind of advice is appreciated.

I am not sure what you are trying to achieve, but finding the slope and angle of a series of points can be done in the following manner.

Suppose your dataframe is given by:

                Date       measure
0   2021-02-26 08:00  51491.322786
1   2021-02-26 12:00  51373.462137
2   2021-02-26 16:00  51244.591670
3   2021-02-26 20:00  51061.134204
4   2021-02-27 00:00  50985.592434
..               ...           ...
71  2021-03-10 04:00  51601.351944
72  2021-03-10 08:00  51759.387477
73  2021-03-10 12:00  52053.982892
74  2021-03-10 16:00  52437.071119
75  2021-03-10 20:00  52648.225156

which is exactly what you've posted. Then, you can define a function slope_and_angle as

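import numpy as np

y = range(len(df['measure']))  # positional index 0..n-1, used as the x values

def slope_and_angle(df):
    # The x-span is the same constant for every row: y[-2] - y[1] = n - 3.
    dx = y[-2] - y[1]
    dy = df['measure'].diff()  # change in 'measure' between consecutive rows
    # Note: 'slope' is dx / dy, i.e. the index span divided by the change in value.
    df['slope'] = dx / dy
    df['angle'] = np.rad2deg(np.arctan2(dx, dy))
    return df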

which returns:

            Date       measure     slope       angle
0   2021-02-26 08:00  51491.322786       NaN         NaN
1   2021-02-26 12:00  51373.462137 -0.619376  148.226940
2   2021-02-26 16:00  51244.591670 -0.566460  150.470170
3   2021-02-26 20:00  51061.134204 -0.397912  158.301777
4   2021-02-27 00:00  50985.592434 -0.966353  135.980320
..               ...           ...       ...         ...
71  2021-03-10 04:00  51601.351944  0.565546   29.490174
72  2021-03-10 08:00  51759.387477  0.461921   24.793227
73  2021-03-10 12:00  52053.982892  0.247797   13.917410
74  2021-03-10 16:00  52437.071119  0.190557   10.788745
75  2021-03-10 20:00  52648.225156  0.345719   19.071249

What you returned in your output example was just df['measure'].diff().
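To make that last point concrete, here is a minimal sketch, assuming `close` is a pandas Series holding the first three values from the question's sample. With `length=1`, the calculation in the quoted `slope` function reduces to `close.diff(1) / 1`, which is simply the bar-to-bar difference:

```python
import pandas as pd

# Three consecutive values copied from the question's sample data.
close = pd.Series([51491.322786, 51373.462137, 51244.591670])

length = 1
slope = close.diff(length) / length  # what the quoted function computes when length is 1
plain_diff = close.diff()            # plain first difference

# Series.equals treats NaNs in the same position as equal.
same = slope.equals(plain_diff)
print(same)
```

So the numbers in the question are price changes per bar, in price units, which is why they grow and shrink with the price level.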

There are things you can do, but nothing universal will always work well - you need to understand your data and choose something appropriate for your case.

For example, take 100 random numbers between 0 and 1, sampled every 4 hours:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'timestamp': pd.date_range("2021-01-01", periods=100, freq="4H"),
    'value': np.random.random(100)
})
# df:
#       timestamp               value
# 0     2021-01-01 00:00:00     0.780008
# 1     2021-01-01 04:00:00     0.689576
# 2     2021-01-01 08:00:00     0.700937
# 3     2021-01-01 12:00:00     0.756724
# 4     2021-01-01 16:00:00     0.928890
# etc

We can calculate the gradient quite easily:

differences = df.diff()
# total_seconds() gives the full gap; .dt.seconds drops the days part of longer gaps
gradient = 3600 * differences.value / differences.timestamp.dt.total_seconds()
# gradient: max value 0.1979, min value -0.2432
# 0         NaN
# 1    0.033912
# 2    0.045422
# 3   -0.001827
# 4   -0.225796

The gradient is the change in value per hour, ignoring any wrinkles such as missing values, repeated time points etc.
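If those wrinkles do matter, one possible way to handle them is sketched below. It assumes the same `timestamp` and `value` column names as above; `gradient_per_hour` is a hypothetical helper, not part of pandas. It uses `.dt.total_seconds()` so that gaps longer than a day are measured in full (`.dt.seconds` returns only the seconds component of a timedelta), and it turns zero-length gaps into NaN rather than dividing by zero.

```python
import numpy as np
import pandas as pd

def gradient_per_hour(df, time_col="timestamp", value_col="value"):
    """Change in value per hour between consecutive rows (hypothetical helper)."""
    # Stable sort so rows with identical timestamps keep their original order.
    df = df.sort_values(time_col, kind="mergesort")
    dt_hours = df[time_col].diff().dt.total_seconds() / 3600
    # Repeated timestamps give a gap of 0 hours: mask them as NaN.
    dt_hours = dt_hours.where(dt_hours > 0)
    return df[value_col].diff() / dt_hours

sample = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2021-01-01 00:00",
        "2021-01-01 04:00",
        "2021-01-02 08:00",  # 28-hour gap: some rows are missing
        "2021-01-02 08:00",  # repeated time point
    ]),
    "value": [1.0, 3.0, 10.0, 11.0],
})

grad = gradient_per_hour(sample)
print(grad)
```

Whether a repeated time point should become NaN, be dropped, or be averaged is a choice that depends on the data; NaN is used here only because it makes the problem visible.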

Now, as you observe, if the magnitude of these numbers increases, the gradient increases. For example, if I make value 100 times bigger:

df['value100'] = 100 * df.value
differences = df.diff()
gradient = 3600 * differences.value100 / differences.timestamp.dt.total_seconds()
print(gradient.max(), gradient.min())
# gradient: max value 19.79, min value -24.32
# 0          NaN
# 1     3.391221
# 2     4.542248
# 3    -0.182714
# 4   -22.579588

Here we see that the gradients are also 100 times bigger - exactly as would be expected.

This suggests that we could just divide by some number, but the question then becomes: what number should we use? This is where understanding your data is important.

One approach is to use the range of the data. This is similar to what you would see if you plotted a graph using matplotlib - the y scale would fit the maximum and minimum values. For example:

sf = df.value.max() - df.value.min()
sf100 = df.value100.max() - df.value100.min()
differences = df.diff()

gradient = differences.value / sf
gradient100 = differences.value100 / sf100

# gradient, gradient100
# nan,      nan
# 0.1379,   0.1379
# 0.1847,   0.1847
# -0.0074,  -0.0074
# -0.9184,  -0.9184

As you can see, the two gradients now match each other. This approach works well when there is a simple linear scaling.
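Since the question asks for an angle, one way to carry the plotting analogy further is to rescale both axes to the interval [0, 1], as a square plot would, and take the arctangent of each segment. This is only a sketch under that assumption: `plot_angle_degrees` is a hypothetical helper, it assumes evenly spaced samples, and the angles it returns depend on which window of data is used for the scaling.

```python
import numpy as np
import pandas as pd

def plot_angle_degrees(values):
    """Angle of each segment after scaling x and y to [0, 1] (hypothetical helper)."""
    values = pd.Series(values, dtype=float)
    # Scale y by the range of the data, as in the example above.
    y_scaled = (values - values.min()) / (values.max() - values.min())
    # Assumption: samples are evenly spaced, so x is just the position scaled to [0, 1].
    x_scaled = pd.Series(np.arange(len(values)) / (len(values) - 1), index=values.index)
    return np.degrees(np.arctan2(y_scaled.diff(), x_scaled.diff()))

prices = pd.Series([10.0, 12.0, 12.0, 8.0, 10.0])
angles = plot_angle_degrees(prices)
angles_x100 = plot_angle_degrees(prices * 100)  # same shape, 100 times the magnitude

print(pd.DataFrame({"angle": angles, "angle_x100": angles_x100}))
```

Positive angles mean the line is rising, negative angles mean it is falling, and values near zero mean it is moving sideways. Multiplying the prices by a constant leaves the angles unchanged.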

However, consider a different case - one where the extra range comes about because of an outlier.

df['value_outlier'] = df.value
df.loc[50, 'value_outlier'] = 100  # Just set the value at index 50 to 100

sf = df.value.max() - df.value.min()
sf_outlier = df.value_outlier.max() - df.value_outlier.min()

differences = df.diff()
gradient = differences.value / sf
gradient_outlier = differences.value_outlier / sf_outlier

# gradient, gradient_outlier
# nan,      nan
# 0.1379,   0.0014
# 0.1847,   0.0018
# -0.0074,  -0.0001
# -0.9184,  -0.0090
# 0.5792,   0.0057

This doesn't look so good. The reason is that we have inflated the range of value_outlier without changing the actual range between most of the points.

You can fix this - one approach is to use the interquartile range as the scale factor:

sf = df.value.quantile(0.75) - df.value.quantile(0.25)
sf100 = df.value100.quantile(0.75) - df.value100.quantile(0.25)
sf_outlier = df.value_outlier.quantile(0.75) - df.value_outlier.quantile(0.25)

differences = df.diff()
gradient = differences.value / sf
gradient100 = differences.value100 / sf100
gradient_outlier = differences.value_outlier / sf_outlier

for a, b, c in zip(gradient, gradient100, gradient_outlier):
    print(f'{a:.4f}, {b:.4f}, {c:.4f}')

# gradient, gradient100, gradient_outlier
# nan,      nan,         nan
# 0.2953,   0.2953,      0.3202
# 0.3956,   0.3956,      0.4289
# -0.0159,  -0.0159,     -0.0173
# -1.9665,  -1.9665,     -2.1320
# 1.2403,   1.2403,      1.3447

The values are never going to match perfectly, but they should be approximately the same. And, of course, you're going to have an enormous difference where that outlier is.
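The interquartile-range scaling can be wrapped in a small function so that it is easy to reuse. The sketch below is one possible packaging rather than a definitive implementation: `iqr_scaled_diff` is a hypothetical name, and the seeded random data stands in for the `value` column used above.

```python
import numpy as np
import pandas as pd

def iqr_scaled_diff(values):
    """First difference divided by the interquartile range (hypothetical helper)."""
    values = pd.Series(values, dtype=float)
    iqr = values.quantile(0.75) - values.quantile(0.25)
    if iqr == 0:
        # More than half of the data is a single value: this scale factor cannot be used.
        raise ValueError("interquartile range is zero; choose another scale factor")
    return values.diff() / iqr

rng = np.random.default_rng(0)   # seeded so that repeated runs give the same numbers
base = pd.Series(rng.random(100))
with_outlier = base.copy()
with_outlier.iloc[50] = 100.0    # same kind of outlier as in the example above

g = iqr_scaled_diff(base)
g100 = iqr_scaled_diff(base * 100)
g_out = iqr_scaled_diff(with_outlier)

print(pd.DataFrame({"g": g, "g100": g100, "g_out": g_out}).head())
```

The scaled and unscaled series give the same result, while the outlier shows up as one very large step at its own position instead of flattening every other value.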

So, the key message is that you can do something, but you need to make sure that it is appropriate for your data.


