Measure variation of data points from a line; To Catch a Dip
(Update: I posted the solution and code as an answer rather than edit the question again.)
The ideal line (dashed red) starts at the first data point and adds the average rise at each angle of measurement; I obtain that rise by averaging. The measured test data is in black. How can I quantify the area of the dip in blue? The X-axis is unitized, so slopes and math are simplified.
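As a minimal sketch of the quantity being asked about: build the ideal line from the first reading plus the average rise per step, then sum the signed gaps between the readings and that line over the span of the dip. The function names, the int sample type, and passing avgRise in as a parameter are assumptions for illustration, not code from the test bench.

```c
#include <stddef.h>

/* Hypothetical helper: signed gap between reading i and the ideal line.
 * The ideal line starts at data[0] and climbs avgRise per step
 * (the X-axis is unitized, so one step is one unit of X). */
static float deviationFromIdeal(const int *data, size_t i, float avgRise)
{
    float ideal = (float)data[0] + avgRise * (float)i;
    return (float)data[i] - ideal;
}

/* Sum the deviations over [first, last]. With unit X spacing this
 * approximates the signed area between the plot and the ideal line;
 * a dip below the line gives a negative result. */
static float areaBetween(const int *data, size_t first, size_t last,
                         float avgRise)
{
    float area = 0.0f;
    for (size_t i = first; i <= last; i++) {
        area += deviationFromIdeal(data, i, avgRise);
    }
    return area;
}
```

For readings 0, 2, 3, 4, 8 with an average rise of 2, the ideal line is 0, 2, 4, 6, 8, so the deviations are 0, 0, -1, -2, 0 and the area is -3.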
I could determine a cutoff for the size of areas like this and then flag the part for retesting or failure. Rarely, there is another dip that appears closer to the right, but setting a cutoff value for standard deviation usually fails those parts.
Diego's answer helped me visualize this. Now that I can see what I'm trying to do, I'll work on the algorithm to implement the "homemade dip detector". :)
I created a test bench to test throttle position sensors I'm selling. I'm trying to programmatically quantify how straight the plot is by analyzing the data collected. This one particular model is vexing me.
Sample plot of a part I prefer not to sell:
The X axis is evenly spaced angles of throttle opening. The stepper motor turns the input shaft, stopping every 0.75° to measure the output on a 10 bit ADC, which gets translated to the Y axis. The plot is the translation of data[idx] to idx,value mapped to (x,y) bitmap coordinates. Then I draw lines between the points within the bitmap using Bresenham's algorithm.
My other TPS products produce amazingly linear output.
The lower (left) portion of the plot is crucial to normal usage of any motor vehicle; it's where you are when driving around town, entering parking lots, etc. This particular part has a tendency to develop a dip around 15° opening, and I wish to use the program to quantify this "dip" in the curve and rely less upon the tester's intuition. In the above example, the plot dips but doesn't return to what an ideal line might be.
Even though this is an embedded application, printing the report takes 10 seconds, so I do not consider stepping through an array of 120 data points multiple times a waste of cycles. Also, since I'm using a uC32 PIC32 microcontroller, there's plenty of memory, so I have the luxury of being able to ponder this problem within the controller.
Array of rise between test points: I dismiss the X-axis entirely, considering it unitized, and then make an array of the change from one reading to the next. This array is what feeds the report's "Min rise between points: 0 Max: 14". I call this array deltas.
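The deltas array and its average can be sketched as below. This is an illustration only: the function names and the int element type are assumptions, and the average computed here stands in for whatever the test bench stores as its average rise.

```c
#include <stddef.h>

/* Fill deltas[i] = data[i + 1] - data[i].
 * Returns the number of deltas written (count - 1, or 0 if count < 2). */
static size_t computeDeltas(const int *data, size_t count, int *deltas)
{
    if (count < 2) {
        return 0;
    }
    for (size_t i = 0; i + 1 < count; i++) {
        deltas[i] = data[i + 1] - data[i];
    }
    return count - 1;
}

/* Arithmetic mean of the deltas: the average rise per step. */
static float averageRise(const int *deltas, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += deltas[i];
    }
    return n ? (float)sum / (float)n : 0.0f;
}
```

For readings 10, 15, 22, 22, 30 the deltas are 5, 7, 0, 8 and the average rise is 5.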
I've tried using standard deviation on deltas; however, during testing I have found that a low Std Dev is not a reliable measure for this part. If the dip quickly returns to the original line implied by early data points, the Std Dev can be deceptively low (observed to be as low as 2.3), but the part is still something I wouldn't want to use. I tried setting a cutoff at 2.6, but it failed too many parts with great plots. The other, more linear part mentioned above can reliably count on Std Dev for quality.
Kurtosis seems not to apply to this situation at all. I learned of Kurtosis today and found a Statistics Library which includes Kurtosis and Skewness. During continued testing, I found that neither measure showed a trend in sign or amplitude that would correspond to either passing or failing. That same gentleman has shared a linear regression library, but I believe Lin Reg is unrelated to my situation, as I am comfortable with the assumption that the AVG of deltas gives my ideal line. Linear Regression and R^2 are more for finding a line from less ideal data or much larger sets.
Comparing each delta to AVG and Std Dev: I set up a monitor to check each delta against the final average of the deltas. Here, too, I couldn't find a reliable metric. Too many good parts would not pass a test restricting every delta to within 2x Std Dev of the Average. Ultimately, the only limit I could settle on is that each delta's difference from the AVG must be within AVG+Std Dev. Anything more restrictive would fail otherwise good parts. And the elusive dip around 15° opening can sneak through this test.
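The AVG+Std Dev limit can be sketched as below. The function name, the int delta type, and receiving the average and standard deviation as parameters are assumptions for illustration.

```c
#include <stddef.h>

/* Count deltas whose difference from the average exceeds
 * (average + standard deviation) in either direction. */
static int countSpikes(const int *deltas, size_t n, float avg, float stdDev)
{
    const float limit = avg + stdDev;
    int spikes = 0;
    for (size_t i = 0; i < n; i++) {
        const float diff = (float)deltas[i] - avg;
        if (diff > limit || diff < -limit) {
            spikes++;
        }
    }
    return spikes;
}
```

With the values from the sample output further down (average 6.75, Std Dev 2.577), the limit is 9.327, so a delta of 15 (difference 8.25) passes and a delta of 17 (difference 10.25) is counted. A delta of 0 differs from the average by only -6.75, so no non-negative delta can ever trip the lower bound. That fits the observation that the dip sneaks through this test.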
Homemade dip detector: When feeding deltas to the serial monitor of the computer, I observed consecutive negative deltas during the dip, so I programmed in a dip detector, but it feels very crude to me. If there are 5 or more negative deltas in a row, I sum them. I have seen that if I take the sum of the dip's differences from AVG and divide by the number of negative deltas, a value over 2.9 or 3 could mean a fail. I have observed dips lasting from 6 to 15 deltas. Readily observable dips would have their differences from AVG sum up to -35.
Trending accumulated variation from the AVG: The above made me think that watching the summation of deltas as it wanders away from AVG could be the answer. Meaning, I step through the array and sum the differences of each delta from AVG. I thought I was on to something until a good part blew this theory. The trend I was seeing was that the fewer times the running sum strayed by more than 2x AVG, the straighter the line appeared. Many ideal parts would show only 8 or fewer delta points where sumOfDiffs would stray very far from the AVG.
    float sumOfDiffs=0.0;
    for( int idx=0; idx<stop; idx++ ){
        float spread = deltas[idx] - line->AdcAvgRise;
        sumOfDiffs = sumOfDiffs + spread;
        ...
        testVal = 2*line->AdcAvgRise;
        if( sumOfDiffs > testVal || sumOfDiffs < -testVal ){
            flag = 'S';
        }
        ...
    }
And then a part with a fantastic linear plot came through with 58 data points where sumOfDiffs was more than twice the AVG! I find this amazing, as at the end of the ~120 data points, the sumOfDiffs value is -0.000057. During testing, the final sumOfDiffs result would often register as 0.000000, and only on exceptionally bad parts would it be greater than .000100. I found this quite surprising, actually: how a "bad part" can accumulate such great accuracy.
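The near-zero final value is what the arithmetic predicts rather than a sign of accuracy, assuming the average rise is the arithmetic mean of the same n deltas that the loop sums. The symbols d_i (a delta), \bar{d} (the average) and n are introduced here only for this derivation:

```latex
\bar{d} = \frac{1}{n}\sum_{i=1}^{n} d_i
\quad\Longrightarrow\quad
\sum_{i=1}^{n}\left(d_i - \bar{d}\right)
  = \sum_{i=1}^{n} d_i - n\,\bar{d}
  = 0
```

Under that assumption the final sumOfDiffs is zero for any part, good or bad. Values such as -0.000057 would then be float rounding error, and only the intermediate values of the running sum say anything about the shape of the plot.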
Sample output from monitoring sumOfDiffs: The output below shows a dip happening. Over the whole test, the monitor flags (S) each point where the running sumOfDiffs is more than 2x the AVG away from the AVG. This dip lasts from deltas idx 23 through 49; it starts at 17.25° and lasts for 19.5°.
Avg rise: 6.75 Std dev: 2.577
idx: delta diff from avg sumOfDiffs Flag
23: 5 -1.75 -14.05 S
24: 6 -0.75 -14.80 S
25: 7 0.25 -14.55 S
26: 5 -1.75 -16.30 S
27: 3 -3.75 -20.06 S
28: 3 -3.75 -23.81 S
29: 7 0.25 -23.56 S
30: 4 -2.75 -26.31 S
31: 2 -4.75 -31.06 S
32: 8 1.25 -29.82 S
33: 6 -0.75 -30.57 S
34: 9 2.25 -28.32 S
35: 8 1.25 -27.07 S
36: 5 -1.75 -28.82 S
37: 15 8.25 -20.58 S
38: 7 0.25 -20.33 S
39: 5 -1.75 -22.08 S
40: 9 2.25 -19.83 S
41: 10 3.25 -16.58 S
42: 9 2.25 -14.34 S
43: 3 -3.75 -18.09 S
44: 6 -0.75 -18.84 S
45: 11 4.25 -14.59 S
47: 3 -3.75 -16.10 S
48: 8 1.25 -14.85 S
49: 8 1.25 -13.60 S
Final Sum of diffs: 0.000030
RunningStats analysis:
NumDataValues= 125
Mean= 6.752
StandardDeviation= 2.577
Skewness= 0.251
Kurtosis= -0.277
Sobering note about quality: what started me on this journey was learning that major automotive OEM suppliers consider a 4 point test to be the standard measure for these parts. My first test bench used an Arduino with 8k of RAM, had neither a TFT display nor a printer, and had a mechanical resolution of only 3°! Back then I simply tested that the deltas were within arbitrary total bounds and chose a limit on how big any single delta could be. My 120+ point test feels high class compared to that 30 point test from before, but that test had no idea about these dips.
(Fragments of an answer, apparently Diego's:) Compute the deviation from the straight line, Y_dev = Y_data - Y_straight (or an expression that is mathematically the same), with this procedure:
Start with PositiveMax = 0; NegativeMax = 0; and an accumulator tmp_Area.
tmp_Area = Y_dev; resets the accumulator to the current value, starting a new accumulation this way.
If Y_dev is lower than the threshold, you do not accumulate it.
It turns out the result of my gut feeling and Diego's method is an average of the integral. I still don't like that name, so I have described the algorithm and asked on Math.SE what to call this; the question got migrated to "Cross Validated", Stats.SE.
I updated the graphs after a massive edit of my Math.SE question. It turns out I'm taking the average of a closed integral of the derivative of the data. :P First, we gather the data:
Next is the "derivative": step through the original data array to form the deltas array, which is the rise of ADC values from one 0.75° step to the next. "Rise" or "slope" is what the derivative is: dy/dx.
With the "slope" or average leveled out, I can find multiple negative deltas in a row, sum them, then divide by the count at the end of the dip. The sum is an integral of the area between the average and the deltas, and when the dip goes back positive, I can divide the sum by the count of deltas in the dip.
During testing, I came up with a cutoff value of 2.6 for this average of the integral. That turned out to be a great match for my "gut instinct" when looking at a plot and judging a part good or bad.
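Written out, the "average of the integral" for one dip looks like the expression below. The symbols are introduced here for illustration: d_i is a delta, \bar{d} the average rise, a the index where the dip starts, and k the number of consecutive below-average deltas. The five deltas in the worked line are made up; 6.75 is the average rise from the sample output and 2.6 is the cutoff stated above.

```latex
\text{score} = -\frac{1}{k}\sum_{i=a}^{a+k-1}\left(d_i - \bar{d}\right),
\qquad k \ge 5

% hypothetical dip: d = 5, 3, 3, 4, 2 with \bar{d} = 6.75
\text{score}
  = -\frac{(-1.75) + (-3.75) + (-3.75) + (-2.75) + (-4.75)}{5}
  = \frac{16.75}{5}
  = 3.35 > 2.6
```

A dip like that hypothetical one would be rejected, while a run of five deltas each only 1 below the average would score 1.0 and pass.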
In case someone else finds themselves trying to quantify this, here's the code I implemented. Note that it is only looking for negative dips. Also, dipCountLimit is defined elsewhere as 5. In addition to the dip detector/accumulator (i.e. numerical integrator), I also have a spike detector that flags the test as bad if any data point strays from the average by more than average + standard deviation. AVG+STD DEV as a spike limit was chosen arbitrarily, based on the observed plots of the parts it would fail.
    int dipdx=0;
    // inDipFlag also counts the length of this dip
    int inDipFlag=0;
    // one accumulator per recorded dip; dipdx indexes the dip in progress
    float dips[140] = { 0.0 };
    for( int idx=0; idx<stop; idx++ ){
        const float diffFromAvg = deltas[idx] - line->AdcAvgRise;
        // state machine to monitor dips
        // (the last delta has no "next" point, so it is never examined)
        const int _stop = stop-1;
        if( diffFromAvg < 0 && idx < _stop ) {
            // check NEXT data point for negative diff & set dipFlag to put state in dip
            const float nextDiff = deltas[idx+1] - line->AdcAvgRise;
            if( nextDiff < 0 && inDipFlag == 0 )
                inDipFlag = 1;
            // already IN a dip, and next diff is negative
            if( nextDiff < 0 && inDipFlag > 0 ) {
                inDipFlag++;
            }
            // accumulate this dip
            dips[dipdx]+= diffFromAvg;
            // next data point ends this dip and we advance dipdx to next dip.
            // Testing nextDiff >= 0 without requiring inDipFlag > 0 also clears
            // a lone negative delta, which would otherwise stay in the
            // accumulator and leak into the next dip.
            if( nextDiff >= 0 ) {
                if( inDipFlag < dipCountLimit ){
                    // reset the accumulator, do not advance dipdx to next entry
                    dips[dipdx]=0.0;
                } else {
                    // change this entry's value from dip sum to its ratio
                    dips[dipdx] = -dips[dipdx]/inDipFlag;
                    // advance dipdx to next entry
                    dipdx++;
                }
                // Next diff isn't negative, so the dip is done
                inDipFlag = 0;
            }
        }
    }
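Once the loop has finished, dips[0] through dips[dipdx-1] hold one score per recorded dip, and the remaining step is to compare them with the cutoff. A minimal sketch of that step follows. The function name is hypothetical, and using a strict greater-than comparison against the 2.6 cutoff is an assumption.

```c
#include <stddef.h>

/* Returns 1 if any recorded dip score exceeds the cutoff, otherwise 0. */
static int anyDipOverCutoff(const float *dips, size_t dipCount, float cutoff)
{
    for (size_t i = 0; i < dipCount; i++) {
        if (dips[i] > cutoff) {
            return 1;
        }
    }
    return 0;
}
```

With the variables from the code above, the call would look like anyDipOverCutoff(dips, dipdx, 2.6f).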