Programming machine learning, compare two plotted lines with x y coordinates

So I have multiple paths stored; each path consists of data points x1, y1 | x2, y2 | x3, y3 ... etc.

I would like to compare these paths with one another to work out whether any similarities are present.

I could run through each point and see if it matched any of the points in the first path, then look to see if the next point matches the next point.

I think this would work if there were no anomalies, but a match could be skipped over if the next point did not match.

I would like to build in some level of tolerance, e.g. 10, 10 may match 12, 12 or 8, 8.

Is this a good way to compare the data, or is there a better approach?

As a second step I may want to consider time as a value too, so each point would have a time value associated with it.

Some possible approaches you can use:

  1. handle both paths as polygons and compare them as such

    see: How to compare two shapes?

  2. use OCR algorithms/approaches

    see: OCR and character similarity

  3. transform both paths to synchronized datasets and correlate

    either extract significant points only and/or resample the paths to the same point count. Then synchronize both datasets (as in bullet 1) and use a correlation coefficient
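Approach 3 could be sketched roughly as below. This is only one possible reading, not a definitive implementation: the function names (`resample`, `path_correlation`), the arc-length resampling, the default of 32 points, and the choice of a normalized correlation over centred coordinates are all my assumptions. It ignores rotation and assumes both paths are traversed in the same direction.

```python
import math

def resample(path, n):
    """Resample a polyline to n points evenly spaced along its arc length."""
    # Cumulative arc length at every original vertex.
    cum = [0.0]
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        cum.append(cum[-1] + math.hypot(x1 - x0, y1 - y0))
    total = cum[-1]
    if total == 0.0:
        return [path[0]] * n  # degenerate path: all points coincide
    out, seg = [], 0
    for i in range(n):
        target = total * i / (n - 1)
        # Advance to the segment that contains the target arc length.
        while seg < len(path) - 2 and cum[seg + 1] < target:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else min(1.0, (target - cum[seg]) / span)
        (x0, y0), (x1, y1) = path[seg], path[seg + 1]
        out.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
    return out

def _centred(points):
    """Subtract the centroid so that translation does not matter."""
    mx = sum(x for x, _ in points) / len(points)
    my = sum(y for _, y in points) / len(points)
    return [(x - mx, y - my) for x, y in points]

def path_correlation(p, q, n=32):
    """Normalized correlation of two paths after resampling (1.0 = same shape)."""
    cp, cq = _centred(resample(p, n)), _centred(resample(q, n))
    num = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(cp, cq))
    ep = sum(ax * ax + ay * ay for ax, ay in cp)
    eq = sum(bx * bx + by * by for bx, by in cq)
    if ep == 0.0 or eq == 0.0:
        return 0.0  # undefined for a single repeated point; treated as no match
    return num / math.sqrt(ep * eq)

a = [(0, 0), (10, 0), (10, 10)]
# Same shape as a, scaled by 2, shifted, and sampled at different points.
b = [(100, 50), (102, 50), (104, 50), (120, 50), (120, 70)]
c = [(0, 0), (0, 10), (10, 10)]
print(path_correlation(a, b), path_correlation(a, c))
```

A threshold on the returned value (say, anything above 0.9 counts as similar) would play the role of the tolerance asked about in the question; the right cut-off depends on the data.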

[notes]

Depending on the input data, you can also exploit DCT/DFT transforms to remove unimportant data (as in JPG compression) and/or compare in the frequency domain instead of the spatial/time domain.
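One common way to do a frequency-domain comparison is with Fourier descriptors; the sketch below is my illustration of that idea, not necessarily what was meant above. It assumes both paths are closed contours already sampled with the same number of evenly spaced points, and the names `fourier_descriptor`, `descriptor_distance` and the `keep` parameter are made up for the example. Dropping the zero-frequency term removes translation, taking magnitudes removes rotation, and keeping only the lowest `keep` frequencies discards fine detail, much like lossy compression.

```python
import cmath
import math

def dft(z):
    """Plain O(n^2) discrete Fourier transform of a list of complex numbers."""
    n = len(z)
    return [sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def fourier_descriptor(path, keep=4):
    """Magnitudes of the lowest non-zero frequencies, normalized for scale."""
    if len(path) <= 2 * keep:
        raise ValueError("need more than 2 * keep points")
    coeffs = dft([complex(x, y) for x, y in path])
    # Frequencies +1..+keep and -keep..-1; index 0 (the centroid) is dropped.
    low = coeffs[1:keep + 1] + coeffs[-keep:]
    mags = [abs(c) for c in low]
    scale = max(mags)
    if scale == 0.0:
        return mags  # all points coincide
    return [m / scale for m in mags]

def descriptor_distance(p, q, keep=4):
    """Euclidean distance between two descriptors (0.0 = same up to the invariances)."""
    dp, dq = fourier_descriptor(p, keep), fourier_descriptor(q, keep)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(dp, dq)))

n = 16
angles = [2 * math.pi * t / n for t in range(n)]
circle = [(math.cos(a), math.sin(a)) for a in angles]
ellipse = [(2 * math.cos(a), math.sin(a)) for a in angles]
# The same ellipse rotated by 90 degrees, scaled by 3 and shifted.
moved = [(5 - 3 * y, 7 + 3 * x) for x, y in ellipse]
print(descriptor_distance(ellipse, moved), descriptor_distance(circle, ellipse))
```

Because the phase is thrown away, different shapes can occasionally share a descriptor (mirror images, for instance), so this is better used as a quick filter than as a final decision.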

You can also compare obvious properties (invariant under rotation and translation) such as:

  1. area
  2. perimeter length
  3. number of self-intersections
  4. number of inflection points
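A minimal sketch of those four measures, assuming each path is a list of (x, y) tuples. The function names are mine; `area` and `perimeter` treat the path as a closed polygon, while the two counting functions treat it as an open polyline. Note that area and perimeter still change with scale, so normalize first if the paths can differ in size.

```python
import math

def area(poly):
    """Absolute area of a closed polygon (shoelace formula)."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def perimeter(poly):
    """Total edge length of a closed polygon."""
    return sum(math.hypot(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]))

def _orient(a, b, c):
    """Cross product of (b - a) and (c - a): > 0 if c is left of a->b."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def self_intersections(path):
    """Count proper crossings between non-adjacent segments (O(n^2))."""
    segs = list(zip(path, path[1:]))
    count = 0
    for i in range(len(segs)):
        for j in range(i + 2, len(segs)):
            (a, b), (c, d) = segs[i], segs[j]
            if (_orient(a, b, c) * _orient(a, b, d) < 0
                    and _orient(c, d, a) * _orient(c, d, b) < 0):
                count += 1
    return count

def inflection_count(path):
    """Count changes between left turns and right turns along the path."""
    turns = []
    for a, b, c in zip(path, path[1:], path[2:]):
        cross = _orient(a, b, c)
        if cross != 0:  # ignore collinear triples
            turns.append(cross > 0)
    return sum(1 for s, t in zip(turns, turns[1:]) if s != t)

rect = [(0, 0), (4, 0), (4, 3), (0, 3)]
bow = [(0, 0), (2, 2), (2, 0), (0, 2)]
s_curve = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 0)]
print(area(rect), perimeter(rect), self_intersections(bow), inflection_count(s_curve))
```

Two paths whose values differ a lot on any of these can be rejected cheaply before running a more expensive point-by-point comparison.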

You could compare the means and variances of the two sets of points. If they lie on straight lines, as you hypothesize, you could fit a straight line through each dataset and then compare the parameters of the two lines to infer how far apart they are. It would be more helpful if you could describe the behaviour of the two datasets.
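As a rough illustration of that suggestion (the names `summary`, `fit_line` and `parallel_gap` are invented for this sketch, and an ordinary least-squares fit of y on x is assumed, which breaks down for vertical lines):

```python
import math

def summary(points):
    """Mean and population variance of the x and y coordinates."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    vx = sum((x - mx) ** 2 for x, _ in points) / n
    vy = sum((y - my) ** 2 for _, y in points) / n
    return mx, my, vx, vy

def fit_line(points):
    """Least-squares fit y = slope * x + intercept (x must not be constant)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    if sxx == 0.0:
        raise ValueError("vertical line: fit x on y instead")
    slope = sxy / sxx
    return slope, my - slope * mx

def parallel_gap(line_a, line_b):
    """Perpendicular distance between two lines, using the first line's slope.

    Only meaningful when the two slopes are (nearly) equal.
    """
    (m, c1), (_, c2) = line_a, line_b
    return abs(c1 - c2) / math.sqrt(1.0 + m * m)

p = [(0, 1), (1, 3), (2, 5)]   # lies on y = 2x + 1
q = [(0, 3), (1, 5), (2, 7)]   # lies on y = 2x + 3
print(summary(p), fit_line(p), fit_line(q), parallel_gap(fit_line(p), fit_line(q)))
```

Similar slopes with a small gap suggest the two paths run along nearly the same line; comparing the variances as well gives a hint of how far along that line each path extends.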
