
Touch gesture recognition for phones without neural networks

Original post: 2011-10-02 22:31:47 | tags: c#, gesture-recognition

I'm developing a gesture recognition program for a phone. I want users to be able to draw their own "patterns" and then have each pattern trigger a different action.

Storing the pattern - the "pattern save" algorithm, as I call it

This is when the gesture is originally drawn and recorded. It is also the algorithm I use to capture what the user draws later, so that it can be used for comparison:

1. The user starts drawing a pattern. For every 15 pixels, a point is placed in a list referred to as "the list".
2. Once the pattern has been drawn, the first and last points are removed from the list.
3. For the points now in the list, the connections between consecutive points are converted into a direction enumeration (containing 8 directions), which is then added to a list as well, now referred to as "the list".
4. Filter 1 begins, going through the list 3 directions at a time. If the left direction is the same as the right direction, the middle direction is removed.
5. Filter 2 begins, removing duplicate directions.
6. Filter 3 begins, removing assumed noise. Assumed noise is detected as pairs of directions that occur again and again (as an example, "left upper-left left upper-left" is turned into "upper-left" or "left").
7. Filter 4 begins, removing even more assumed noise. This time assumed noise is detected by (again) comparing 3 directions at a time in the list, as in step 4 (Filter 1), except that the directions are not checked for being exactly equal, only almost equal (as an example, "left" is almost equal to "upper-left" and "lower-left").
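The sampling, the direction conversion, and Filters 1 and 2 can be sketched roughly as follows. This is only one reading of the description, not the asker's actual code: the function names are made up for illustration, "every 15 pixels" is assumed to mean straight-line distance from the last kept point, "duplicate directions" is assumed to mean consecutive repeats, and Filters 3 and 4 are left out.

```python
import math

# Eight compass directions, indexed counter-clockwise starting from "right".
DIRECTIONS = ["right", "upper-right", "up", "upper-left",
              "left", "lower-left", "down", "lower-right"]

def sample_points(stroke, spacing=15.0):
    """Keep a point each time the pen is `spacing` pixels away from the last kept point."""
    if not stroke:
        return []
    kept = [stroke[0]]
    for x, y in stroke[1:]:
        lx, ly = kept[-1]
        if math.hypot(x - lx, y - ly) >= spacing:
            kept.append((x, y))
    return kept

def to_directions(points):
    """Quantize each segment between consecutive points into one of 8 directions."""
    result = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Screen y grows downward, so dy is negated to make "up" mean up on screen.
        angle = math.atan2(-(y1 - y0), x1 - x0)
        sector = round(angle / (math.pi / 4)) % 8
        result.append(DIRECTIONS[sector])
    return result

def filter_sandwiched(dirs):
    """Filter 1 (assumed reading): drop a direction whose two neighbours are equal."""
    out = list(dirs)
    i = 1
    while i < len(out) - 1:
        if out[i - 1] == out[i + 1]:
            del out[i]      # re-check the same index against its new neighbours
        else:
            i += 1
    return out

def filter_repeats(dirs):
    """Filter 2 (assumed reading): collapse runs of the same direction into one."""
    out = []
    for d in dirs:
        if not out or out[-1] != d:
            out.append(d)
    return out

def pattern_save(stroke, spacing=15.0):
    """Sample, drop the first and last point, quantize, then apply Filters 1 and 2."""
    points = sample_points(stroke, spacing)[1:-1]
    return filter_repeats(filter_sandwiched(to_directions(points)))
```

For example, a straight horizontal stroke from x = 0 to x = 90 reduces to the single direction "right".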

The list of directions is now stored in a file. The directions list is saved as the gesture itself and is used for comparison later.

Comparing the pattern

When a user later draws a pattern, the "pattern save" algorithm is run on that pattern as well (but only to filter out noise, not to actually save it, since that would be stupid).

This filtered pattern is then compared with all current patterns in the gesture list. The comparison method is quite complex to describe, and I'm not as good at English as I should be.

In short, it goes through the gesture that the user drew and, for each direction in this gesture, compares it with the directions of all the other gestures. If a direction is similar (in the "almost equal" sense used in the algorithm above), that's okay, and it continues to check the next direction. If it's not similar 2 times in a row, it is considered a non-match.
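A minimal sketch of that comparison rule is shown below. It makes several assumptions the description does not confirm: the names `is_similar` and `matches` are hypothetical, directions are compared position by position, "similar" means equal or adjacent on the 8-direction compass, and gestures of different lengths are rejected outright.

```python
# Eight compass directions, indexed counter-clockwise starting from "right".
DIRECTIONS = ["right", "upper-right", "up", "upper-left",
              "left", "lower-left", "down", "lower-right"]

def is_similar(a, b):
    """True if the directions are equal or neighbours on the compass (e.g. left vs upper-left)."""
    diff = abs(DIRECTIONS.index(a) - DIRECTIONS.index(b))
    return min(diff, 8 - diff) <= 1   # wrap around: "right" is next to "lower-right"

def matches(drawn, stored):
    """Compare two direction lists; two dissimilar directions in a row mean no match."""
    if len(drawn) != len(stored):
        return False   # assumption: the description does not say how lengths are handled
    misses_in_a_row = 0
    for a, b in zip(drawn, stored):
        if is_similar(a, b):
            misses_in_a_row = 0
        else:
            misses_in_a_row += 1
            if misses_in_a_row == 2:
                return False
    return True

def find_matches(drawn, gestures):
    """Return the names of all stored gestures that the drawn direction list matches."""
    return [name for name, stored in gestures.items() if matches(drawn, stored)]
```

Note that under this rule a single isolated mismatch is tolerated, which is one reason two quite different shapes can end up matching the same stored gesture.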

Conclusion

All of this was developed by myself, since I love doing what I do. I'd love to hear if there is anywhere on the Internet where I can find resources on something similar to what I am doing.

I do not want any neural network solutions. I want it to be "under control", so to speak, without any training needed.

Some feedback would be great too, if you can see any way to improve the algorithm above.

You see, it works fine in some scenarios. But, for instance, when I draw an "M" and an upside-down "V", it can't tell the difference.

Help would be appreciated. Oh, and vote up the question if you think I described everything well!

2 answers

General ideas

Wouldn't M and V appear identical because you junk the first and last points? Junking the first and last points seems a bit redundant, since you operate on directions anyway (a list of three points already leads to a list of only 2 directions).
Also, I'd recommend just prototyping stuff like this. You'll find out whether you're susceptible to noise (I expect not, due to 'for every 15 pixels').

Re: the comparison stage

I think you'll get some more generic ideas for matching 'closely related' movements by reading Peter Norvig's excellent 16-line spellchecker article.
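One way to carry the spellchecker idea over is to treat each direction list as a "word" and rank stored gestures by edit distance, the notion of closeness that spelling correction builds on. The sketch below uses plain Levenshtein distance; it is not code from the article, and the names `edit_distance` and `closest_gesture` are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of directions."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        cur = [i]
        for j, y in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,               # delete x from a
                           cur[j - 1] + 1,            # insert y into a
                           prev[j - 1] + (x != y)))   # substitute x with y
        prev = cur
    return prev[-1]

def closest_gesture(drawn, gestures):
    """Return the name of the stored gesture with the smallest edit distance."""
    return min(gestures, key=lambda name: edit_distance(drawn, gestures[name]))
```

Because the distance counts insertions and deletions, a gesture with four strokes no longer collapses onto one with two strokes just because a few directions are similar; a cutoff on the distance could be used to reject drawings that are close to nothing.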

You're basically using a Markovian(ish) FSM based on gesture orientations to calculate "closeness" of shapes. You shouldn't. An M looks the same whether it's drawn left-to-right or right-to-left. (Maybe I misunderstood this detail.)

You should compare shapes using something like OpenCV. In particular, cvMatchShapes(). This function uses Hu moments (a well-established metric) to compare the "closeness" of binary shapes. Hu moments are used for comparing protein binding sites and as part of more complicated shape-recognition algorithms like SURF. It should be good enough for what you're trying to do.
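To give a feel for what a moment-based descriptor looks like, here is a small pure-Python sketch that computes the first two Hu invariants of a set of sampled points. It is an illustration only, not what cvMatchShapes() does internally: each point is treated as a unit mass, only two of the seven invariants are computed, and the function name `hu_first_two` is hypothetical. With point masses the values are unchanged by translation and rotation; the scale invariance of Hu moments applies to filled shapes, not to raw point samples like these.

```python
def hu_first_two(points):
    """First two Hu invariants of a point set, every point weighted equally."""
    n = len(points)
    cx = sum(x for x, _ in points) / n      # centroid
    cy = sum(y for _, y in points) / n
    # Second-order central moments.
    mu20 = sum((x - cx) ** 2 for x, _ in points)
    mu02 = sum((y - cy) ** 2 for _, y in points)
    mu11 = sum((x - cx) * (y - cy) for x, y in points)
    # Normalised central moments: eta_pq = mu_pq / mu_00 ** (1 + (p + q) / 2),
    # where mu_00 = n for unit masses and p + q = 2 here.
    eta20, eta02, eta11 = mu20 / n ** 2, mu02 / n ** 2, mu11 / n ** 2
    i1 = eta20 + eta02
    i2 = (eta20 - eta02) ** 2 + 4 * eta11 ** 2
    return i1, i2
```

Unlike a direction list, these values do not depend on the order in which the points were drawn, which is the property the answer is pointing at.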