
Touch gesture recognition for phones without neural networks

I'm developing a gesture recognition program for a phone. What I am trying to accomplish is for users to draw their own "patterns", and then afterwards have these patterns do different things.

Storing the pattern - the "pattern save" algorithm as I call it

This is what happens when the gesture is originally drawn and recorded. It is also the algorithm I use to capture what the user draws later, so it can be used for comparison:

  1. The user starts drawing his pattern. For every 15 pixels, a point is placed in a list referred to as "the list".
  2. Once the pattern has been drawn, the first and last points are removed from the list.
  3. For each pair of consecutive points now in the list, the connection between them is converted into a direction enumeration (containing 8 directions), which is then added to a new list of directions; from here on, "the list" refers to this direction list.
  4. Filter 1 begins, going through 3 directions at a time in the list. If the left direction is the same as the right direction, the middle direction is removed.
  5. Filter 2 begins, removing duplicate directions.
  6. Filter 3 begins, removing assumed noise. Assumed noise is detected by pairs of duplicate directions occurring again and again (as an example, "left upper-left left upper-left" is turned into "upper-left" or "left").
  7. Filter 4 begins, removing even more assumed noise. Assumed noise is this time detected by (again) comparing 3 directions at a time in the list as seen in step 4 (Filter 1), but where directions are not checked for being entirely equal, only almost equal (as an example, left is almost equal to "upper-left" and "lower-left").

The list of directions is now stored in a file. This direction list is saved as the gesture itself and is used for comparing it later.
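
For concreteness, here is a rough Python sketch of steps 1 to 5. This is a simplified illustration rather than the actual phone code: the names and the angle-based direction quantization are just for the example, and filters 3 and 4 are omitted to keep it short.

    import math

    # 8-way direction codes: 0 = E, 1 = NE, 2 = N, 3 = NW, 4 = W, 5 = SW, 6 = S, 7 = SE
    # (on a screen the y axis points down, but that does not matter for matching).
    def to_direction(p, q):
        # Quantize the vector p -> q into one of the 8 compass directions.
        angle = math.atan2(q[1] - p[1], q[0] - p[0])
        return int(round(angle / (math.pi / 4))) % 8

    def sample_every(points, spacing=15.0):
        # Step 1: keep a point roughly every `spacing` pixels along the stroke.
        sampled, travelled = [points[0]], 0.0
        for p, q in zip(points, points[1:]):
            travelled += math.hypot(q[0] - p[0], q[1] - p[1])
            if travelled >= spacing:
                sampled.append(q)
                travelled = 0.0
        return sampled

    def save_pattern(raw_points):
        pts = sample_every(raw_points)                             # step 1
        pts = pts[1:-1]                                            # step 2
        dirs = [to_direction(p, q) for p, q in zip(pts, pts[1:])]  # step 3

        # Filter 1 (step 4): drop a direction whose left and right neighbours agree.
        f1 = [d for i, d in enumerate(dirs)
              if not (0 < i < len(dirs) - 1 and dirs[i - 1] == dirs[i + 1])]

        # Filter 2 (step 5): collapse runs of duplicate directions.
        f2 = [d for i, d in enumerate(f1) if i == 0 or d != f1[i - 1]]

        # Filters 3 and 4 (steps 6 and 7) would run here; omitted for brevity.
        return f2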

Comparing the pattern

Once a user then draws a pattern, the "pattern save" algorithm is applied to that pattern as well (but only to filter out noise, not to actually save it, since that would be stupid).

This filtered pattern is then compared with all the patterns currently in the gesture list. This comparison method is quite complex to describe, and my English isn't as good as it should be.

In short, it goes through the gesture that the user just drew and, for each direction in this gesture, compares it with the direction at the same position in each stored gesture. If a direction is similar (in the sense used in the algorithm above), that's okay, and it continues to check the next direction. If it's not similar 2 times in a row, the gesture is considered a non-match.
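
Roughly, as a Python sketch (the similar() test, the two-in-a-row cutoff and the dictionary of saved gestures are my own way of illustrating the rule, not necessarily how it is coded on the phone):

    def similar(a, b):
        # Two of the 8 directions are "similar" if they are equal or adjacent
        # on the compass (e.g. left vs. upper-left).
        diff = abs(a - b) % 8
        return min(diff, 8 - diff) <= 1

    def matches(drawn, stored):
        # Walk both direction lists in step; give up after 2 consecutive misses.
        misses_in_a_row = 0
        for a, b in zip(drawn, stored):
            if similar(a, b):
                misses_in_a_row = 0
            else:
                misses_in_a_row += 1
                if misses_in_a_row >= 2:
                    return False
        return True

    def recognize(drawn, saved_gestures):
        # Return the name of the first saved gesture that matches, or None.
        for name, stored in saved_gestures.items():
            if matches(drawn, stored):
                return name
        return None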

Conclusion

All of this is developed by myself, since I love doing what I do. I'd love to hear if there is anywhere on the Internet where I can find resources on something similar to what I am doing.

I do not want any neural network solutions. I want it to be "under control" so to speak, without any training needed.

Some feedback would be great too, especially if you can see any way I could do the above algorithm better.

You see, it works fine in some scenarios. But for instance, when I make an "M" and an upside-down "V", it can't recognize the difference.

Help would be appreciated. Oh, and vote up the question if you think I described everything well!

General ideas

  1. Wouldn't M and V appear identical because you junk the first and last points? Junking the first and last points seems a bit redundant anyway, since you operate on directions (a list of three points already leads to a list of only 2 directions).

  2. Also, I'd recommend just prototyping stuff like this. You'll find out whether you'll be susceptible to noise (I expect not, due to 'for every 15 pixels').

Re: the comparison stage

I think you'll get some more general ideas for matching 'closely related' movements by reading Peter Norvig's excellent 16-line spellchecker article.
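
One way to carry that idea over here would be to treat a direction list like a word and pick the stored gesture with the smallest edit distance. The sketch below is only my own illustration of that suggestion (the max_distance cutoff is made up), not code from the article:

    def edit_distance(a, b):
        # Classic Levenshtein distance between two direction sequences.
        prev = list(range(len(b) + 1))
        for i, x in enumerate(a, 1):
            curr = [i]
            for j, y in enumerate(b, 1):
                curr.append(min(prev[j] + 1,              # deletion
                                curr[j - 1] + 1,          # insertion
                                prev[j - 1] + (x != y)))  # substitution
            prev = curr
        return prev[-1]

    def closest_gesture(drawn, saved_gestures, max_distance=2):
        # Return the stored gesture nearest to the drawn one, if it is close enough.
        name, stored = min(saved_gestures.items(),
                           key=lambda kv: edit_distance(drawn, kv[1]))
        return name if edit_distance(drawn, stored) <= max_distance else None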

You're basically using a Markovian(ish) FSM based on gesture orientations to calculate "closeness" of shapes. You shouldn't. An M looks the same whether it's drawn left-to-right or right-to-left. (Maybe I misunderstood this detail.)

You should compare shapes using something like OpenCV. In particular, cvMatchShapes(). This function uses Hu moments (a well-established metric) to compare "closeness" of binary shapes. Hu moments are used for comparing protein binding sites and as part of more complicated shape-recognition algorithms like SURF. It should be good enough for what you're trying to do.
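
For reference, the same function is exposed in OpenCV's Python bindings as cv2.matchShapes(). A minimal sketch of how it might be applied to two recorded strokes, assuming the raw points are packed into contours and the acceptance threshold is tuned by hand:

    import numpy as np
    import cv2

    def to_contour(points):
        # Pack an (x, y) point list into the contour format OpenCV expects.
        return np.array(points, dtype=np.int32).reshape(-1, 1, 2)

    def shape_distance(points_a, points_b):
        # Hu-moment based shape comparison; lower scores mean more similar shapes.
        return cv2.matchShapes(to_contour(points_a), to_contour(points_b),
                               cv2.CONTOURS_MATCH_I1, 0.0)

    # Usage: treat anything below a hand-tuned threshold as "the same shape".
    # if shape_distance(drawn_points, stored_points) < 0.1:
    #     ...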
