
Levenshtein Distance on only part of a string (Java)

I have an online web application with a top menu tree for opening different widgets that perform different tasks. As the app has grown more powerful, that tree has become large and difficult to navigate. I've implemented a search feature where users can type the menu name, or part of it, and I use regex to find all items in the menu tree that match what the user types. My regex allows for partial words and swapped words, and also limits the search to the beginning of each word. The one thing it doesn't allow for is misspelled words. I understand that to allow for misspelled words it's best not to use regex and to use a string distance method instead, but I still want to allow for partial words and swapped words. Is this possible?

For example, right now, if a menu item is "Finance Rate Maintenance", any of the following would match that menu item: "finance", "finance ra", "rate finance", etc. "inance rate" would not match because "inance" does not appear at the beginning of any of the words in that menu item. I want slightly misspelled searches like "fnane rate" and "rate maintainance" to match as well.
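One way to read this requirement is as an edit distance between each typed token and the best-matching prefix of each menu word, with tokens checked independently so word order does not matter. The following is only a sketch under that reading, not the existing regex logic: the class name `PrefixFuzzyMatch`, the `maxEdits` parameter, and the rule that tokens of three characters or fewer must match exactly are all assumptions made for illustration.

```java
import java.util.Locale;

class PrefixFuzzyMatch {

    // Smallest Levenshtein distance between `token` and ANY prefix of `word`.
    // It is the minimum over the last row of the usual DP table.
    static int prefixDistance(String token, String word) {
        int n = token.length(), m = word.length();
        int[] prev = new int[m + 1];
        int[] curr = new int[m + 1];
        for (int j = 0; j <= m; j++) prev[j] = j; // empty token vs. prefix of length j
        for (int i = 1; i <= n; i++) {
            curr[0] = i; // token prefix of length i vs. empty prefix
            for (int j = 1; j <= m; j++) {
                int cost = token.charAt(i - 1) == word.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // prev now holds row i
        }
        int best = prev[0];
        for (int j = 1; j <= m; j++) best = Math.min(best, prev[j]);
        return best;
    }

    // True if every query token is within its edit budget of the start of
    // some word in the menu item. Word order in the query is ignored.
    static boolean matches(String query, String menuItem, int maxEdits) {
        String[] words = menuItem.toLowerCase(Locale.ROOT).split("\\s+");
        for (String token : query.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            // Assumption: very short tokens get no tolerance, otherwise
            // they would match almost any word.
            int budget = token.length() <= 3 ? 0 : maxEdits;
            boolean found = false;
            for (String word : words) {
                if (prefixDistance(token, word) <= budget) {
                    found = true;
                    break;
                }
            }
            if (!found) return false;
        }
        return true;
    }
}
```

With `maxEdits` set to 2, `matches("fnane rate", "Finance Rate Maintenance", 2)` and `matches("rate maintainance", "Finance Rate Maintenance", 2)` both return true, and "finance ra" still matches as an exact prefix. Note that any tolerance also lets "inance rate" through, since "inance" is one edit away from "finance", so the budget would need tuning against real menu names.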

I would just attach a list of words to each option, and simultaneously maintain a dictionary containing all of those words. Then, when the user types in their query, the program would check that every word they enter is in the dictionary. If a word is not, it would find the closest dictionary word via string distance and correct the input word to it.

Finally, it could suggest the menu option that has the most words in common with the corrected input words.
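A minimal sketch of that idea in Java might look like the following. The class name `MenuSuggester`, its method names, and the second menu item used in the usage note are hypothetical, and the nearest-word search is a plain linear scan with standard Levenshtein distance, which should be fine for a dictionary the size of a menu tree.

```java
import java.util.*;

class MenuSuggester {
    // Menu item -> its lower-cased words (insertion order kept for stable ties).
    private final Map<String, Set<String>> wordsByItem = new LinkedHashMap<>();
    // Every word that appears in any menu item.
    private final Set<String> dictionary = new TreeSet<>();

    void addItem(String item) {
        Set<String> words =
            new HashSet<>(Arrays.asList(item.toLowerCase(Locale.ROOT).split("\\s+")));
        wordsByItem.put(item, words);
        dictionary.addAll(words);
    }

    // Standard Levenshtein distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the word itself if known, otherwise the closest dictionary word.
    String correct(String word) {
        if (dictionary.contains(word)) return word;
        String best = word;
        int bestDistance = Integer.MAX_VALUE;
        for (String candidate : dictionary) {
            int d = levenshtein(word, candidate);
            if (d < bestDistance) {
                bestDistance = d;
                best = candidate;
            }
        }
        return best;
    }

    // Returns the item sharing the most words with the corrected query,
    // or null if no item shares any word.
    String suggest(String query) {
        Set<String> corrected = new HashSet<>();
        for (String w : query.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            corrected.add(correct(w));
        }
        String bestItem = null;
        int bestOverlap = 0;
        for (Map.Entry<String, Set<String>> e : wordsByItem.entrySet()) {
            int overlap = 0;
            for (String w : corrected) {
                if (e.getValue().contains(w)) overlap++;
            }
            if (overlap > bestOverlap) {
                bestOverlap = overlap;
                bestItem = e.getKey();
            }
        }
        return bestItem;
    }
}
```

After calling `addItem("Finance Rate Maintenance")` and `addItem("Customer Account Setup")`, a query of "fnane rate" is corrected to "finance rate" and `suggest` returns "Finance Rate Maintenance". As written, this corrects whole words only, so a partial word such as "ra" would be corrected to whichever dictionary word happens to be nearest rather than treated as a prefix.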

A good example of a spelling corrector (in Python, though) is at http://norvig.com/spell-correct.html
