
Is there a way to use regex in JavaScript to implement an efficient and accurate search functionality?

I have been working on a project that requires users to search for products, similar to the search functionality on Walmart or Amazon. But every time I think I have a working solution, I run into another problem. Here is my current code snippet; I explain what it does below.

const keywords = [
  'oranges',
  'red bull',
  'red bull energy drink',
  'carnation breakfast essentials',
  'carnation instant breakfast',
  'organic oranges',
  'oargens',
  'nesquik powder',
  "welch's orange juice",
  'mandarin oranges',
  'bananas',
  'nesquik chocolate powder',
  'jimmy dean sausage',
  'organic bananas',
  'nesquik',
  'nesquik no sugar',
  "welch's white grape",
  'great value',
  'great value apple juice',
  'lemon',
  'lemon fruit',
  'avocados',
  'avocados, each',
  'apple juice',
];

class SearchApi {
  constructor(keywords, query) {
    this.keywords = keywords;
    this.query = query;
  }

  findWithRegex(keyword, query) {
    const pattern = query
      .split('')
      .map(q => {
        return `(?=.*${q})`;
      })
      .join('');

    const regex = new RegExp(`${pattern}`, 'g');

    return keyword.match(regex);
  }

  matchKeywords() {
    // First three characters of the normalized query.
    const str = this.query.trim().toLowerCase().substring(0, 3);
    const queryLength = this.query.trim().length;

    return this.keywords.filter(keyword => {
      const keywordSubstr = keyword.substring(0, 3);
      const equalInitials = keyword.substring(0, 1) === this.query.toLowerCase().substring(0, 1);

      // Characters at the positions of the query's last three characters,
      // taken from both the keyword and the query.
      const keywordTail = keyword.substring(queryLength - 3, queryLength);
      const queryTail = this.query.trim().substring(queryLength - 3, queryLength);

      return this.findWithRegex(keywordSubstr, str) && equalInitials && this.findWithRegex(keywordTail, queryTail);
    });
  }
}

const searchApi = new SearchApi(keywords, 'organic banan');
searchApi.matchKeywords();

Code Explained

When a query is made, I compare the first three and the last three characters of the query with the corresponding characters of the keyword. I also check that the query and the keyword start with the same letter, because if someone types the letter "o" I want to show only keywords that begin with that letter.

It works fine, but while testing with the query "organic banan" I get ["organic oranges", "organic bananas"] when the result should be ["organic bananas"].

This is because the regex function finds the characters "a" and "n" in the three characters of "organic oranges" that line up with the end of the query. Any further help on how to do this efficiently would be appreciated.
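The cause can be reproduced in isolation. Each `(?=.*x)` lookahead only asserts that `x` occurs somewhere in the string, so the generated pattern checks which characters are present, not their order or how many times they appear. The sketch below rebuilds the pattern the same way `findWithRegex` does; the helper name `buildLookaheadPattern` is introduced here for illustration only.

```javascript
// Builds one lookahead per query character, as findWithRegex does.
// (Illustrative helper; not part of the original class.)
function buildLookaheadPattern(query) {
  return new RegExp(query.split('').map(q => `(?=.*${q})`).join(''));
}

// The query 'organic banan' has length 13, so the compared slice is
// characters 10 to 12 of the query and of each keyword.
const querySlice = 'organic banan'.substring(10, 13);     // 'nan'
const orangesSlice = 'organic oranges'.substring(10, 13); // 'ang'
const bananasSlice = 'organic bananas'.substring(10, 13); // 'nan'

const pattern = buildLookaheadPattern(querySlice);

// 'ang' contains both an 'n' and an 'a', so every lookahead succeeds.
console.log(pattern.test(orangesSlice)); // true
console.log(pattern.test(bananasSlice)); // true
```

Because "ang" contains every distinct character of "nan", the check cannot tell the two keywords apart.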

Search-as-you-type or autocomplete features are usually implemented with specialized data structures and algorithms rather than regex (unless the search feature itself is about regex searching).
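For a list as small as the one in the question, a plain prefix comparison may already be enough before reaching for anything more elaborate. The sketch below assumes that "match" means "the keyword starts with the normalized query"; `matchByPrefix` and `sampleKeywords` are hypothetical names, not part of the original code.

```javascript
// A small subset of the keywords from the question, for illustration.
const sampleKeywords = [
  'oranges',
  'organic oranges',
  'organic bananas',
  'bananas',
  'mandarin oranges',
];

// Returns the keywords that start with the trimmed, lowercased query.
function matchByPrefix(keywords, query) {
  const normalized = query.trim().toLowerCase();
  if (normalized === '') return [];
  return keywords.filter(keyword => keyword.toLowerCase().startsWith(normalized));
}

console.log(matchByPrefix(sampleKeywords, 'organic banan')); // [ 'organic bananas' ]
console.log(matchByPrefix(sampleKeywords, 'o'));
// [ 'oranges', 'organic oranges', 'organic bananas' ]
```

This scans the whole list on every keystroke, which is fine for a few hundred entries but is the part that the structures discussed in this answer are meant to speed up.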

You may want to use a binary search to find strings in a sorted array, as shown below (I selected this function for demonstration purposes only and do not recommend it. Source). You will probably find many packages that fit your environment and needs, e.g. fast-string-search, which uses N-API and boyer-moore-magiclen to make things fast.

In terms of data structures, prefix trees (tries) are often suggested and used to implement fast autocomplete features. Here is a simple implementation that shows the basic concept of a trie.

var movies = [
  "ACADEMY DINOSAUR", "ACE GOLDFINGER", "ADAPTATION HOLES", "AFFAIR PREJUDICE",
  "BENEATH RUSH", "BERETS AGENT", "BETRAYED REAR", "BEVERLY OUTLAW", "BIKINI BORROWERS",
  "YENTL IDAHO", "YOUNG LANGUAGE", "YOUTH KICK",
  "ZHIVAGO CORE", "ZOOLANDER FICTION", "ZORRO ARK"
];

// Returns every entry of the sorted array `haystack` that starts with `needle`.
var searchBinary = function (needle, haystack, case_insensitive) {
  if (needle == "") return [];
  var haystackLength = haystack.length;
  var letterNumber = needle.length;
  case_insensitive = (typeof (case_insensitive) === 'undefined' || case_insensitive) ? true : false;
  needle = (case_insensitive) ? needle.toLowerCase() : needle;

  /* start binary search, get the position of one matching element */
  var getElementPosition = findElement();

  /* get interval and return result array */
  if (getElementPosition == -1) return [];
  return findRangeElement();

  // Prefix of haystack[index] with the same length as the needle.
  function prefixAt(index) {
    var element = haystack[index].substr(0, letterNumber);
    return (case_insensitive) ? element.toLowerCase() : element;
  }

  function findElement() {
    if (!haystackLength) return -1;
    var low = 0;
    var high = haystackLength - 1;
    while (low <= high) {
      var mid = Math.floor((low + high) / 2);
      var element = prefixAt(mid);
      if (element > needle) {
        high = mid - 1;
      } else if (element < needle) {
        low = mid + 1;
      } else {
        return mid;
      }
    }
    return -1;
  }

  // Walks outwards from the hit to find the full range of matching entries.
  function findRangeElement() {
    var start = 0;            // inclusive
    var end = haystackLength; // exclusive
    var i;
    for (i = getElementPosition; i >= 0; i--) {
      if (prefixAt(i) != needle) {
        start = i + 1;
        break;
      }
    }
    for (i = getElementPosition; i < haystackLength; i++) {
      if (prefixAt(i) != needle) {
        end = i;
        break;
      }
    }
    return haystack.slice(start, end);
  }
};

var testBinary = searchBinary("BIKINI", movies, false);
console.log('searchBinary("BIKINI", movies, false) = [' + testBinary + ']');

testBinary = searchBinary("b", movies, false);
console.log('searchBinary("b", movies, false) = [' + testBinary + ']');

testBinary = searchBinary("b", movies, true);
console.log('searchBinary("b", movies, true) = [' + testBinary + ']');
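The prefix tree idea mentioned above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation the answer originally linked to: the class names `TrieNode` and `Trie` and the method names `insert` and `startsWith` are chosen here for the example, and words are lowercased on insertion so lookups are case-insensitive.

```javascript
// One node per character; isWord marks the end of a stored keyword.
class TrieNode {
  constructor() {
    this.children = new Map();
    this.isWord = false;
  }
}

class Trie {
  constructor(words = []) {
    this.root = new TrieNode();
    for (const word of words) this.insert(word);
  }

  // Adds a word, creating any missing nodes along its path.
  insert(word) {
    let node = this.root;
    for (const ch of word.toLowerCase()) {
      if (!node.children.has(ch)) node.children.set(ch, new TrieNode());
      node = node.children.get(ch);
    }
    node.isWord = true;
  }

  // Returns every stored word that starts with the given prefix.
  startsWith(prefix) {
    const normalized = prefix.trim().toLowerCase();
    let node = this.root;
    for (const ch of normalized) {
      node = node.children.get(ch);
      if (!node) return []; // no stored word has this prefix
    }
    const results = [];
    const collect = (current, path) => {
      if (current.isWord) results.push(path);
      for (const [ch, child] of current.children) collect(child, path + ch);
    };
    collect(node, normalized);
    return results;
  }
}

const trie = new Trie(['oranges', 'organic oranges', 'organic bananas', 'bananas']);
console.log(trie.startsWith('organic banan')); // [ 'organic bananas' ]
console.log(trie.startsWith('or'));
// [ 'oranges', 'organic oranges', 'organic bananas' ]
```

Reaching the node for a prefix takes one step per typed character regardless of how many keywords are stored; only collecting the completions depends on how many words share that prefix.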
