
Optimize search through large JS string array?

If I have a large JavaScript string array with over 10,000 elements, how do I quickly search through it?

Right now I have a JavaScript string array that stores job descriptions, and I'm allowing the user to dynamically filter the returned list as they type into an input box.

So say I have a string array like so:
var descArr = ["flipping burgers", "pumping gas", "delivering mail"];

and the user wants to search for: "p"

How would I be able to quickly search a string array that has 10000+ descriptions in it? Obviously I can't sort the description array, since the entries are free-text descriptions, so binary search is out. And since the user can search by "p" or "pi" or any combination of letters, this partial search means that I can't use associative arrays (i.e. searchDescArray["pumping gas"]) to speed up the search.

Any ideas, anyone?

Since regular expression engines in current browsers have become extremely fast, how about doing it that way? Instead of an array, pass one gigantic string and separate the entries with a separator character. Example:

  • String: "flipping burgers""pumping gas""delivering mail"
  • Regex: "([^"]*ping[^"]*)"

With the /g (global) switch you get all the matches. Make sure the user does not search for your string separator.

You can even add an id into the string with something like:

  • String: "11 flipping burgers""12 pumping gas""13 delivering mail"
  • Regex: "(\\d+) ([^"]*ping[^"]*)"

  • Example: http://jsfiddle.net/RnabN/4/ (30000 strings, limit results to 100)
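To make the idea concrete, here is a minimal sketch of the joined-string approach with ids. The helper names (buildHaystack, escapeRegExp, searchHaystack) are made up for illustration and are not taken from the linked fiddle; it assumes the separator character never appears in a description, and it uses array indexes as ids.

```javascript
// Sketch only: helper names are hypothetical, not from the linked fiddle.
var SEP = '"'; // separator; assumed never to appear in a description

function buildHaystack(descriptions) {
    // Produces e.g. "0 flipping burgers""1 pumping gas""2 delivering mail"
    return descriptions.map(function (d, i) {
        return SEP + i + ' ' + d + SEP;
    }).join('');
}

function escapeRegExp(s) {
    // Escape regex metacharacters so the query is matched literally.
    return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function searchHaystack(haystack, query, limit) {
    // Refuse queries that contain the separator itself.
    if (query.indexOf(SEP) !== -1) return [];
    var re = new RegExp('"(\\d+) ([^"]*' + escapeRegExp(query) + '[^"]*)"', 'gi');
    var results = [], m;
    // exec() with the g flag walks through the string one match at a time.
    while (results.length < limit && (m = re.exec(haystack)) !== null) {
        results.push({ id: Number(m[1]), text: m[2] });
    }
    return results;
}

var haystack = buildHaystack(['flipping burgers', 'pumping gas', 'delivering mail']);
console.log(searchHaystack(haystack, 'ping', 100));
```

Stopping the loop at `limit` keeps the work bounded even when a one-letter query matches almost everything.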

There's no way to speed up an initial array lookup without making some changes. You can speed up consecutive lookups by caching results and mapping them to patterns dynamically.

1.) Adjust your data format. This makes initial lookups somewhat speedier. Basically, you precache.

var data = {
    a : ['Ant farm', 'Ant massage parlor'],
    b : ['Bat farm', 'Bat massage parlor']
    // etc
}

2.) Set up cache mechanics.

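var searchFor = function(str, list, caseSensitive, reduce){
    str = str.replace(/(?:^\s*|\s*$)/g, ''); // trim whitespace
    // escape regex metacharacters so the user's input is matched literally
    str = str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    var found = [];
    // no 'g' flag: with it, test() carries lastIndex over to the next string
    var reg = new RegExp('^\\s?' + str, caseSensitive ? '' : 'i');
    var i = list.length;
    while(i--){
        if(reg.test(list[i])){
            found.push(list[i]);
            // when reduce is set, remove matched items from the source list
            reduce && list.splice(i, 1);
        }
    }
    return found;
};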

var lookUp = function(str, caseSensitive){
    str = str.replace(/(?:^\s*|\s*$)/g, ''); // trim whitespace
    if(!str) return []; // nothing to search for
    if(data[str]) return data[str]; // cached result or precached bucket
    var firstChar = caseSensitive ? str[0] : str[0].toLowerCase();
    var list = data[firstChar];
    if(!list) return (data[str] = []);
    // we cache on data since it's already a caching object.
    return (data[str] = searchFor(str, list, caseSensitive));
};

3.) Use the following script to create a precache object. I suggest you run this once and use JSON.stringify to create a static cache object (or do this on the backend).

// we need the searchFor function from above; this might take a while
var preCache = function(arr){
    var chars = "abcdefghijklmnopqrstuvwxyz".split('');
    var cache = {};
    var i = chars.length;
    while(i--){
        // reduce is true, so we're destroying the original list here.
        cache[chars[i]] = searchFor(chars[i], arr, false, true);
    }
    return cache;
}

Probably a bit more code than you expected, but optimization and performance don't come for free.

This may not be an answer for you, as I'm making some assumptions about your setup, but if you have server-side code and a database, you'd be far better off making an AJAX call to get the cut-down list of results, and using the database to do the filtering (databases are very good at this sort of thing).

As well as the database benefit, you'd also benefit from not outputting this much data (10000 variables) to a web-based front end. If you only return the items you require, you'll save a fair bit of bandwidth.
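As a rough illustration of the client side of that approach, here is a sketch that only builds the request and hands it to a transport function. The endpoint path, the `q` and `limit` parameter names, and the function names are all assumptions made up for this example; the `request` argument stands in for whatever AJAX helper you already use.

```javascript
// Sketch only: the endpoint path and parameter names are hypothetical.
var SEARCH_ENDPOINT = '/jobs/search';

function buildSearchUrl(query, limit) {
    // encodeURIComponent keeps spaces and symbols in the query from breaking the URL.
    return SEARCH_ENDPOINT +
        '?q=' + encodeURIComponent(query) +
        '&limit=' + encodeURIComponent(limit);
}

// `request(url, callback)` is injected so any AJAX helper can be plugged in.
function remoteSearch(query, limit, request, onResults) {
    var q = query.replace(/^\s+|\s+$/g, ''); // trim whitespace
    if (!q) {
        onResults([]); // nothing typed yet, so skip the round trip
        return;
    }
    request(buildSearchUrl(q, limit), onResults);
}
```

The server would then run the partial match in the database and return only the first `limit` rows, so the 10000 descriptions never have to be shipped to the browser.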

I can't reproduce the problem. I created a naive implementation, and most browsers search across 10000 15-character strings in a single-digit number of milliseconds. I can't test in IE6, but I wouldn't expect it to be more than 100 times slower than the fastest browsers, which would still be virtually instant.

Try it yourself: http://ebusiness.hopto.org/test/stacktest8.htm (Note that the creation time is not relevant to the issue; it is just there to get some data to work on.)
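For reference, a naive implementation along those lines might look like the sketch below. This is not the code from the linked test page; makeTestData and naiveSearch are hypothetical names, and the random data is only there to have something to search.

```javascript
// Sketch only: not the linked test page's code; names are hypothetical.
function makeTestData(count, length) {
    var letters = 'abcdefghijklmnopqrstuvwxyz ';
    var out = [];
    for (var i = 0; i < count; i++) {
        var s = '';
        for (var j = 0; j < length; j++) {
            s += letters.charAt(Math.floor(Math.random() * letters.length));
        }
        out.push(s);
    }
    return out;
}

function naiveSearch(list, query) {
    // Plain linear scan with a case-insensitive substring test.
    var q = query.toLowerCase();
    var found = [];
    for (var i = 0; i < list.length; i++) {
        if (list[i].toLowerCase().indexOf(q) !== -1) found.push(list[i]);
    }
    return found;
}

var testData = makeTestData(10000, 15);
var start = Date.now();
var hits = naiveSearch(testData, 'pi');
console.log(hits.length + ' matches in ' + (Date.now() - start) + ' ms');
```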

One thing you could be doing wrong is trying to render all results; that would be quite a huge job when the user has only entered a single letter or a common letter combination.
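One way to avoid that, sketched here under the assumption that showing only the first few matches is acceptable, is to cap the number of results handed to the rendering code while still counting the total. The function name and the shape of the return value are made up for illustration.

```javascript
// Sketch only: searchWithLimit and its return shape are hypothetical.
function searchWithLimit(list, query, maxResults) {
    var q = query.toLowerCase();
    var shown = [];
    var total = 0;
    for (var i = 0; i < list.length; i++) {
        if (list[i].toLowerCase().indexOf(q) !== -1) {
            total++;
            // Only keep what will actually be rendered.
            if (shown.length < maxResults) shown.push(list[i]);
        }
    }
    return { shown: shown, total: total };
}

var result = searchWithLimit(['flipping burgers', 'pumping gas', 'delivering mail'], 'i', 2);
console.log('showing ' + result.shown.length + ' of ' + result.total);
```

Rendering only `shown` and displaying `total` as a count keeps the DOM work small even for a one-letter query.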

I suggest trying a ready-made JS function, for example the autocomplete from jQuery. It's fast and it has many options to configure.

Check out the jQuery autocomplete demo
