
Time complexity of Array.prototype.includes vs. Set.prototype.has

I've been reading conflicting answers about the time complexity of sets vs. arrays in modern JavaScript engines.

I completed the Codility demo task, which is a simple assignment: given an array A of N integers, return the smallest positive integer (greater than 0) that does not occur in A.

For example, given A = [1, 3, 6, 4, 1, 2], the function should return 5.

My first solution was:

const solution = arr => {
    for(let int = 1;;int++) {
        if (!arr.includes(int)) {
            return int;
        }
    }
}

Now, the weird thing is that Codility says this solution has a time complexity of O(n**2) (they prefer a solution of complexity O(n)). As far as I know, Array.prototype.includes is a linear search ( https://tc39.es/ecma262/#sec-array.prototype.includes ), meaning it should have O(n) time complexity.

If I enter a different solution, using a Set, I get the full score:

const solution = arr => {
  const set = new Set(arr);
  let i = 1;

  while (set.has(i)) {
    i++;
  }

  return i;
}

Codility reports this solution's time complexity as O(N) or O(N * log(N)).

Is this correct? Is Array.prototype.includes in fact O(n**2) instead of O(n)?

Lastly, I'm a bit confused as to why Set.has() is preferred: in my console performance tests, Array.includes() consistently outperforms the solution that first creates a Set and then looks values up in it, as the following snippet shows.

const rand = (size) => [...Array(size)].map(() => Math.floor(Math.random() * size));

const small = rand(100);
const medium = rand(5000);
const large = rand(100000);

const solution1 = arr => {
  console.time('Array.includes');
  for (let int = 1;; int++) {
    if (!arr.includes(int)) {
      console.timeEnd('Array.includes');
      return int;
    }
  }
}

const solution2 = arr => {
  console.time('Set.has');
  const set = new Set(arr);
  let i = 1;

  while (set.has(i)) {
    i++;
  }
  console.timeEnd('Set.has');
  return i;
}

console.log('Testing small array:');
solution1(small);
solution2(small);
console.log('Testing medium array:');
solution1(medium);
solution2(medium);
console.log('Testing large array:');
solution1(large);
solution2(large);

If a Set lookup has better time complexity (if that's true) and is preferred by Codility, why do my performance tests favor the Array.prototype.includes solution?

That comparison is not entirely fair, because the function that uses the Set first has to convert the Array to a Set, which takes some time.

Have a look at the results below if this is ignored. I have updated the solution2 function to receive a Set and changed the while loop to a for loop, for the sake of direct comparison.

You may notice that for a small array, Set might be slower. That difference is insignificant, because time complexity only really comes into effect for a large (significant) n.

Also note that Array.includes is indeed O(n), but because it is called inside a for loop that in the worst case runs on the order of n times, the solution as a whole has a time complexity of O(n^2).
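To make the O(n^2) claim concrete, here is a minimal sketch that counts element inspections instead of measuring time. The names linearIncludes, countSteps and worstCase are made up for this illustration, and linearIncludes is only a stand-in for Array.prototype.includes (it compares with === rather than SameValueZero, which differs only for NaN).

```javascript
// Hand-written linear search that counts how many elements it inspects.
// Stand-in for Array.prototype.includes (hypothetical helper).
const linearIncludes = (arr, value, counter) => {
  for (const item of arr) {
    counter.steps++;
    if (item === value) return true;
  }
  return false;
};

// Same algorithm as the first solution, but instrumented.
const countSteps = (arr) => {
  const counter = { steps: 0 };
  for (let int = 1; ; int++) {
    if (!linearIncludes(arr, int, counter)) {
      return { answer: int, steps: counter.steps };
    }
  }
};

// Worst case: the array holds 1..n in order, so the outer loop runs n + 1 times.
const worstCase = (n) => Array.from({ length: n }, (_, i) => i + 1);

const r100 = countSteps(worstCase(100));
const r200 = countSteps(worstCase(200));
console.log(r100); // { answer: 101, steps: 5150 }
console.log(r200); // { answer: 201, steps: 20300 }
```

Doubling n roughly quadruples the number of inspections (n(n+3)/2 for this input), which is the quadratic growth Codility reports.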

const rand = (size) => [...Array(size)].map(() => Math.floor(Math.random() * size));

const small = rand(100);
const medium = rand(5000);
const large = rand(100000);

const solution1 = arr => {
  console.time('Array.includes');
  for (let int = 1;; int++) {
    if (!arr.includes(int)) {
      console.timeEnd('Array.includes');
      return int;
    }
  }
}

const solution2 = set => {
  console.time('Set.has');
  for (let i = 1;; i++) {
    if (!set.has(i)) {
      console.timeEnd('Set.has');
      return i;
    }
  }
}

console.log('Testing small array:');
solution1(small);
solution2(new Set(small));
console.log('Testing medium array:');
solution1(medium);
solution2(new Set(medium));
console.log('Testing large array:');
solution1(large);
solution2(new Set(large));
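If you would rather keep the conversion inside the function and see how much of the total it accounts for, one option is to time the two phases separately. This is only a sketch: timedSetSolution is a name invented here, and it assumes a global performance object (available in browsers and, as far as I know, in Node 16 and later).

```javascript
// Time Set construction and the lookup loop separately.
// Assumes a global `performance` object (browsers, Node 16+).
const timedSetSolution = (arr) => {
  const t0 = performance.now();
  const set = new Set(arr); // one-off O(n) conversion
  const t1 = performance.now();

  let i = 1;
  while (set.has(i)) { // at most n + 1 lookups
    i++;
  }
  const t2 = performance.now();

  return { answer: i, buildMs: t1 - t0, lookupMs: t2 - t1 };
};

const timed = timedSetSolution([1, 3, 6, 4, 1, 2]);
console.log(timed.answer); // 5
console.log(timed.buildMs, timed.lookupMs); // timings vary by machine
```

For random input, where the smallest missing integer tends to be found after very few iterations, the build phase can easily dominate, which would explain why the Array.includes version looks faster in the original test.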

I know this is an old question, but I was double-checking the data. I too assumed Set.has would be O(1) or O(log N), but in my first test it appeared to be O(N). The specs for these functions hint as much, but are quite hard to decipher: https://tc39.es/ecma262/#sec-array.prototype.includes https://tc39.es/ecma262/#sec-set.prototype.has Elsewhere, though, the spec also says that Set.has must be sublinear, and I believe modern implementations are.

Empirically, Set.has showed linear performance when I ran it in some code playgrounds, but in real environments like Node and Chrome there were no surprises. I'm not sure what the playground was running on the back end, but perhaps a Set polyfill was used. So be careful!
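A rough way to check whether an environment is using a built-in Set is to look at how the constructor stringifies. This is a heuristic sketch, not a guarantee: looksNative is a hypothetical helper, and the check can be fooled (for example, bound functions also stringify with "[native code]").

```javascript
// Heuristic: built-in functions stringify with "[native code]" as their body,
// while a polyfill written in JavaScript normally prints its own source.
const looksNative = (fn) =>
  typeof fn === 'function' &&
  Function.prototype.toString.call(fn).includes('[native code]');

console.log('Set looks native:', looksNative(Set));
console.log('Set.prototype.has looks native:', looksNative(Set.prototype.has));
```

If either line prints false, the timings from that environment probably say more about the polyfill than about the engine.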

Here are my test cases, trimmed down to remove the randomness:

const makeArray = (size) => [...Array(size)].map(() => size);

const small =  makeArray(1000000);
const medium = makeArray(10000000);
const large =  makeArray(100000000);

const solution1 = arr => {
  console.time('Array.includes');
  arr.includes(arr.length - 1)
  console.timeEnd('Array.includes');
}

const solution2 = arr => {
  const set = new Set(arr)
  console.time('Set.has');
  set.has(arr.length-1)
  console.timeEnd('Set.has');
}


console.log('** Testing small array:');
solution1(small);
solution2(small);
console.log('** Testing medium array:');
solution1(medium);
solution2(medium);
console.log('** Testing large array:');
solution1(large);
solution2(large);

In Chrome, though:

** Testing small array:
VM183:10 Array.includes: 1.371826171875 ms
VM183:17 Set.has: 0.005859375 ms
VM183:25 ** Testing medium array:
VM183:10 Array.includes: 14.32568359375 ms
VM183:17 Set.has: 0.009765625 ms
VM183:28 ** Testing large array:
VM183:10 Array.includes: 115.695068359375 ms
VM183:17 Set.has: 0.0048828125 ms

In Node 16.5:

Testing small array:
Array.includes: 1.223ms
Set.has: 0.01ms
Testing medium array:
Array.includes: 11.41ms
Set.has: 0.054ms
Testing large array:
Array.includes: 127.297ms
Set.has: 0.047ms

So, yeah, Arrays are definitely linear, and Sets are much faster.
