What is the time and space complexity of the retainAll method when used on HashSets in Java?
For example, in the code below:
public int commonTwo(String[] a, String[] b)
{
    Set<String> common = new HashSet<String>(Arrays.asList(a));
    common.retainAll(new HashSet<String>(Arrays.asList(b)));
    return common.size();
}
Let's take a look at the code. The method retainAll is inherited from AbstractCollection and (at least in OpenJDK) looks like this:
public boolean retainAll(Collection<?> c) {
    boolean modified = false;
    Iterator<E> it = iterator();
    while (it.hasNext()) {
        if (!c.contains(it.next())) {
            it.remove();
            modified = true;
        }
    }
    return modified;
}
There is one big thing to note here: we loop over this.iterator() and call c.contains. So the time complexity is n calls to c.contains, where n = this.size(), and at most n calls to it.remove().
The important thing is that the contains method is called on the other Collection, so the overall complexity depends on the complexity of the other Collection's contains.
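To make the "n calls to c.contains" point concrete, here is a minimal sketch that counts those calls. CountingSet and RetainAllCallCount are hypothetical helpers written for this illustration (they are not JDK classes), and the sketch assumes that HashSet inherits retainAll from AbstractCollection, as described above.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not part of the JDK): a HashSet that counts contains() calls.
class CountingSet<E> extends HashSet<E> {
    int containsCalls = 0;

    CountingSet(Collection<? extends E> source) {
        super(source); // populating the set goes through add(), not contains()
    }

    @Override
    public boolean contains(Object o) {
        containsCalls++;
        return super.contains(o);
    }
}

class RetainAllCallCount {
    // Returns how many times retainAll queried the argument collection.
    static int containsCallsFor(String[] a, String[] b) {
        Set<String> common = new HashSet<>(Arrays.asList(a));
        CountingSet<String> other = new CountingSet<>(Arrays.asList(b));
        common.retainAll(other);
        return other.containsCalls;
    }

    public static void main(String[] args) {
        String[] a = {"x", "y", "z", "w"}; // four distinct elements
        String[] b = {"y", "w", "q"};
        System.out.println(containsCallsFor(a, b)); // prints 4, one call per element of common
    }
}
```

The count tracks the size of the receiving set (common), not the size of the argument; the argument only determines how expensive each individual call is.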
So, whilst:
Set<String> common = new HashSet<>(Arrays.asList(a));
common.retainAll(new HashSet<>(Arrays.asList(b)));
would be O(a.length) for the retainAll call itself, as HashSet.contains and HashSet.remove are both O(1) (amortized).
If you were to call
common.retainAll(Arrays.asList(b));
then, due to the O(n) contains on Arrays.ArrayList, this would become O(a.length * b.length). In other words, by spending O(n) copying the array into a HashSet, you actually make the call to retainAll much faster.
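The difference can be observed without timing anything by counting equals() invocations. The sketch below is only an illustration under stated assumptions: Key, keys and countEquals are hypothetical names introduced here, the two arrays are deliberately disjoint (the worst case for a linear scan), and the exact number of calls on the hash-based path depends on the HashMap implementation.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class EqualsCostDemo {
    static long equalsCalls = 0;

    // Hypothetical key type that counts how often equals() is invoked.
    static final class Key {
        final int id;

        Key(int id) { this.id = id; }

        @Override
        public boolean equals(Object o) {
            equalsCalls++;
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int hashCode() { return Integer.hashCode(id); }
    }

    // Builds 'count' keys with consecutive ids starting at 'from'.
    static Key[] keys(int from, int count) {
        Key[] result = new Key[count];
        for (int i = 0; i < count; i++) {
            result[i] = new Key(from + i);
        }
        return result;
    }

    // Counts the equals() calls made while retainAll runs against 'other'.
    static long countEquals(Key[] a, Collection<Key> other) {
        Set<Key> common = new HashSet<>(Arrays.asList(a));
        equalsCalls = 0; // ignore any work done while building the set
        common.retainAll(other);
        return equalsCalls;
    }

    public static void main(String[] args) {
        Key[] a = keys(0, 100);    // ids 0..99
        Key[] b = keys(1000, 100); // ids 1000..1099, disjoint from a
        List<Key> asList = Arrays.asList(b);
        Set<Key> asSet = new HashSet<>(asList);
        // List argument: every contains() scans all of b, so a.length * b.length calls.
        System.out.println("list argument: " + countEquals(a, asList));
        // Set argument: equals() is only consulted when hash codes match.
        System.out.println("set argument:  " + countEquals(a, asSet));
    }
}
```

With these inputs the list argument costs 100 * 100 equals() calls, while the HashSet argument needs far fewer because lookups are resolved by hash code first.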
As far as space complexity goes, retainAll itself requires no additional space (beyond the Iterator), but your invocation is actually quite expensive space-wise, as you allocate two new HashSet instances, each of which is backed by a fully fledged HashMap.
Two further things can be noted:
1. There is no reason to allocate a HashSet from the elements in a. A cheaper collection that also has O(1) removal from the middle, such as a LinkedList, can be used (cheaper in memory and also in build time, since no hash table is built).
2. Your modifications are lost, as you create new collection instances and only return b.size().
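The first point could be sketched as follows. This is one possible reading of the suggestion rather than a definitive rewrite: CommonTwoLinked is a hypothetical class name, and the behavior differs from the original when a contains duplicates, because a LinkedList keeps them while a HashSet would collapse them.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;

class CommonTwoLinked {
    static int commonTwo(String[] a, String[] b) {
        // No hash table is built for a; LinkedList removes in O(1) through its iterator.
        List<String> common = new LinkedList<>(Arrays.asList(a));
        // b is still copied into a HashSet so that each contains() call stays O(1) on average.
        common.retainAll(new HashSet<>(Arrays.asList(b)));
        return common.size();
    }

    public static void main(String[] args) {
        String[] a = {"a", "b", "c"};
        String[] b = {"b", "c", "d"};
        System.out.println(commonTwo(a, b)); // prints 2 ("b" and "c" survive)
    }
}
```

If a may contain repeated strings and each common value should be counted once, the HashSet-based version is the one that matches that requirement.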
The implementation can be found in the java.util.AbstractCollection class. The way it is implemented looks like this:
public boolean retainAll(Collection<?> c) {
    Objects.requireNonNull(c);
    boolean modified = false;
    Iterator<E> it = iterator();
    while (it.hasNext()) {
        if (!c.contains(it.next())) {
            it.remove();
            modified = true;
        }
    }
    return modified;
}
So it will iterate over everything in your common set and check whether the collection that was passed as a parameter contains each element.
In your case both are HashSets, thus it will be O(n), as contains should be O(1) amortized and iterating over your common set is O(n).
One improvement you can make is simply not to copy a into a new HashSet: because it will be iterated anyway, you can keep it as a list.