
Is iterating through a TreeSet slower than iterating through a HashSet in Java?

Original post: 2013-11-05 20:31:39, tags: java, iteration, hashset, treeset

I'm running some benchmarks. One of my tests depends on order, so I'm using a TreeSet for that. My second test doesn't, so I'm using a HashSet for it.

I know that insertion is slower for the TreeSet. But what about iterating through all elements?

3 answers

From a similar post (Hashset vs Treeset):

HashSet is much faster than TreeSet (constant time versus logarithmic time for most operations such as add, remove and contains) but offers no ordering guarantees like TreeSet.

HashSet:

- Offers constant-time performance for the basic operations (add, remove, contains and size).
- Does not guarantee that the order of elements will remain constant over time.
- Iteration time is proportional to the number of elements plus the capacity of the backing table, so iteration performance depends on the initial capacity and the load factor of the HashSet.
  - It's quite safe to accept the default load factor, but you may want to specify an initial capacity that's about twice the size to which you expect the set to grow.
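
As a minimal sketch of the capacity advice above: the class and method names here (`HashSetCapacityExample`, `buildPresized`, `sum`) are made up for illustration, and the factor of two is just the rule of thumb from the list, not something the API requires.

```java
import java.util.HashSet;
import java.util.Set;

public class HashSetCapacityExample {

    // Builds a HashSet whose initial capacity is about twice the expected
    // number of elements; the default load factor (0.75) is left alone.
    static Set<Integer> buildPresized(int expectedSize) {
        Set<Integer> set = new HashSet<>(expectedSize * 2);
        for (int i = 0; i < expectedSize; i++) {
            set.add(i);
        }
        return set;
    }

    // Iterates over the whole set. The iterator walks the backing table,
    // so a table far larger than the element count makes this slower.
    static long sum(Set<Integer> set) {
        long total = 0;
        for (int value : set) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        Set<Integer> set = buildPresized(1000);
        System.out.println(set.size() + " elements, sum = " + sum(set));
    }
}
```

Pre-sizing avoids rehashing while the set grows, but a heavily oversized table works against iteration speed, so the two goals have to be balanced.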

TreeSet:

- Guarantees log(n) time cost for the basic operations (add, remove and contains).
- Guarantees that the elements of the set will be sorted (in ascending natural order, or in the order defined by the comparator you pass to its constructor).
- Doesn't offer any tuning parameters for iteration performance.
- Offers a few handy methods for dealing with the ordered set, such as first(), last(), headSet() and tailSet().
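
The ordered-set methods listed above can be illustrated with a small example. The class name `TreeSetNavigationExample` and the sample values are arbitrary choices for this sketch.

```java
import java.util.Arrays;
import java.util.SortedSet;
import java.util.TreeSet;

public class TreeSetNavigationExample {

    // Elements are inserted out of order; the TreeSet keeps them sorted.
    static TreeSet<Integer> sample() {
        return new TreeSet<>(Arrays.asList(50, 10, 40, 20, 30));
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = sample();

        System.out.println(set);          // [10, 20, 30, 40, 50]
        System.out.println(set.first());  // 10 (smallest element)
        System.out.println(set.last());   // 50 (largest element)

        // headSet(x): view of the elements strictly less than x
        SortedSet<Integer> head = set.headSet(30);
        System.out.println(head);         // [10, 20]

        // tailSet(x): view of the elements greater than or equal to x
        SortedSet<Integer> tail = set.tailSet(30);
        System.out.println(tail);         // [30, 40, 50]
    }
}
```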

Important points:

- Both guarantee a duplicate-free collection of elements.
- It is generally faster to add elements to a HashSet and then convert the collection to a TreeSet for a duplicate-free sorted traversal.
- None of these implementations is synchronized. That is, if multiple threads access a set concurrently and at least one of the threads modifies the set, it must be synchronized externally.
- LinkedHashSet is in some sense intermediate between HashSet and TreeSet. It is implemented as a hash table with a linked list running through it, so it provides insertion-ordered iteration, which is not the same as the sorted traversal guaranteed by TreeSet.
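
To make the difference between insertion order and sorted order concrete, here is a short sketch. `IterationOrderExample` and its helper methods are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.TreeSet;

public class IterationOrderExample {

    // LinkedHashSet iterates in the order elements were first inserted.
    static List<String> insertionOrder(List<String> input) {
        return new ArrayList<>(new LinkedHashSet<>(input));
    }

    // TreeSet iterates in sorted (here: natural String) order.
    static List<String> sortedOrder(List<String> input) {
        return new ArrayList<>(new TreeSet<>(input));
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("pear", "apple", "fig", "apple");

        System.out.println(insertionOrder(input)); // [pear, apple, fig]
        System.out.println(sortedOrder(input));    // [apple, fig, pear]
    }
}
```

Both sets drop the duplicate "apple"; only the order of the remaining elements differs.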

So the choice depends entirely on your needs, but I feel that even if you need an ordered collection, you should still prefer a HashSet to build the Set and then convert it into a TreeSet.

e.g. Set<String> s = new TreeSet<String>(hashSet);
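
A slightly fuller version of that one-liner might look like the following. This is only a sketch: `SortedCopyExample` and `sortedCopy` are illustrative names, and whether the two-step approach is actually faster for a given workload should be measured rather than assumed.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

public class SortedCopyExample {

    // Collects the elements in a HashSet first (cheap inserts, duplicates
    // dropped), then builds a TreeSet once to get a sorted view.
    static <T extends Comparable<? super T>> SortedSet<T> sortedCopy(Collection<T> elements) {
        Set<T> unique = new HashSet<>(elements);
        return new TreeSet<>(unique);
    }

    public static void main(String[] args) {
        SortedSet<String> sorted =
                sortedCopy(Arrays.asList("delta", "alpha", "charlie", "alpha", "bravo"));
        System.out.println(sorted); // [alpha, bravo, charlie, delta]
    }
}
```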

A TreeSet is backed internally by a TreeMap, which is a red-black tree (a self-balancing type of BST).

An in-order traversal of a BST is O(n).

A HashSet is backed internally by a HashMap, which uses an array to hold Entry objects.

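Here, too, traversal should be O(n), although the iterator also has to step over empty array slots, so the cost grows with the array's capacity as well as with the number of elements.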

Unless you write a benchmark, it is going to be difficult to prove which is faster.
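
A rough starting point for such a benchmark is sketched below. The class name, set size and round counts are arbitrary assumptions, and hand-rolled timing with System.nanoTime() is easily distorted by JIT compilation and garbage collection; a harness such as JMH would give more trustworthy numbers.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class IterationTimingSketch {

    // Touches every element so that the iteration has observable work to do.
    static long sumAll(Set<Integer> set) {
        long total = 0;
        for (int value : set) {
            total += value;
        }
        return total;
    }

    // Returns the wall-clock time in nanoseconds for 'rounds' full iterations.
    static long timeIterationNanos(Set<Integer> set, int rounds) {
        long sink = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            sink += sumAll(set);
        }
        long elapsed = System.nanoTime() - start;
        // Use the accumulated result so the loop's work is not dead code.
        if (sink == Long.MIN_VALUE) {
            System.out.println("unexpected checksum");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 100_000; // arbitrary size chosen for this sketch
        Set<Integer> hashSet = new HashSet<>();
        for (int i = 0; i < n; i++) {
            hashSet.add(i);
        }
        Set<Integer> treeSet = new TreeSet<>(hashSet);

        // Warm-up rounds so the JIT has a chance to compile the hot loop.
        timeIterationNanos(hashSet, 50);
        timeIterationNanos(treeSet, 50);

        System.out.println("HashSet: " + timeIterationNanos(hashSet, 200) + " ns");
        System.out.println("TreeSet: " + timeIterationNanos(treeSet, 200) + " ns");
    }
}
```

Results will vary with the element type, set size, HashSet capacity and JVM version, so treat any single run as indicative at best.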

If you want stable ordering with (nearly) the performance of a HashSet, use a LinkedHashSet. You will still get constant-time operations, whereas I would assume a TreeSet gives you logarithmic time.