Why does HashMap.containsKey run considerably faster than Arrays.binarySearch?
I have two lists of phone numbers. The first list is a subset of the second. I ran the two algorithms below to determine which phone numbers appear in both lists.
Way 2 finished in about 5 seconds, considerably faster than Way 1, which took 39 seconds. I can't understand why.
I would appreciate any comments.
Because a hash lookup is O(1) on average, while binary search is O(log N).
HashMap relies on a technique called 'hashing', which has been in use for many years and is reliable and effective. Essentially, it splits the items in the collection into much smaller groups, each of which can be reached extremely quickly. Once the group is located, a less efficient search mechanism can be used to find the specific item within it.
The group for an item is identified by an algorithm called a 'hash function'. In Java the hashing method is Object.hashCode(), which returns an int from which the group is derived. As long as hashCode is well defined for your class, you should expect HashMap to be very efficient, which is exactly what you've found.
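The grouping idea described above can be sketched in a few lines. This is a simplified illustration only, not the actual java.util.HashMap implementation: the class BucketSketch and its methods are hypothetical names, and the bucket count is fixed rather than resized as a real hash table would do.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of hash bucketing (hypothetical class,
// not the real java.util.HashMap code).
class BucketSketch {
    // Map a key to one of bucketCount groups using its hashCode().
    static int bucketIndex(Object key, int bucketCount) {
        // floorMod keeps the index non-negative even if hashCode() is negative.
        return Math.floorMod(key.hashCode(), bucketCount);
    }

    // Distribute the keys into bucketCount small groups.
    static List<List<String>> group(List<String> keys, int bucketCount) {
        List<List<String>> buckets = new ArrayList<>();
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
        for (String key : keys) {
            buckets.get(bucketIndex(key, bucketCount)).add(key);
        }
        return buckets;
    }

    // Lookup: jump straight to the right group, then scan only that group.
    static boolean contains(List<List<String>> buckets, String key) {
        List<String> bucket = buckets.get(bucketIndex(key, buckets.size()));
        return bucket.contains(key); // the "less efficient" search, on a tiny list
    }
}
```

With enough buckets each group holds only a handful of items, so the linear scan at the end stays cheap no matter how large the whole collection is.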
There's a very good discussion of the various types of Map, and which to use, at "Difference between HashMap, LinkedHashMap and TreeMap".
My rule of thumb is to always use HashMap unless you can't define an appropriate hashCode for your keys or the items need to be ordered (either natural or insertion order).
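To make the ordering part of that rule concrete, here is a small sketch that inserts the same three keys into different Map implementations and returns the iteration order. The class MapOrderSketch and the sample keys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing how iteration order differs between Map types.
class MapOrderSketch {
    // Inserts three keys in a fixed order and returns them in iteration order.
    static List<String> keysOf(Map<String, Integer> map) {
        map.put("banana", 1);
        map.put("apple", 2);
        map.put("cherry", 3);
        return new ArrayList<>(map.keySet());
    }
}
```

Passing a TreeMap yields the keys in natural (sorted) order, a LinkedHashMap yields them in insertion order, and a HashMap makes no ordering guarantee at all, which is the price paid for its fast lookups.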
Look at the source code for HashMap: it computes and stores a hash of the key for each added (key, value) pair; the containsKey() method then calculates the hash of the given key and uses a very fast operation to check whether it is already in the map. So most retrieval operations are very fast.
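Because containsKey() depends on the key's hashCode() and equals(), a custom key class has to define both consistently. The following is a minimal sketch under that assumption; PhoneNumber and ContainsKeySketch are hypothetical types invented for this example and do not come from the question.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type, used only for illustration.
final class PhoneNumber {
    private final String countryCode;
    private final String number;

    PhoneNumber(String countryCode, String number) {
        this.countryCode = countryCode;
        this.number = number;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PhoneNumber)) return false;
        PhoneNumber other = (PhoneNumber) o;
        return countryCode.equals(other.countryCode) && number.equals(other.number);
    }

    // Equal objects must produce equal hash codes, so the same fields
    // used in equals() are used here.
    @Override
    public int hashCode() {
        return Objects.hash(countryCode, number);
    }
}

class ContainsKeySketch {
    // Builds a small map keyed by PhoneNumber.
    static Map<PhoneNumber, String> sample() {
        Map<PhoneNumber, String> owners = new HashMap<>();
        owners.put(new PhoneNumber("84", "901234567"), "Alice");
        return owners;
    }
}
```

A lookup with a fresh but equal key, such as new PhoneNumber("84", "901234567"), succeeds because both objects hash to the same group and then compare equal. Without the hashCode() override, the lookup would normally fail even though the two keys look identical.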
Way 1:
Sorting: around O(n log n)
Search: around O(log n) per lookup
Way 2:
Creating the hash table: O(n) at low density (no collisions)
Contains: O(1) per lookup
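For reference, the two approaches compared above might look roughly like this. The asker's original code is not shown here, so this is only a sketch under assumptions: the class IntersectionSketch, its method names, and the use of String for phone numbers are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reconstruction of the two approaches; not the asker's code.
class IntersectionSketch {
    // Way 1: sort the big list once, then binary-search each number.
    static List<String> viaBinarySearch(List<String> small, List<String> big) {
        String[] sorted = big.toArray(new String[0]);
        Arrays.sort(sorted);                              // O(n log n), done once
        List<String> common = new ArrayList<>();
        for (String number : small) {
            if (Arrays.binarySearch(sorted, number) >= 0) { // O(log n) per lookup
                common.add(number);
            }
        }
        return common;
    }

    // Way 2: index the big list in a HashMap, then call containsKey.
    static List<String> viaHashMap(List<String> small, List<String> big) {
        Map<String, Boolean> index = new HashMap<>();
        for (String number : big) {
            index.put(number, Boolean.TRUE);              // O(n) in total
        }
        List<String> common = new ArrayList<>();
        for (String number : small) {
            if (index.containsKey(number)) {              // O(1) on average
                common.add(number);
            }
        }
        return common;
    }
}
```

Both methods return the same result; they differ only in the cost of each lookup. Note that Arrays.binarySearch requires the array to be sorted first, and that sorting inside the loop instead of once up front would make Way 1 far slower.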