
Hashmap vs Array performance

Is it (performance-wise) better to use Arrays or HashMaps when the indexes of the Array are known? Keep in mind that the 'objects array/map' in the example is just an example; in my real project it is generated by another class, so I can't use individual variables.

Array example:

SomeObject[] objects = new SomeObject[2];
objects[0] = new SomeObject("Obj1");
objects[1] = new SomeObject("Obj2");

void doSomethingToObject(String Identifier){
    SomeObject object;
    if(Identifier.equals("Obj1")){
        object=objects[0];
    }else if(Identifier.equals("Obj2")){
        object=objects[1];
    }
    //do stuff
}

HashMap example:

HashMap objects = new HashMap();
objects.put("Obj1",new SomeObject());
objects.put("Obj2",new SomeObject());

void doSomethingToObject(String Identifier){
    SomeObject object = (SomeObject) objects.get(Identifier);
    //do stuff
}

The HashMap version looks much, much better, but I really need performance here, so that takes priority.

EDIT: Well, arrays it is then; suggestions are still welcome.

EDIT: I forgot to mention, the size of the Array/HashMap is always the same (6).

EDIT: It appears that HashMaps are faster. Array: 128ms, Hash: 103ms

When using fewer cycles, the HashMap was even twice as fast.

Test code:

import java.util.HashMap;
import java.util.Random;

public class Optimizationsest {
private static Random r = new Random();

private static HashMap<String,SomeObject> hm = new HashMap<String,SomeObject>();
private static SomeObject[] o = new SomeObject[6];

private static String[] Indentifiers = {"Obj1","Obj2","Obj3","Obj4","Obj5","Obj6"};

private static int t = 1000000;

public static void main(String[] args){
    CreateHash();
    CreateArray();
    long loopTime = ProcessArray();
    long hashTime = ProcessHash();
    System.out.println("Array: " + loopTime + "ms");
    System.out.println("Hash: " + hashTime + "ms");
}

public static void CreateHash(){
    for(int i=0; i <= 5; i++){
        hm.put("Obj"+(i+1), new SomeObject());
    }
}

public static void CreateArray(){
    for(int i=0; i <= 5; i++){
        o[i]=new SomeObject();
    }
}

public static long ProcessArray(){
    StopWatch sw = new StopWatch();
    sw.start();
    for(int i = 1;i<=t;i++){
        checkArray(Indentifiers[r.nextInt(6)]);
    }
    sw.stop();
    return sw.getElapsedTime();
}



private static void checkArray(String Identifier) {
    SomeObject object;
    if(Identifier.equals("Obj1")){
        object=o[0];
    }else if(Identifier.equals("Obj2")){
        object=o[1];
    }else if(Identifier.equals("Obj3")){
        object=o[2];
    }else if(Identifier.equals("Obj4")){
        object=o[3];
    }else if(Identifier.equals("Obj5")){
        object=o[4];
    }else if(Identifier.equals("Obj6")){
        object=o[5];
    }else{
        object = new SomeObject();
    }
    object.kill();
}

public static long ProcessHash(){
    StopWatch sw = new StopWatch();
    sw.start();
    for(int i = 1;i<=t;i++){
        checkHash(Indentifiers[r.nextInt(6)]);
    }
    sw.stop();
    return sw.getElapsedTime();
}

private static void checkHash(String Identifier) {
    SomeObject object = (SomeObject) hm.get(Identifier);
    object.kill();
}

}

HashMap uses an array underneath, so it can never be faster than using an array correctly.

Random.nextInt() is many times slower than what you are testing; even using an array to pick the test keys is going to bias your results.

The reason your array benchmark is so slow is the equals comparisons, not the array access itself.
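As a rough illustration of what "using an array correctly" could look like: if the identifier can be turned into an index once, up front, the hot path becomes a plain array access with no equals() calls. The sketch below assumes identifiers always have the form "Obj1" to "Obj6", as in the question; the class name, the helper names, and the string payloads are made up for the example.

```java
// Sketch only: assumes identifiers always look like "Obj1".."Obj6".
class DirectIndexLookup {
    // Stand-in payloads; the question stores SomeObject instances instead.
    static final String[] objects = {"first", "second", "third", "fourth", "fifth", "sixth"};

    // Convert "ObjN" to the zero-based index N-1. Do this once, outside any hot loop.
    static int indexOf(String identifier) {
        return Integer.parseInt(identifier.substring(3)) - 1;
    }

    // The hot path: a single bounds-checked array access, no string comparisons.
    static String get(int index) {
        return objects[index];
    }

    public static void main(String[] args) {
        int idx = indexOf("Obj4");
        System.out.println(get(idx)); // prints "fourth"
    }
}
```

If the caller can hold on to the int index instead of the String, the lookup by name disappears from the hot path entirely.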

HashTable is usually much slower than HashMap because it does much the same thing but is also synchronized.

A common problem with micro-benchmarks is the JIT, which is very good at removing code that doesn't do anything. If you are not careful, you will only be testing whether you have confused the JIT enough that it cannot work out that your code doesn't do anything.
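The two benchmarking concerns raised in this answer (Random.nextInt() inside the timed loop, and the JIT discarding work whose result is never used) can be sketched roughly as follows. This is a minimal hand-rolled sketch, not a rigorous harness: the class and method names are invented, the map holds Integer values rather than SomeObject, and the warm-up count of 10 is an arbitrary assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

class LookupBenchmarkSketch {
    static final String[] IDS = {"Obj1", "Obj2", "Obj3", "Obj4", "Obj5", "Obj6"};

    // Pre-generate the keys so Random.nextInt() is not part of the timed region.
    static String[] randomKeys(int n, long seed) {
        Random r = new Random(seed);
        String[] keys = new String[n];
        for (int i = 0; i < n; i++) {
            keys[i] = IDS[r.nextInt(IDS.length)];
        }
        return keys;
    }

    // Return a checksum so the lookups feed a value that is actually used.
    static long sumLookups(Map<String, Integer> map, String[] keys) {
        long sum = 0;
        for (String k : keys) {
            sum += map.get(k);
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < IDS.length; i++) {
            map.put(IDS[i], i + 1);
        }
        String[] keys = randomKeys(1_000_000, 42L);

        long sink = 0;
        for (int i = 0; i < 10; i++) {
            sink += sumLookups(map, keys); // untimed warm-up runs
        }

        long start = System.nanoTime();
        sink += sumLookups(map, keys);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        // Printing the checksum keeps the result observable.
        System.out.println("Hash: " + elapsedMs + "ms (checksum " + sink + ")");
    }
}
```

The same structure would apply to the array variant: time only the lookups, and make the loop produce a value that is printed afterwards.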

This is one of the reasons you can write micro-benchmarks which outperform C++ systems. This is because Java is a simpler language and easier to reason about, which makes it easier to detect code that does nothing useful. This can lead to tests which show that Java does "nothing useful" much faster than C++ ;)

Arrays, when the indexes are known, are faster (HashMap uses an array of linked lists behind the scenes, which adds a bit of overhead on top of the array accesses, not to mention the hashing operations that need to be done).
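To make the "array of linked lists" description concrete, here is a deliberately simplified sketch of a bucket-array map. It is not the actual java.util.HashMap source (which also resizes and handles null keys, among other things); the class name TinyHashMap and the fixed size of 16 buckets are assumptions for the example.

```java
// Simplified illustration of a hash map built on an array of linked lists.
class TinyHashMap {
    static final class Node {
        final String key;
        Object value;
        Node next;

        Node(String key, Object value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Fixed power-of-two size; this sketch never resizes.
    private final Node[] buckets = new Node[16];

    // Extra work compared with a raw array: hash the key, then reduce it to a bucket index.
    private int bucketIndex(String key) {
        int h = key.hashCode();
        return (h ^ (h >>> 16)) & (buckets.length - 1);
    }

    void put(String key, Object value) {
        int i = bucketIndex(key);
        for (Node n = buckets[i]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                n.value = value; // overwrite an existing entry
                return;
            }
        }
        buckets[i] = new Node(key, value, buckets[i]); // prepend to the chain
    }

    Object get(String key) {
        // One array access, then a walk along the chain comparing keys.
        for (Node n = buckets[bucketIndex(key)]; n != null; n = n.next) {
            if (n.key.equals(key)) {
                return n.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        TinyHashMap map = new TinyHashMap();
        map.put("Obj1", "a");
        System.out.println(map.get("Obj1"));    // prints "a"
        System.out.println(map.get("missing")); // prints "null"
    }
}
```

Each get() costs a hashCode() call, an index computation, one array read, and at least one equals() on a hit, which is the overhead on top of a bare array access.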

And FYI, HashMap<String,SomeObject> objects = new HashMap<String,SomeObject>(); makes it so you won't have to cast.
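A small sketch of the generic declaration in use. The nested SomeObject class here is a stand-in invented for the example, since the question does not show the real one; the point is only that get() returns the value type directly.

```java
import java.util.HashMap;

class TypedMapExample {
    // Hypothetical stand-in for the question's SomeObject.
    static final class SomeObject {
        final String name;

        SomeObject(String name) {
            this.name = name;
        }
    }

    static HashMap<String, SomeObject> build() {
        HashMap<String, SomeObject> objects = new HashMap<String, SomeObject>();
        objects.put("Obj1", new SomeObject("Obj1"));
        objects.put("Obj2", new SomeObject("Obj2"));
        return objects;
    }

    public static void main(String[] args) {
        HashMap<String, SomeObject> objects = build();
        SomeObject object = objects.get("Obj2"); // no (SomeObject) cast needed
        System.out.println(object.name);         // prints "Obj2"
    }
}
```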

For the example shown, HashTable wins, I believe. The problem with the array approach is that it doesn't scale. I imagine you want to have more than two entries in the table, and the condition branch tree in doSomethingToObject will quickly get unwieldy and slow.

Logically, HashMap is definitely a fit for your case. From a performance standpoint it also wins, since with arrays you need to do a number of string comparisons (in your algorithm), while a HashMap just uses the hash code, provided the load factor is not too high. Both an array and a HashMap need to be resized if you add many elements, but a HashMap also has to redistribute its elements. In that use case HashMap loses.

Please, never, ever use extended if / else if / else if / else if / else if / else if cases like that. The reason I repeated it so many times is just to make you feel like your Java interpreter does when it hits code blocks like that.

As soon as you have more than one else if, use either a HashMap or a switch / case (Java 7 will let you do it on Strings; in Java 6 you have to use an enum). An even better solution for read-only lookups is an ImmutableMap from a library like Guava; they have highly optimized reads because they don't allow writes.
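As a sketch of the switch / case alternative on Strings (Java 7 or later), the if / else if chain from the question could be collapsed into something like the following. The class name, the indexFor helper, and the -1 "not found" convention are assumptions for the example.

```java
// Sketch: a String switch (Java 7+) replacing the if / else if chain.
class SwitchLookup {
    // Returns the array index for an identifier, or -1 if it is unknown.
    // Note: switching on a null String throws NullPointerException.
    static int indexFor(String identifier) {
        switch (identifier) {
            case "Obj1": return 0;
            case "Obj2": return 1;
            case "Obj3": return 2;
            case "Obj4": return 3;
            case "Obj5": return 4;
            case "Obj6": return 5;
            default:     return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(indexFor("Obj3"));    // prints "2"
        System.out.println(indexFor("unknown")); // prints "-1"
    }
}
```

Whether this beats a HashMap for six keys is something only a careful measurement on the target JVM can answer; the main gain here is readability.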

Arrays will usually be faster than Collections classes.

PS. You mentioned HashTable in your post. HashTable has even worse performance than HashMap. I assume your mention of HashTable was a typo:

"The HashTable one looks much much better"

The example is strange. The key question is whether your data is dynamic. If it is, you could not write your program that way (as in the array case). In other words, the comparison between your array and hash implementations is not fair. The hash implementation works for dynamic data, but the array implementation does not.

If you only have static data (6 fixed objects), the array or hash just works as a data holder. You could even define static objects.
