Java optimization, gain from hashMap?

I've been given some lovely Java code that has a lot of things like this (in a loop that executes about 1.5 million times):

code = getCode();
for (int intCount = 1; intCount < vA.size() + 1; intCount++)
{
   oA = (A)vA.elementAt(intCount - 1);
   if (oA.code.trim().equals(code))
       currentName= oA.name;
}

Would I see a significant increase in speed from switching to something like the following?

code = getCode();
//AMap is a HashMap
strCurrentAAbbreviation = (String)AMap.get(code);

Edit: The size of vA is approximately 50. The trim shouldn't even be necessary, but it would definitely be nice to call it 50 times instead of 50 * 1.5 million times. The items in vA are unique.

Edit: At the suggestion of several responders, I tested it. Results are at the bottom. Thanks guys.

There's only one way to find out.

Ok, ok, I tested it.

Results follow for your enlightenment:

Looping: 18391ms  Hash: 218ms

Looping: 18735ms  Hash: 234ms

Looping: 18359ms  Hash: 219ms

I think I will be refactoring that bit...

The framework:

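import java.util.HashMap;
import java.util.Random;
import java.util.Vector;

// Note: the helper classes A (String fields code and value) and StopWatch
// (start(), stop(), getElapsedTime()) are not included in this post.
public class OptimizationTest {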
    private static Random r = new Random();
    public static void main(String[] args){
        final long loopCount = 1000000;
        final int listSize = 55;

        long loopTime = TestByLoop(loopCount, listSize);
        long hashTime = TestByHash(loopCount, listSize);
        System.out.println("Looping: " + loopTime + "ms");
        System.out.println("Hash: " + hashTime + "ms");
    }

    public static long TestByLoop(long loopCount, int listSize){
        Vector vA = buildVector(listSize);
        A oA;

        StopWatch sw = new StopWatch();
        sw.start();
        for (long i = 0; i< loopCount; i++){
            String strCurrentStateAbbreviation;
            int j = r.nextInt(listSize);
            for (int intCount = 1; intCount < vA.size() + 1; intCount++){
                oA = (A)vA.elementAt(intCount - 1);
                if (oA.code.trim().equals(String.valueOf(j)))
                    strCurrentStateAbbreviation = oA.value;
            }
        }
        sw.stop();
        return sw.getElapsedTime();
    }

    public static long TestByHash(long loopCount, int listSize){
        HashMap hm = getMap(listSize);
        StopWatch sw = new StopWatch();
        sw.start();
        String strCurrentStateAbbreviation;
        for (long i = 0; i < loopCount; i++){
            int j = r.nextInt(listSize);
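            // The map is keyed by String (see getMap), so look up by String.
            // The originally posted hm.get(j) boxed j to an Integer and never matched.
            strCurrentStateAbbreviation = (String)hm.get(String.valueOf(j));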
        }
        sw.stop();
        return sw.getElapsedTime();
    }

    private static HashMap getMap(int listSize) {
        HashMap hm = new HashMap();
        for (int i = 0; i < listSize; i++){
            String code = String.valueOf(i);
            String value = getRandomString(2);
            hm.put(code, value);
        }
        return hm;
    }

    public static Vector buildVector(long listSize) 
    {
        Vector v = new Vector();
        for (int i = 0; i < listSize; i++){
            A a = new A();
            a.code = String.valueOf(i);
            a.value = getRandomString(2);
            v.add(a);
        }
        return v;
    }

    public static String getRandomString(int length){
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i< length; i++){
            sb.append(getChar());
        }
        return sb.toString();
    }

    public static char getChar()
    {
        final String alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
        int i = r.nextInt(alphabet.length());
        return alphabet.charAt(i);
    }
}
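
The framework refers to two helper classes, A and StopWatch, that are not shown. A minimal sketch of what they might look like follows, so the test can be compiled. The field and method names come from the code above, but the bodies are assumptions (in particular, that getElapsedTime() returns milliseconds, which matches the "ms" in the printed results).

```java
// Hypothetical stand-ins for the two helper classes the post does not show.
class A {
    String code;   // lookup key, e.g. "17"
    String value;  // payload returned by the lookup
}

class StopWatch {
    private long startNanos;
    private long elapsedNanos;

    void start() {
        startNanos = System.nanoTime();
    }

    void stop() {
        elapsedNanos = System.nanoTime() - startNanos;
    }

    // Elapsed time between start() and stop(), in milliseconds (assumed unit).
    long getElapsedTime() {
        return elapsedNanos / 1_000_000;
    }
}
```

System.nanoTime() is used here rather than System.currentTimeMillis() because it is intended for measuring elapsed time; either would do for runs that take seconds.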

Eh, there's a good chance that you would, yes. Retrieval from a HashMap is going to be constant time on average if you have good hash codes.

But the only way you can really find out is by trying it.

It depends on how large your map is and how good your hashCode implementation is (such that you don't get collisions).

You should really do some real profiling to be sure any modification is needed, as you may end up spending your time fixing something that is not broken.

What actually stands out to me a bit more than the elementAt call is the string trimming you are doing with each iteration. My gut tells me that might be a bigger bottleneck, but only profiling can really tell.

Good luck

I'd say yes, since the above appears to be a linear search over vA.size(). How big is vA?

Why don't you use something like YourKit (or insert another profiler) to see how expensive this part of the loop is?

Using a Map would certainly be an improvement that helps with maintaining that code later on.

Whether you can use a map depends on whether the (vector?) contains unique codes or not. The for loop given would remember the last object in the list with a given code, so if codes are not unique, a hash may not be a drop-in solution.
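
On the duplicate-code concern, here is a small sketch, assuming the map is filled by walking the list in order. HashMap.put replaces any earlier value stored under the same key, so the last entry for a code wins, which is the same result the loop produces. The class name LastWins and the sample data are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LastWins {
    // Builds a code -> name map from {code, name} pairs. put() replaces any
    // earlier value for the same key, so each code ends up mapped to its
    // LAST name in list order.
    static Map<String, String> index(List<String[]> codeNamePairs) {
        Map<String, String> byCode = new HashMap<>();
        for (String[] pair : codeNamePairs) {
            byCode.put(pair[0].trim(), pair[1]);
        }
        return byCode;
    }

    public static void main(String[] args) {
        List<String[]> pairs = Arrays.asList(
            new String[] {"NY", "first"},
            new String[] {"CA", "only"},
            new String[] {"NY", "last"});
        System.out.println(index(pairs).get("NY")); // prints "last"
    }
}
```

If the first match were wanted instead, putIfAbsent could be used in place of put.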

For small (stable) list sizes, simply converting the list to an array of objects would show a performance increase on top of somewhat better readability.

If none of the above holds, at least use an iterator to inspect the list, giving better readability and some (probable) performance increase.
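
The array and iterator suggestions could look roughly like the sketch below. It assumes a generic Vector<A> and an element class A with code and name fields as in the question; the class name ArrayScan and the helper names are hypothetical. Whether this is measurably faster would still need to be profiled.

```java
import java.util.Vector;

public class ArrayScan {
    // Hypothetical element type with the two fields used in the question.
    static class A {
        String code;
        String name;

        A(String code, String name) {
            this.code = code;
            this.name = name;
        }
    }

    // Copies the Vector into a typed array once, outside the hot loop.
    static A[] snapshot(Vector<A> vA) {
        return vA.toArray(new A[0]);
    }

    // Same semantics as the original loop: the last matching entry wins.
    static String findName(A[] items, String code) {
        String currentName = null;
        for (A item : items) {
            if (item.code.trim().equals(code)) {
                currentName = item.name;
            }
        }
        return currentName;
    }
}
```

The same enhanced for loop also works directly on the Vector, where it uses an Iterator behind the scenes and removes the intCount - 1 index arithmetic.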

Depends. How much memory do you have?

I would guess much faster, but profile it.

I think the dominant factor here is how big vA is, since the loop needs to run n times, where n is the size of vA. With the map, there is no loop, no matter how big vA is. So if n is small, the improvement will be small. If it is huge, the improvement will be huge. This is especially true because even after finding the matching element the loop keeps going! So if you find your match at element 1 of a 2 million element list, you still need to check the last 1,999,999 elements!
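
If the list has to stay, stopping at the first match avoids the wasted comparisons. A minimal sketch, assuming the codes are unique (so first match and last match are the same entry) and an element class A with code and name fields as in the question; the class name EarlyExit is made up:

```java
import java.util.List;

public class EarlyExit {
    // Hypothetical element type with the two fields used in the question.
    static class A {
        String code;
        String name;

        A(String code, String name) {
            this.code = code;
            this.name = name;
        }
    }

    // Returns the name of the first entry whose trimmed code matches, or null.
    static String findName(List<A> vA, String code) {
        for (A oA : vA) {
            if (oA.code.trim().equals(code)) {
                return oA.name; // stop scanning as soon as a match is found
            }
        }
        return null;
    }
}
```

This is still a linear search, so on average it only halves the work for codes that are present; a lookup for a missing code scans the whole list.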

Yes, it'll almost certainly be faster. Looping an average of 25 times (half-way through your 50) is slower than a hashmap lookup, assuming your vA contents are decently hashable.

However, speaking of your vA contents, you'll have to trim them as you insert them into your aMap, because aMap.get("somekey") will not find an entry whose key is "somekey ".

Actually, you should do that as you insert into vA, even if you don't switch to the hashmap solution.
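
Building the map once with trimmed keys might look like the sketch below. The names aMap, vA, code and name follow the question; the class name CodeIndex, the constructor and the sample data are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CodeIndex {
    // Hypothetical element type with the two fields used in the question.
    static class A {
        String code;
        String name;

        A(String code, String name) {
            this.code = code;
            this.name = name;
        }
    }

    // Builds the lookup table once. Keys are trimmed here, so trim() runs
    // once per element instead of once per element per lookup.
    static Map<String, String> buildMap(List<A> vA) {
        Map<String, String> aMap = new HashMap<>();
        for (A oA : vA) {
            aMap.put(oA.code.trim(), oA.name);
        }
        return aMap;
    }

    public static void main(String[] args) {
        List<A> vA = Arrays.asList(new A("somekey ", "Some Name"));
        Map<String, String> aMap = buildMap(vA);
        System.out.println(aMap.get("somekey"));  // prints "Some Name"
        System.out.println(aMap.get("somekey ")); // prints "null"
    }
}
```

Since Vector implements List, the existing vA can be passed to buildMap as it is, provided it is declared as Vector<A>.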
