Performance of extra string comparisons vs HashMap lookups

Question

Assume I am running either of the code snippets below for a list of 1000 Event entries (in allEventsToAggregate ). Would I see a performance improvement in the first implementation if the events in allEventsToAggregate are sorted by customerId , with each customer having roughly 3 events? This is essentially a question of string comparison vs. HashMap lookup performance.

Option 1:

Map<String, List<Event>> eventsByCust = new HashMap<String, List<Event>>();
List<Event> thisCustEntries;
String lastCust = null;
for (Event thisEvent : allEventsToAggregate) {
    if (!thisEvent.getCustomerId().equals(lastCust)) {
        thisCustEntries = eventsByCust.get(thisEvent.getCustomerId());
        if (thisCustEntries == null) {
            thisCustEntries = new ArrayList<Event>();
        }
    }
    thisCustEntries.add(thisEvent);
    eventsByCust.put(thisEvent.getCustomerId(), thisCustEntries);
    lastCust = thisEvent.getCustomerId();
}

Option 2:

Map<String, List<Event>> eventsByCust = new HashMap<String, List<Event>>();
for (Event thisEvent : allEventsToAggregate) {
    List<Event> thisCustEntries = eventsByCust.get(thisEvent.getCustomerId());
    if (thisCustEntries == null) {
        thisCustEntries = new ArrayList<Event>();
    }
    thisCustEntries.add(thisEvent);
}

Answer 1

Would I see a performance improvement

Almost certainly not. Unless this block represents a critical inner loop of your application, any marginal performance gains will almost certainly be unnoticeable.

Consequently, I would go with the second version of the code, as its a clearer expression of your intent and so will be easier to maintain (as well as being slightly less prone to subtle bugs in the first place). Maintainability almost certainly trumps making the application 0.001% faster.

Answer 2

1) Remember that a successful retrieval of an item from a HashMap requires a string compare to confirm that you really have found the correct item.

2) We seem to be talking about very small differences in execution time, not real algorithmic improvements. Is it really worth losing readability for this?

3) For small differences, the only way to really know will be to actually time the thing in practice - in fact not only to run a comparison, but to organise it as a fully fledged scientific experiment. There is just too much too to worry about these days about what your compiler and run time system has chosen to optimise, what cpu caching or VM page faulting means, and what Java garbage collection thinks of your algorithm. Then, of course, you may well find that you get different answers for different versions of Java or on hardware with different cpus, motherboards, or memory sizes, or even how long the system has been running and so how much time it has had to migrate its disk contents into memory cache, or JIT-compile relevant bits of Java, or whatever.

Performance of extra string comparisons vs HashMap lookups

Question

2 answers

solution1
3 ACCPTED 2010-08-06 16:55:07

solution2
2 2010-08-07 05:14:15

Performance of extra string comparisons vs HashMap lookups

Question

2 answers

solution1 3 ACCPTED 2010-08-06 16:55:07

solution2 2 2010-08-07 05:14:15

solution1
3 ACCPTED 2010-08-06 16:55:07

solution2
2 2010-08-07 05:14:15