
Threading issues in a Java HashMap

Something happened that I'm not sure should be possible. Obviously it is, because I've seen it, but I need to find the root cause & I was hoping you all could help.

We have a system that looks up the latitude & longitude for a zip code. Rather than hit that system on every request, we cache the results in a cheap in-memory hash-map cache, since the lat & long of a zip code tend to change less often than we release.

Anyway, the hash is surrounded by a class that has a "get" and "add" method that are both synchronized. We access this class as a singleton.

I'm not claiming this is the best setup, but it's where we're at. (I plan to change to wrap the Map in a Collections.synchronizedMap() call ASAP.)

We use this cache in a multi-threaded environment, where we thread 2 calls for 2 zips (so we can calculate the distance between the two). These sometimes happen at very nearly the same time, so it's very possible that both calls access the map at the same time.

Just recently we had an incident where two different zip codes returned the same value. Assuming that the initial values were actually different, is there any way that writing the values into the Map would cause the same value to be written for two different keys? Or, is there any way that 2 "gets" could cross wires and accidentally return the same value?

The only other explanation I have is that the initial data was corrupt (wrong values), but it seems very unlikely.

Any ideas would be appreciated. Thanks, Peter

(PS: Let me know if you need more info, code, etc.)

public class InMemoryGeocodingCache implements GeocodingCache
{

private Map cache = new HashMap();
private static GeocodingCache instance = new InMemoryGeocodingCache();

public static GeocodingCache getInstance()
{
    return instance;
}

public synchronized LatLongPair get(String zip)
{
    return (LatLongPair) cache.get(zip);
}

public synchronized boolean has(String zip)
{
    return cache.containsKey(zip);
}

public synchronized void add(String zip, double lat, double lon)
{
    cache.put(zip, new LatLongPair(lat, lon));
}
}


public class LatLongPair {
double lat;
double lon;

LatLongPair(double lat, double lon)
{
    this.lat = lat;
    this.lon = lon;
}

public double getLatitude()
{
    return this.lat;
}

public double getLongitude()
{
    return this.lon;
}
}

The code looks correct.

The only concern is that lat and lon are package-visible, so code in the same package can do the following:

LatLongPair llp = InMemoryGeocodingCache.getInstance().get(ZIP1);
llp.lat = x;
llp.lon = y;

which will obviously modify the in-cache object.

So make lat and lon final too.
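A minimal sketch of what that could look like. The class name ImmutableLatLongPair is made up for illustration; the point is only that the fields are private and final, so nothing outside the constructor can change an instance once it sits in the cache.

```java
// Hypothetical immutable variant of LatLongPair (the name is an assumption).
final class ImmutableLatLongPair {
    // private + final: assigned once in the constructor, never again.
    private final double lat;
    private final double lon;

    ImmutableLatLongPair(double lat, double lon) {
        this.lat = lat;
        this.lon = lon;
    }

    double getLatitude() {
        return lat;
    }

    double getLongitude() {
        return lon;
    }
}
```

With this shape, the llp.lat = x assignment from the snippet above no longer compiles, which turns a silent cache corruption into a build error.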

PS: Since your key (the zip code) is unique and small, there is no need to compute a hash on every operation. It's easier to use a TreeMap (wrapped in Collections.synchronizedMap()).

PPS: A practical approach: write a test with two threads doing put/get operations in a never-ending loop, validating the result on every get. You would need a multi-CPU machine for that, though.
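Here is a rough sketch of such a test. Everything in it is an assumption for illustration: the CacheStressTest name, the stand-in cache (synchronized get/add around a HashMap, like the posted class), and a bounded iteration count instead of a never-ending loop so that it terminates. Each thread writes keys only it owns and derives the expected value from the key, so any mismatch on get points at the cache rather than at the test data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical harness; not part of the original code.
final class CacheStressTest {
    // Stand-in for the cache under test: synchronized accessors around a HashMap.
    private final Map<String, double[]> cache = new HashMap<>();

    synchronized void add(String zip, double lat, double lon) {
        cache.put(zip, new double[] {lat, lon});
    }

    synchronized double[] get(String zip) {
        return cache.get(zip);
    }

    // Runs two writer/reader threads and returns how many gets saw a wrong value.
    static int run(final int iterations) {
        final CacheStressTest underTest = new CacheStressTest();
        final AtomicInteger mismatches = new AtomicInteger();
        Thread[] workers = new Thread[2];
        for (int t = 0; t < workers.length; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    int n = i % 1000;
                    String zip = id + "-" + n;       // keys are disjoint per thread
                    double lat = id * 1000 + n + 1;  // expected value derived from the key
                    double lon = lat + 0.5;
                    underTest.add(zip, lat, lon);
                    double[] got = underTest.get(zip);
                    if (got == null || got[0] != lat || got[1] != lon) {
                        mismatches.incrementAndGet();
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return mismatches.get();
    }
}
```

Calling CacheStressTest.run(100000) against the synchronized stand-in should report zero mismatches. Swapping in the real cache, or removing synchronized to see what an unprotected HashMap does, is the interesting part; an unsynchronized run may or may not misbehave on any given attempt.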

Why it's happening is hard to tell. More code could help.

You should probably just be using a ConcurrentHashMap anyway. In general it will be more efficient than a synchronized Map. You don't synchronize access to it yourself; it handles that internally (more efficiently than you could).
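For what it's worth, a sketch of the cache on top of ConcurrentHashMap might look like the following. The names ConcurrentGeocodingCache and Coordinates are invented here, and using putIfAbsent in add (so the first value stored for a zip wins) is my assumption about the desired behaviour, not something the original code does.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical ConcurrentHashMap-backed variant of the cache.
final class ConcurrentGeocodingCache {

    // Immutable value type, so cached entries cannot be changed after insertion.
    static final class Coordinates {
        final double lat;
        final double lon;

        Coordinates(double lat, double lon) {
            this.lat = lat;
            this.lon = lon;
        }
    }

    private final ConcurrentMap<String, Coordinates> cache = new ConcurrentHashMap<>();

    // No synchronized keyword: the map handles concurrent access itself.
    Coordinates get(String zip) {
        return cache.get(zip);
    }

    boolean has(String zip) {
        return cache.containsKey(zip);
    }

    // putIfAbsent is atomic: if two threads add the same zip, the first value is kept.
    void add(String zip, double lat, double lon) {
        cache.putIfAbsent(zip, new Coordinates(lat, lon));
    }
}
```

One difference to keep in mind: ConcurrentHashMap rejects null keys and values with a NullPointerException, whereas HashMap accepts them.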

One thing to look out for is if the key or the value might be changing, for instance if instead of making a new object for each insertion, you're just changing the values of an existing object and re-inserting it.
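To make that failure mode concrete, here is a small sketch with made-up names (SharedValueDemo, MutablePair) and placeholder coordinates. It reuses one mutable object for two insertions, and afterwards both keys return the same, last-written value, which is exactly the symptom described in the question. No threads are needed to reproduce it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo of the "reused value object" bug; not taken from the original code.
final class SharedValueDemo {

    static final class MutablePair {
        double lat;
        double lon;
    }

    static Map<String, MutablePair> buildWithReuse() {
        Map<String, MutablePair> cache = new HashMap<>();
        MutablePair scratch = new MutablePair();

        // Placeholder numbers, not real geocodes.
        scratch.lat = 1.0;
        scratch.lon = 2.0;
        cache.put("11111", scratch);

        // Same object mutated and inserted under a second key:
        // the map stores references, so both keys now see these values.
        scratch.lat = 3.0;
        scratch.lon = 4.0;
        cache.put("22222", scratch);

        return cache;
    }
}
```

After buildWithReuse(), get("11111") and get("22222") return the very same object, so the first zip silently reports the second zip's coordinates.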

You also want to make sure that the key object defines both hashCode and equals in such a way that you don't violate the HashMap contract (i.e. if equals returns true, the hashCodes need to be the same, but not necessarily vice versa).

Is it possible the LatLongPair is being modified? I'd suggest making the lat and lon fields final so that they cannot be accidentally modified elsewhere in the code.

Note: you should also make your singleton "instance" and the map reference "cache" final.

James is correct. Since you are handing back an Object, its internals could be modified, and anything holding a reference to that Object (such as the Map) will reflect that change. Final is a good answer.

Here is the java doc on HashMap:

http://docs.oracle.com/javase/7/docs/api/java/util/HashMap.html

Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. (A structural modification is any operation that adds or deletes one or more mappings; merely changing the value associated with a key that an instance already contains is not a structural modification.) This is typically accomplished by synchronizing on some object that naturally encapsulates the map. If no such object exists, the map should be "wrapped" using the Collections.synchronizedMap method. This is best done at creation time, to prevent accidental unsynchronized access to the map:

Map m = Collections.synchronizedMap(new HashMap(...));

Or better, use java.util.concurrent.ConcurrentHashMap

I don't really see anything wrong with the code you posted that would cause the problem you described. My guess would be that the problem lies in the client of your geocode cache.

Other things to consider (some of these are pretty obvious, but I figured I'd point them out anyway):

  1. Which two zip codes were you having problems with? Are you sure they don't have identical geocodes in the source system?
  2. Are you sure you aren't accidentally comparing two identical zip codes?

The presence of the has(String ZIP) method implies that you have something like the following in your code:

GeocodingCache cache = InMemoryGeocodingCache.getInstance();

if (!cache.has(ZIP)) {
    cache.add(ZIP, x, y);
}

Unfortunately, this opens you up to a race: has() and add() are each synchronized, but the pair is not atomic, so another thread can add the same zip between has() returning false and add() running. That could result in the issue you described.

A better solution would be to move the check inside the add method so the check and update are covered by the same lock like:

public synchronized void add(String zip, double lat, double lon) {
    if (cache.containsKey(zip)) return;
    cache.put(zip, new LatLongPair(lat, lon));
}
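If you go the ConcurrentHashMap route instead, the same "check and update under one lock" idea can be sketched with computeIfAbsent, which removes the need for callers to call has() at all. The class name LoadingZipCache and the lookup function are assumptions made for this example; the lookup stands in for whatever slow geocoding call fills the cache.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical get-or-load cache; the lookup function is supplied by the caller.
final class LoadingZipCache {
    private final ConcurrentMap<String, double[]> cache = new ConcurrentHashMap<>();
    private final Function<String, double[]> lookup;

    LoadingZipCache(Function<String, double[]> lookup) {
        this.lookup = lookup;
    }

    // Returns {lat, lon} for the zip, loading it on first use.
    double[] get(String zip) {
        // On ConcurrentHashMap, computeIfAbsent is atomic per key:
        // the lookup runs at most once even if two threads ask at the same time.
        double[] stored = cache.computeIfAbsent(zip, lookup);
        // Hand out a copy so callers cannot modify the cached array.
        return stored == null ? null : stored.clone();
    }
}
```

The lookup should be quick and must not touch the same map, since other updates to that map may block while it runs.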

The other thing I should mention is that if you are using getInstance() as a singleton accessor, you should have a private constructor to stop additional caches from being created with new InMemoryGeocodingCache().
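A stripped-down sketch of that shape, with an invented class name (SingletonGeocodingCache) and a double[] standing in for the value type:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical singleton skeleton; only the construction pattern matters here.
final class SingletonGeocodingCache {
    // static final: the one instance is created once during class initialization.
    private static final SingletonGeocodingCache INSTANCE = new SingletonGeocodingCache();

    // final: the map reference can never be swapped for another map.
    private final Map<String, double[]> cache = new HashMap<>();

    // Private constructor: code outside this class cannot call new.
    private SingletonGeocodingCache() {
    }

    static SingletonGeocodingCache getInstance() {
        return INSTANCE;
    }

    synchronized void add(String zip, double lat, double lon) {
        cache.put(zip, new double[] {lat, lon});
    }

    synchronized double[] get(String zip) {
        double[] stored = cache.get(zip);
        // Return a copy so callers cannot alter the cached entry.
        return stored == null ? null : stored.clone();
    }
}
```

Every caller of getInstance() now shares the same lock and the same map, which is what the synchronized methods rely on.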
