
Thread-safe cache of one object in Java

Let's say we have a CountryList object in our application that should return the list of countries. Loading the countries is a heavy operation, so the list should be cached.

Additional requirements:

  • CountryList should be thread-safe
  • CountryList should load lazily (only on demand)
  • CountryList should support invalidation of the cache
  • CountryList should be optimized on the assumption that the cache will be invalidated very rarely

I came up with the following solution:

public class CountryList {
    private static final Object ONE = new Integer(1);

    // MapMaker is from Google Collections Library    
    private Map<Object, List<String>> cache = new MapMaker()
        .initialCapacity(1)
        .makeComputingMap(
            new Function<Object, List<String>>() {
                @Override
                public List<String> apply(Object from) {
                    return loadCountryList();
                }
            });

    private List<String> loadCountryList() {
        // HEAVY OPERATION TO LOAD DATA
    }

    public List<String> list() {
        return cache.get(ONE);
    }

    public void invalidateCache() {
        cache.remove(ONE);
    }
}

What do you think about it? Do you see anything wrong with it? Is there another way to do it? How can I make it better? Should I look for a completely different solution in this case?

Thanks.

Google Collections actually supplies just the thing for this sort of problem: Supplier.

Your code would be something like:

private Supplier<List<String>> supplier = new Supplier<List<String>>(){
    public List<String> get(){
        return loadCountryList();
    }
};


// volatile reference so that changes are published correctly see invalidate()
private volatile Supplier<List<String>> memorized = Suppliers.memoize(supplier);


public List<String> list(){
    return memorized.get();
}

public void invalidate(){
    memorized = Suppliers.memoize(supplier);
}

Thank you all, especially the user "gid" who gave the idea.

My goal was to optimize the performance of the get() operation, considering that the invalidate() operation will be called very rarely.

I wrote a test class that starts 16 threads, each calling the get() operation one million times. With this class I profiled several implementations on my 2-core machine.
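The test class itself was not posted. A minimal sketch of such a harness might look like the following; GetBenchmark and its parameters are hypothetical names, it assumes Java 8+ for lambdas and java.util.function.Supplier, and it does no JIT warm-up, so its numbers are only a rough guide.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.function.Supplier;

// Hypothetical harness; not the original test class, which was not posted.
final class GetBenchmark {

    /** Starts `threads` workers that each call source.get() `calls` times; returns elapsed nanoseconds. */
    static long run(final Supplier<List<String>> source, int threads, final int calls) {
        final CountDownLatch ready = new CountDownLatch(threads);
        final CountDownLatch start = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            new Thread(() -> {
                ready.countDown();                  // signal that this worker is running
                try {
                    start.await();                  // wait so all workers begin together
                    for (int i = 0; i < calls; i++) {
                        source.get();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            }).start();
        }
        try {
            ready.await();                          // do not time thread start-up
            long begin = System.nanoTime();
            start.countDown();
            done.await();
            return System.nanoTime() - begin;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("benchmark interrupted", e);
        }
    }
}
```

A call such as GetBenchmark.run(countryList::list, 16, 1000000) would mirror the setup described above, with countryList standing in for whichever implementation is being measured.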

Testing results

Implementation              Time
no synchronisation          0.6 sec
normal synchronisation      7.5 sec
with MapMaker               26.3 sec
with Suppliers.memoize      8.2 sec
with optimized memoize      1.5 sec

1) "No synchronisation" is not thread-safe, but gives us the best performance to compare against.

@Override
public List<String> list() {
    if (cache == null) {
        cache = loadCountryList();
    }
    return cache;
}

@Override
public void invalidateCache() {
    cache = null;
}

2) "Normal synchronisation" - pretty good performance, the standard no-brainer implementation.

@Override
public synchronized List<String> list() {
    if (cache == null) {
        cache = loadCountryList();
    }
    return cache;
}

@Override
public synchronized void invalidateCache() {
    cache = null;
}

3) "with MapMaker" - very poor performance.

See my question at the top for the code.

4) "with Suppliers.memoize" - good performance. But since its performance is about the same as "Normal synchronisation", we need to either optimize it or just use "Normal synchronisation".

See the answer of the user "gid" for the code.

5) "with optimized memoize" - performance comparable to the "no synchronisation" implementation, but thread-safe. This is the one we need.

The cache class itself (the Supplier interface used here is from the Google Collections Library and has just one method, get(); see http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/base/Supplier.html):

public class LazyCache<T> implements Supplier<T> {
    private final Supplier<T> supplier;

    private volatile Supplier<T> cache;

    public LazyCache(Supplier<T> supplier) {
        this.supplier = supplier;
        reset();
    }

    private void reset() {
        cache = new MemoizingSupplier<T>(supplier);
    }

    @Override
    public T get() {
        return cache.get();
    }

    public void invalidate() {
        reset();
    }

    private static class MemoizingSupplier<T> implements Supplier<T> {
        final Supplier<T> delegate;
        volatile T value;

        MemoizingSupplier(Supplier<T> delegate) {
            this.delegate = delegate;
        }

        @Override
        public T get() {
            if (value == null) {
                synchronized (this) {
                    if (value == null) {
                        value = delegate.get();
                    }
                }
            }
            return value;
        }
    }
}

Example use:

public class BetterMemoizeCountryList implements ICountryList {

    LazyCache<List<String>> cache = new LazyCache<List<String>>(new Supplier<List<String>>(){
        @Override
        public List<String> get() {
            return loadCountryList();
        }
    });

    @Override
    public List<String> list(){
        return cache.get();
    }

    @Override
    public void invalidateCache(){
        cache.invalidate();
    }

    private List<String> loadCountryList() {
        // this should normally load a full list from the database,
        // but just for this instance we mock it with:
        return Arrays.asList("Germany", "Russia", "China");
    }
}
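For readers without Google Collections on the classpath, the same idea can probably be expressed with java.util.function.Supplier from Java 8+. The sketch below is an adaptation under that assumption, not the code that was benchmarked; ResettableLazy and Memo are hypothetical names. Like the original, it treats null as "not loaded yet", so a loader that returns null would be called again on every get().

```java
import java.util.function.Supplier;

// Hypothetical stdlib-only adaptation of the LazyCache idea above.
final class ResettableLazy<T> implements Supplier<T> {
    private final Supplier<T> loader;
    // volatile so that a fresh Memo installed by invalidate() is visible to all threads
    private volatile Memo<T> current;

    ResettableLazy(Supplier<T> loader) {
        this.loader = loader;
        this.current = new Memo<T>(loader);
    }

    @Override
    public T get() {
        return current.get();
    }

    /** Drops the cached value by swapping in an empty memo; the next get() reloads. */
    void invalidate() {
        current = new Memo<T>(loader);
    }

    private static final class Memo<T> {
        private final Supplier<T> delegate;
        private volatile T value;

        Memo(Supplier<T> delegate) {
            this.delegate = delegate;
        }

        T get() {
            T v = value;                 // one volatile read on the fast path
            if (v == null) {
                synchronized (this) {
                    v = value;           // re-check under the lock
                    if (v == null) {
                        v = delegate.get();
                        value = v;
                    }
                }
            }
            return v;
        }
    }
}
```

Because invalidate() only replaces a reference, readers never take a lock once the value is loaded, which matches the requirement that invalidation is rare and get() is hot.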

Whenever I need to cache something, I like to use the Proxy pattern. Doing it with this pattern offers separation of concerns. Your original object can be concerned with lazy loading. Your proxy (or guardian) object can be responsible for validation of the cache.

In detail:

  • Define a CountryList class that is thread-safe, preferably using synchronized blocks or other locks such as semaphores.
  • Extract this class's interface into a CountryQueryable interface.
  • Define another class, CountryListProxy, that implements CountryQueryable.
  • Only allow the CountryListProxy to be instantiated, and only allow it to be referenced through its interface.

From here, you can insert your cache invalidation strategy into the proxy object. Save the time of the last load, and upon the next request for the data, compare the current time to the cache time. Define a tolerance level: if too much time has passed, the data is reloaded.

As for Lazy Load, refer here.

Now for some good down-home sample code:

public interface CountryQueryable {

    public void operationA();
    public String operationB();

}

public class CountryList implements CountryQueryable {

    private boolean loaded;

    public CountryList() {
        loaded = false;
    }

    //This particular operation might be able to function without
    //the extra loading.
    @Override
    public void operationA() {
        //Do whatever.
    }

    //This operation may need to load the extra stuff.
    @Override
    public String operationB() {
        if (!loaded) {
            load();
            loaded = true;
        }

        //Do whatever.
        return whatever;
    }

    private void load() {
        //Do the loading of the Lazy load here.
    }

}

public class CountryListProxy implements CountryQueryable {

    //In accordance with the Proxy pattern, we hide the target
    //instance inside of our Proxy instance.
    private CountryQueryable actualList;
    //Keep track of the lazy time we cached.
    private long lastCached;

    //Define a tolerance time, 2000 milliseconds, before refreshing
    //the cache.
    private static final long TOLERANCE = 2000L;

    public CountryListProxy() {
        //You might even retrieve this object from a Registry.
        actualList = new CountryList();
        //Start at 0 so the first call refreshes. Long.MIN_VALUE would
        //overflow the subtraction below and never trigger a refresh.
        lastCached = 0L;
    }

    @Override
    public synchronized void operationA() {
        if ((System.currentTimeMillis() - lastCached) > TOLERANCE) {
            //Refresh the cache.
            lastCached = System.currentTimeMillis();
        } else {
            //Cache is okay.
        }
    }

    @Override
    public synchronized String operationB() {
        if ((System.currentTimeMillis() - lastCached) > TOLERANCE) {
            //Refresh the cache.
            lastCached = System.currentTimeMillis();
        } else {
            //Cache is okay.
        }

        return whatever;
    }

}

public class Client {

    public static void main(String[] args) {
        CountryQueryable queryable = new CountryListProxy();
        //Do your thing.
    }

}
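The time-based invalidation described in this answer can be sketched more concretely as follows. This is a simplified illustration rather than the answer's own code: ExpiringCountryCache is a hypothetical name, the clock is injected as a LongSupplier (an assumption made so the expiry logic can be exercised without sleeping), and Java 8+ is assumed.

```java
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch of a cache that reloads once a tolerance has elapsed.
final class ExpiringCountryCache {
    private final Supplier<List<String>> loader;
    private final LongSupplier clock;       // e.g. System::currentTimeMillis
    private final long toleranceMillis;

    private List<String> cached;            // guarded by this
    private long lastLoaded;                // guarded by this

    ExpiringCountryCache(Supplier<List<String>> loader, LongSupplier clock, long toleranceMillis) {
        this.loader = loader;
        this.clock = clock;
        this.toleranceMillis = toleranceMillis;
    }

    /** Returns the cached list, reloading it if nothing is cached or the tolerance has passed. */
    synchronized List<String> list() {
        long now = clock.getAsLong();
        if (cached == null || now - lastLoaded > toleranceMillis) {
            cached = loader.get();
            lastLoaded = now;
        }
        return cached;
    }
}
```

In production code the clock argument would typically be System::currentTimeMillis and the tolerance the 2000 milliseconds used above.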

I'm not sure what the map is for. When I need a lazy, cached object, I usually do it like this:

public class CountryList
{
  private static List<Country> countryList;

  public static synchronized List<Country> get()
  {
    if (countryList==null)
      countryList=load();
    return countryList;
  }
  private static List<Country> load()
  {
    ... whatever ...
  }
  public static synchronized void forget()
  {
    countryList=null;
  }
}

I think this is similar to what you're doing but a little simpler. If you need the map and the ONE for something that you've simplified away for the question, okay.

If you want it thread-safe, you should synchronize the get and the forget.

What do you think about it? Do you see something bad about it?

Bleah - you are using a complex data structure, MapMaker, with several features (map access, concurrency-friendly access, deferred construction of values, etc.) because of a single feature you are after (deferred creation of a single construction-expensive object).

While reusing code is a good goal, this approach adds additional overhead and complexity. In addition, it misleads future maintainers: when they see a map data structure, they will think there is a map of keys and values in there when there is really only one thing (the list of countries). Simplicity, readability, and clarity are key to future maintainability.

Is there other way to do it? How can i make it better? Should i look for totally another solution in this cases?

Seems like you are after lazy loading. Look at solutions to other SO lazy-loading questions. For example, this one covers the classic double-check approach (make sure you are using Java 1.5 or later):

How to solve the "Double-Checked Locking is Broken" Declaration in Java?

Rather than simply repeat the solution code here, I think it is useful to read the discussion about lazy loading via double-check there to grow your knowledge base. (Sorry if that comes off as pompous - just trying to teach to fish rather than feed blah blah blah ...)

There is a library out there (from Atlassian) with a util class called LazyReference. LazyReference is a reference to an object that can be lazily created (on first get). It is guaranteed to be thread-safe, and the init is also guaranteed to occur only once: if two threads call get() at the same time, one thread will compute and the other thread will block and wait.

See the sample code:

final LazyReference<MyObject> ref = new LazyReference<MyObject>() {
    protected MyObject create() throws Exception {
        // Do some useful object construction here
        return new MyObject();
    }
};

//thread1
MyObject myObject = ref.get();
//thread2
MyObject myObject = ref.get();

Your needs seem pretty simple here. The use of MapMaker makes the implementation more complicated than it has to be. The whole double-checked locking idiom is tricky to get right, and only works on Java 1.5+. And to be honest, it's breaking one of the most important rules of programming:

Premature optimization is the root of all evil.

The double-checked locking idiom tries to avoid the cost of synchronization in the case where the cache is already loaded. But is that overhead really causing problems? Is it worth the cost of more complex code? I say assume it is not until profiling tells you otherwise.

Here's a very simple solution that requires no third-party code (ignoring the JCIP annotation). It assumes that an empty list means the cache hasn't been loaded yet. It also prevents the country list from escaping to client code that could modify it. If this is not a concern for you, you could remove the call to Collections.unmodifiableList().

public class CountryList {

    @GuardedBy("cache")
    private final List<String> cache = new ArrayList<String>();

    private List<String> loadCountryList() {
        // HEAVY OPERATION TO LOAD DATA
    }

    public List<String> list() {
        synchronized (cache) {
            if( cache.isEmpty() ) {
                cache.addAll(loadCountryList());
            }
            return Collections.unmodifiableList(cache);
        }
    }

    public void invalidateCache() {
        synchronized (cache) {
            cache.clear();
        }
    }

}

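This looks fine to me (I assume MapMaker is from Google Collections?). Ideally you wouldn't need a Map, since you don't really have keys, but because the implementation is hidden from all callers I don't see this as a big deal.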

This is way too simple to need the ComputingMap stuff. You only need a dead simple implementation where all methods are synchronized, and you should be fine. This will obviously block the first thread hitting it (getting it), and any other thread hitting it while the first thread loads the cache, but after that all threads should go through nicely. The same happens again if anyone calls invalidateCache; there you should also decide whether invalidateCache should load the cache anew, or just null it out and let the first attempt at getting it block again.
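The two invalidation choices mentioned above can be sketched side by side. This is only an illustration under stated assumptions: SynchronizedCountryList and reloadCache() are hypothetical names, the loader is passed in as a java.util.function.Supplier (Java 8+), and every method simply synchronizes on the instance.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of the "all methods synchronized" approach.
final class SynchronizedCountryList {
    private final Supplier<List<String>> loader;
    private List<String> cache;                      // guarded by this

    SynchronizedCountryList(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    /** Loads on first use; callers block while the first load is in progress. */
    synchronized List<String> list() {
        if (cache == null) {
            cache = loader.get();
        }
        return cache;
    }

    /** Choice 1: null the cache out; the next list() call pays for the reload. */
    synchronized void invalidateCache() {
        cache = null;
    }

    /** Choice 2: reload immediately; the invalidating thread pays for the reload. */
    synchronized void reloadCache() {
        cache = loader.get();
    }
}
```

Which variant fits better depends on who should absorb the loading delay: the next reader, or the caller that triggers the invalidation.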

Use the Initialization on demand holder idiom:

public class CountryList {
  private CountryList() {}

  private static class CountryListHolder {
    // List is an interface, so a concrete class such as java.util.ArrayList is needed here
    static final List<Country> INSTANCE = new ArrayList<Country>();
  }

  public static List<Country> getInstance() {
    return CountryListHolder.INSTANCE;
  }

  ...
}
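A self-contained variant of the idiom, with the load actually happening inside the holder, might look like the sketch below. HolderCountryList, the LOADS counter (instrumentation only) and the hard-coded country names are assumptions for illustration. Note that a static final field is initialized only once per class load, so this sketch offers no hook for the invalidation requirement in the question.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the initialization-on-demand holder idiom.
final class HolderCountryList {
    // Instrumentation only: counts how often load() has run.
    static final AtomicInteger LOADS = new AtomicInteger();

    private HolderCountryList() {}

    // The JVM initializes Holder lazily and thread-safely on first access.
    private static final class Holder {
        static final List<String> INSTANCE = load();
    }

    static List<String> getInstance() {
        return Holder.INSTANCE;
    }

    private static List<String> load() {
        LOADS.incrementAndGet();
        // Stand-in for the heavy operation.
        return Collections.unmodifiableList(Arrays.asList("Germany", "Russia", "China"));
    }
}
```

The laziness and thread safety come from the JVM's class initialization rules rather than from any explicit locking in the code.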

Follow-up to Mike's solution above. My comment didn't format as expected... :(

Watch out for synchronization issues in operationB, especially since load() is slow:

public String operationB() {
    if (!loaded) {
        load();
        loaded = true;
    }

    //Do whatever.
    return whatever;
}

You could fix it this way:

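public String operationB() {
    //loaded is a primitive boolean and cannot serve as a lock,
    //so lock on the enclosing instance instead.
    synchronized (this) {
        if (!loaded) {
            load();
            loaded = true;
        }
    }

    //Do whatever.
    return whatever;
}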

Make sure you ALWAYS synchronize on every access to the loaded variable.
