
Object cache for C#

I'm building a document viewer for a certain document format. To keep things simple, let's say it's a PDF viewer, a desktop application. One requirement for the software is rendering speed, so right now I'm caching the images for the next pages while the user scrolls through the document.

This works: the UI is very responsive and the application seems to render pages almost instantly... at a cost: memory usage sometimes climbs to 600 MB, because I cache everything in memory.

Now, I know I can cache to disk, but doing that all the time is noticeably slower. What I would like is to implement some cache (LRU?) where a few of the cached pages (image objects) are in memory and most of them are on disk.

Before I embark on this, is there something in the framework, or some library out there, that will do this for me? It seems like a common enough problem. (This is a desktop application, not ASP.NET.)

Alternatively, do you have other ideas for this problem?

I wrote an LRU Cache and some test cases, feel free to use it.

You can read through the source on my blog.

For the lazy (here it is minus the test cases):

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace LRUCache {
    // A linked list paired with a dictionary index, so Remove(value)
    // is O(1) instead of the O(n) scan LinkedList<T>.Remove would need.
    public class IndexedLinkedList<T> {

        LinkedList<T> data = new LinkedList<T>();
        Dictionary<T, LinkedListNode<T>> index = new Dictionary<T, LinkedListNode<T>>();

        public void Add(T value) {
            index[value] = data.AddLast(value);
        }

        public void RemoveFirst() {
            index.Remove(data.First.Value);
            data.RemoveFirst();
        }

        public void Remove(T value) {
            LinkedListNode<T> node;
            if (index.TryGetValue(value, out node)) {
                data.Remove(node);
                index.Remove(value);
            }
        }

        public int Count {
            get {
                return data.Count;
            }
        }

        public void Clear() {
            data.Clear();
            index.Clear();
        }

        public T First {
            get {
                return data.First.Value;
            }
        }
    }
}

LRUCache

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace LRUCache {
    public class LRUCache<TKey, TValue> : IDictionary<TKey, TValue> {

        // Note: this implementation is not thread-safe; wrap calls in a
        // lock if the cache is shared across threads.
        Dictionary<TKey, TValue> data;
        IndexedLinkedList<TKey> lruList = new IndexedLinkedList<TKey>();
        ICollection<KeyValuePair<TKey, TValue>> dataAsCollection;
        int capacity;

        public LRUCache(int capacity) {

            if (capacity <= 0) {
                throw new ArgumentException("capacity should always be bigger than 0");
            }

            data = new Dictionary<TKey, TValue>(capacity);
            dataAsCollection = data;
            this.capacity = capacity;
        }

        public void Add(TKey key, TValue value) {
            if (!ContainsKey(key)) {
                this[key] = value;
            } else {
                throw new ArgumentException("An attempt was made to insert a duplicate key in the cache.");
            }
        }

        public bool ContainsKey(TKey key) {
            return data.ContainsKey(key);
        }

        public ICollection<TKey> Keys {
            get {
                return data.Keys;
            }
        }

        public bool Remove(TKey key) {
            bool existed = data.Remove(key);
            lruList.Remove(key);
            return existed;
        }

        public bool TryGetValue(TKey key, out TValue value) {
            // Touch the key so successful lookups count as "recently used".
            if (data.TryGetValue(key, out value)) {
                lruList.Remove(key);
                lruList.Add(key);
                return true;
            }
            return false;
        }

        public ICollection<TValue> Values {
            get { return data.Values; }
        }

        public TValue this[TKey key] {
            get {
                var value = data[key];
                lruList.Remove(key);
                lruList.Add(key);
                return value;
            }
            set {
                data[key] = value;
                lruList.Remove(key);
                lruList.Add(key);

                if (data.Count > capacity) {
                    data.Remove(lruList.First);
                    lruList.RemoveFirst();
                }
            }
        }

        public void Add(KeyValuePair<TKey, TValue> item) {
            Add(item.Key, item.Value);
        }

        public void Clear() {
            data.Clear();
            lruList.Clear();
        }

        public bool Contains(KeyValuePair<TKey, TValue> item) {
            return dataAsCollection.Contains(item);
        }

        public void CopyTo(KeyValuePair<TKey, TValue>[] array, int arrayIndex) {
            dataAsCollection.CopyTo(array, arrayIndex);
        }

        public int Count {
            get { return data.Count; }
        }

        public bool IsReadOnly {
            get { return false; }
        }

        public bool Remove(KeyValuePair<TKey, TValue> item) {

            bool removed = dataAsCollection.Remove(item);
            if (removed) {
                lruList.Remove(item.Key);
            }
            return removed;
        }


        public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator() {
            return dataAsCollection.GetEnumerator();
        }


        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
            return ((System.Collections.IEnumerable)data).GetEnumerator();
        }

    }
}
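For reference, here is a quick usage sketch (assuming the LRUCache class above is compiled into the same project; the page strings are just placeholders). With a capacity of 2, inserting a third entry evicts the least recently used key:

```csharp
using System;
using LRUCache;

class Demo {
    static void Main() {
        // Capacity of 2: adding a third entry evicts the least recently used one.
        var cache = new LRUCache<int, string>(2);
        cache[1] = "page one";
        cache[2] = "page two";

        // Reading key 1 marks it as recently used.
        Console.WriteLine(cache[1]);              // page one

        // Key 2 is now the least recently used, so it gets evicted.
        cache[3] = "page three";
        Console.WriteLine(cache.ContainsKey(2));  // False
        Console.WriteLine(cache.ContainsKey(1));  // True
    }
}
```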

For .NET 4.0 and later, you can also use the MemoryCache class from System.Runtime.Caching.

http://msdn.microsoft.com/en-us/library/system.runtime.caching.aspx
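A minimal sketch of that route (the key name, the byte[] payload, and the 10-minute sliding expiration are arbitrary choices for illustration):

```csharp
using System;
using System.Runtime.Caching;

class PageCacheDemo {
    static void Main() {
        // MemoryCache.Default is a process-wide cache instance.
        ObjectCache cache = MemoryCache.Default;

        var policy = new CacheItemPolicy {
            // Evict an entry if it hasn't been touched for 10 minutes.
            SlidingExpiration = TimeSpan.FromMinutes(10)
        };

        // Cache the rendered page image under a string key.
        cache.Set("page:42", RenderPageImage(42), policy);

        // Get returns null when the entry is missing or was evicted.
        var image = cache.Get("page:42") as byte[];
        Console.WriteLine(image != null);  // True
    }

    // Stand-in for the real page renderer.
    static byte[] RenderPageImage(int pageNumber) {
        return new byte[] { 1, 2, 3 };
    }
}
```

MemoryCache also trims entries under memory pressure, and its limits can be configured (for example via cacheMemoryLimitMegabytes in app.config), which maps well onto the "cap the memory used for cached pages" goal.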

There's the patterns & practices Enterprise Library (more specifically, its Caching Application Block), but IMO it tends to be over-engineered and overly complex.

The .NET Framework has always had the ability to keep weak references to objects.

Basically, weak references are references to objects that the runtime considers "unimportant": the referenced object may be reclaimed by a garbage collection run at any point in time. This can be used, for example, to cache things, but you'd have no control over what gets collected and what doesn't.

On the other hand, it's very simple to use and it may just be what you need.
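A minimal sketch of the idea (the WeakCache name and its Put/Get API are made up for illustration; the class simply wraps a dictionary of WeakReference objects):

```csharp
using System;
using System.Collections.Generic;

// A cache whose entries the garbage collector may reclaim at any time.
class WeakCache<TKey, TValue> where TValue : class {
    readonly Dictionary<TKey, WeakReference> refs =
        new Dictionary<TKey, WeakReference>();

    public void Put(TKey key, TValue value) {
        refs[key] = new WeakReference(value);
    }

    public TValue Get(TKey key) {
        WeakReference wr;
        if (refs.TryGetValue(key, out wr)) {
            // Target becomes null once the GC has collected the object.
            return wr.Target as TValue;
        }
        return null;
    }
}
```

While something else still holds a strong reference to a page image, Get returns it for free; once the GC has collected it, Get returns null and the caller simply re-renders the page.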

Dave

A classic trade-off situation. Keeping everything in memory is fast at the cost of massively increased memory consumption, while retrieving from disk reduces memory consumption but isn't as performant. However, you already know all this!

The built-in System.Web.Caching.Cache class is great, and I've used it to good effect many times in my own ASP.NET applications (mostly for caching database records). The drawback is that the cache runs on only one machine (typically a single web server) and cannot be distributed across multiple machines.

If it's possible to "throw some hardware" at the problem, and it doesn't necessarily need to be expensive hardware, just boxes with plenty of memory, you could always go with a distributed caching solution. This will give you much more memory to play with whilst retaining (nearly) the same level of performance.

Some options for a distributed caching solution for .NET are:

Memcached.NET

indeXus.Net

or even Microsoft's own Velocity project.

How are you implementing your cache?

You can use the Cache class from System.Web.Caching , even in non-web applications, and it will purge items on an LRU basis if/when it needs the memory.

In a non-web application you'll need to use HttpRuntime.Cache to access the Cache instance.

Note that the documentation states that the Cache class isn't intended to be used outside of ASP.NET, although it's always worked for me. (I've never relied on it in any mission-critical app though.)
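For example, a sketch of that approach (it requires a reference to the System.Web assembly even in a desktop project; the key and the byte[] payload are placeholders):

```csharp
using System;
using System.Web;
using System.Web.Caching;

class HttpCacheDemo {
    static void Main() {
        Cache cache = HttpRuntime.Cache;

        // Insert with a sliding expiration; lower-priority items are
        // purged first when the cache needs to reclaim memory.
        cache.Insert(
            "page:7",                       // key
            new byte[] { 1, 2, 3 },         // cached value (e.g. a rendered page)
            null,                           // no cache dependency
            Cache.NoAbsoluteExpiration,     // no fixed expiry time
            TimeSpan.FromMinutes(5),        // sliding expiration
            CacheItemPriority.BelowNormal,  // evict these before normal items
            null);                          // no removal callback

        var hit = cache["page:7"] as byte[];
        Console.WriteLine(hit != null);  // True
    }
}
```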

In addition to rsbarro's answer suggesting MemoryCache, I recommend using PostSharp AOP as described in

http://www.codeproject.com/Articles/493971/Leveraging-MemoryCache-and-AOP-for-expensive-calls

There is an efficient, open-source RAM virtualizer that uses an MRU algorithm to keep the most recently referenced objects in memory and uses a fast, lightweight backing store on disk for "paging".

Here is a link to a Code Project mini-article about it: http://www.codeproject.com/Tips/827339/Virtual-Cache

I hope you find it useful.

The Caching Application Block and the ASP.NET cache are both options; however, although they do LRU, the only kind of disk utilization they get is through OS memory paging. I think there are ways to optimize this that are more specific to your goal and will give better results. Here are some thoughts:

  • Since it's an ASP.NET app, why not generate the images, write them to disk, and have IIS serve them up when the browser requests the next page? That keeps your ASP.NET worker process memory lower while letting IIS do what it's good at.
  • Use statistics about how a user interacts. LRU as implemented in the ASP.NET cache will typically apply to the individual image files - not much use for this scenario by the sounds of it. Instead, maybe some logic like: "If the user has scrolled X pages in the last Y seconds, then generate the next X*Y images". Users scrolling quickly will get more pages generated; users reading slowly will need fewer cached.
  • Have IIS serve images from disk, and use a custom HTTP handler for images you really want to control the caching of.
  • Have the browser request the files ahead of time too, and rely on browser caching.
  • Once an image has been served to the client, is it fair to say it can pretty much be removed from the cache? That could reduce the footprint substantially.
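The scroll-speed heuristic from the second bullet could be as simple as the sketch below (the function name, the multiplier, and the floor/cap values are all arbitrary choices, not anything from the original answer):

```csharp
using System;

static class Prefetch {
    // Given how many pages the user scrolled in the last measurement
    // window, decide how many pages to pre-render: fast readers get a
    // deeper prefetch, slow readers a shallow one.
    public static int PagesToPrerender(int pagesScrolledInWindow) {
        const int minPrefetch = 2;   // always keep a couple of pages ready
        const int maxPrefetch = 20;  // cap memory spent on prefetching
        int wanted = pagesScrolledInWindow * 2;
        return Math.Max(minPrefetch, Math.Min(maxPrefetch, wanted));
    }
}
```

The viewer would call this on a timer, compare the result against what is already cached, and queue rendering jobs for the difference.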

I'd certainly avoid using a plain hash table though.
