I'm building a document viewer for a document format. To keep things simple, let's say it is a PDF viewer, a desktop application. One requirement for the software is rendering speed, so right now I cache the images for the next pages while the user is scrolling through the document.
This works: the UI is very responsive and the application seems to render pages almost instantly. The cost is memory usage, which sometimes reaches 600MB, because I cache everything in memory.
Now, I know I can cache to disk, but doing that all the time is noticeably slower. What I would like to do is implement some kind of cache (LRU?) where some of the cached pages (image objects) are in memory and most of them are on disk.
Before I embark on this, is there something in the framework or some library out there that will do this for me? It seems like a common enough problem. (This is a desktop application, not ASP.NET.)
Alternatively, do you have other ideas for this problem?
I wrote an LRU cache and some test cases; feel free to use it.
You can read through the source on my blog.
For the lazy, here it is, minus the test cases:
using System;
using System.Collections.Generic;

namespace LRUCache {
    // A linked list with a dictionary index, so any value can be removed in O(1).
    // Values are expected to be unique.
    public class IndexedLinkedList<T> {
        LinkedList<T> data = new LinkedList<T>();
        Dictionary<T, LinkedListNode<T>> index = new Dictionary<T, LinkedListNode<T>>();

        // Appends the value to the end of the list.
        public void Add(T value) {
            // Drop any existing node for this value so the list and the index stay in sync.
            Remove(value);
            index[value] = data.AddLast(value);
        }

        // Removes the value at the front of the list.
        public void RemoveFirst() {
            index.Remove(data.First.Value);
            data.RemoveFirst();
        }

        // Removes the value if present; does nothing otherwise.
        public void Remove(T value) {
            LinkedListNode<T> node;
            if (index.TryGetValue(value, out node)) {
                data.Remove(node);
                index.Remove(value);
            }
        }

        public int Count {
            get {
                return data.Count;
            }
        }

        public void Clear() {
            data.Clear();
            index.Clear();
        }

        public T First {
            get {
                return data.First.Value;
            }
        }
    }
}
LRUCache
using System;
using System.Collections.Generic;

namespace LRUCache {
    // A fixed-capacity dictionary that evicts the least recently used entry.
    // Not thread-safe: callers must synchronize access themselves.
    public class LRUCache<TKey, TValue> : IDictionary<TKey, TValue> {
        Dictionary<TKey, TValue> data;
        // Keys ordered from least recently used (first) to most recently used (last).
        IndexedLinkedList<TKey> lruList = new IndexedLinkedList<TKey>();
        ICollection<KeyValuePair<TKey, TValue>> dataAsCollection;
        int capacity;

        public LRUCache(int capacity) {
            if (capacity <= 0) {
                throw new ArgumentException("capacity should always be bigger than 0");
            }
            data = new Dictionary<TKey, TValue>(capacity);
            dataAsCollection = data;
            this.capacity = capacity;
        }

        // Moves the key to the most-recently-used end of the list.
        void Touch(TKey key) {
            lruList.Remove(key);
            lruList.Add(key);
        }

        public void Add(TKey key, TValue value) {
            if (!ContainsKey(key)) {
                this[key] = value;
            } else {
                throw new ArgumentException("An attempt was made to insert a duplicate key in the cache.");
            }
        }

        public bool ContainsKey(TKey key) {
            return data.ContainsKey(key);
        }

        public ICollection<TKey> Keys {
            get {
                return data.Keys;
            }
        }

        public bool Remove(TKey key) {
            bool existed = data.Remove(key);
            lruList.Remove(key);
            return existed;
        }

        public bool TryGetValue(TKey key, out TValue value) {
            bool found = data.TryGetValue(key, out value);
            if (found) {
                Touch(key); // a successful lookup counts as a use
            }
            return found;
        }

        public ICollection<TValue> Values {
            get { return data.Values; }
        }

        public TValue this[TKey key] {
            get {
                var value = data[key];
                Touch(key);
                return value;
            }
            set {
                data[key] = value;
                Touch(key);
                if (data.Count > capacity) {
                    // Evict the least recently used entry.
                    data.Remove(lruList.First);
                    lruList.RemoveFirst();
                }
            }
        }

        public void Add(KeyValuePair<TKey, TValue> item) {
            Add(item.Key, item.Value);
        }

        public void Clear() {
            data.Clear();
            lruList.Clear();
        }

        public bool Contains(KeyValuePair<TKey, TValue> item) {
            return dataAsCollection.Contains(item);
        }

        public void CopyTo(KeyValuePair<TKey, TValue>[] array, int arrayIndex) {
            dataAsCollection.CopyTo(array, arrayIndex);
        }

        public int Count {
            get { return data.Count; }
        }

        public bool IsReadOnly {
            get { return false; }
        }

        public bool Remove(KeyValuePair<TKey, TValue> item) {
            bool removed = dataAsCollection.Remove(item);
            if (removed) {
                lruList.Remove(item.Key);
            }
            return removed;
        }

        public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator() {
            return dataAsCollection.GetEnumerator();
        }

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
            return ((System.Collections.IEnumerable)data).GetEnumerator();
        }
    }
}
For .NET 4.0, you can also use the MemoryCache class from System.Runtime.Caching.
http://msdn.microsoft.com/en-us/library/system.runtime.caching.aspx
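As a rough illustration of how MemoryCache might be used for rendered pages, here is a minimal sketch. The PageImageCache wrapper, the "page:" key prefix, the byte[] page representation and the render delegate are all assumptions made up for this example; only MemoryCache and CacheItemPolicy come from System.Runtime.Caching. It assumes the project references the System.Runtime.Caching assembly.

```csharp
using System;
using System.Runtime.Caching;

// Hypothetical wrapper for illustration; only MemoryCache and CacheItemPolicy
// are real framework types.
public sealed class PageImageCache : IDisposable
{
    private readonly MemoryCache cache = new MemoryCache("PageImages");
    private readonly TimeSpan idleTimeout;

    public PageImageCache(TimeSpan idleTimeout)
    {
        this.idleTimeout = idleTimeout;
    }

    // Returns the cached page, rendering it through the supplied delegate on a miss.
    public byte[] GetOrRender(int pageNumber, Func<int, byte[]> render)
    {
        string key = "page:" + pageNumber;
        var cached = cache.Get(key) as byte[];
        if (cached != null)
        {
            return cached;
        }

        byte[] rendered = render(pageNumber);
        // Sliding expiration: the entry is dropped once it has gone unread for idleTimeout.
        cache.Set(key, rendered, new CacheItemPolicy { SlidingExpiration = idleTimeout });
        return rendered;
    }

    public void Dispose()
    {
        cache.Dispose();
    }
}
```

Note that MemoryCache keeps entries in memory only; it can drop them on expiration or under memory pressure, but it does not spill them to disk.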
There's the patterns & practices Enterprise Library (more specifically, the Caching Application Block), but IMO it tends to be over-engineered and overly complex.
The .NET Framework has always had the ability to keep weak references to objects.
Basically, a weak reference does not keep its target alive: once no strong references to the object remain, the garbage collector may reclaim it at any point in time. This can be used, for example, to cache things, but you'd have no control over what gets collected and what doesn't.
On the other hand, it's very simple to use and it may just be what you need.
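A minimal sketch of what a weak-reference cache could look like follows. The WeakCache class and its Put/TryGet methods are hypothetical names invented for this example, and it assumes .NET 4.5 or later for the generic WeakReference<T>; on older versions the non-generic WeakReference would play the same role.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration: a cache whose values the GC may reclaim at any time.
public sealed class WeakCache<TKey, TValue> where TValue : class
{
    private readonly Dictionary<TKey, WeakReference<TValue>> entries =
        new Dictionary<TKey, WeakReference<TValue>>();

    public void Put(TKey key, TValue value)
    {
        entries[key] = new WeakReference<TValue>(value);
    }

    // Returns false when the key was never stored or its value has been collected.
    public bool TryGet(TKey key, out TValue value)
    {
        WeakReference<TValue> reference;
        if (entries.TryGetValue(key, out reference) && reference.TryGetTarget(out value))
        {
            return true;
        }
        entries.Remove(key); // drop the dead entry, if any
        value = null;
        return false;
    }
}
```

A caller has to treat every lookup as a possible miss and re-render the page when TryGet returns false, since nothing guarantees how long an entry survives.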
Dave
A classic trade-off situation. Keeping everything in memory will be fast at the cost of massively increased memory consumption, whilst retrieving from disc decreases memory consumption, but isn't as performant. However, you already know all this!
The built-in System.Web.Caching.Cache class is great, and I've used it to good effect many times myself in my ASP.NET applications (although mostly for database record caching), however, the drawback is that the cache will only run on one machine (typically a sole web server) and cannot be distributed across multiple machines.
If it's possible to "throw some hardware" at the problem, and it doesn't necessarily need to be expensive hardware, just boxes with plenty of memory, you could always go with a distributed caching solution. This will give you much more memory to play with whilst retaining (nearly) the same level of performance.
Some options for a distributed caching solution for .NET are:
or even Microsoft's own Velocity project.
How are you implementing your cache?
You can use the Cache class from System.Web.Caching, even in non-web applications, and it will purge items on an LRU basis if/when it needs the memory.
In a non-web application you'll need to use HttpRuntime.Cache to access the Cache instance.
Note that the documentation states that the Cache class isn't intended to be used outside of ASP.NET, although it's always worked for me. (I've never relied on it in any mission-critical app though.)
In addition to rsbarro's answer recommending MemoryCache, I recommend using PostSharp AOP, as described in
http://www.codeproject.com/Articles/493971/Leveraging-MemoryCache-and-AOP-for-expensive-calls
There is an efficient, open-source RAM virtualizer that uses an MRU algorithm to keep the most recently referenced objects in memory and a fast, lightweight backing store (on disk) for "paging".
Here is a link to a mini-article about it on Code Project: http://www.codeproject.com/Tips/827339/Virtual-Cache
I hope you find it useful.
The Caching Application Block and the ASP.NET cache are both options. However, although they do LRU eviction, the only kind of disk utilization that happens is through memory paging. I think there are ways to optimize this that are more specific to your goal and give a better result. Here are some thoughts:
I'd certainly avoid using a plain hash table though.
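One way to make the disk tier explicit, rather than relying on memory paging, is a two-tier cache: a small in-memory LRU that spills evicted pages to files and promotes them back on access. The sketch below is only an illustration under assumptions. TwoTierPageCache, the byte[] page representation and the page-N.bin file naming are invented for this example, and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical two-tier cache for illustration: hot pages in memory, evicted pages on disk.
public sealed class TwoTierPageCache
{
    private readonly int memoryCapacity;
    private readonly string directory;
    // Most recently used pages sit at the front of the list.
    private readonly LinkedList<KeyValuePair<int, byte[]>> order =
        new LinkedList<KeyValuePair<int, byte[]>>();
    private readonly Dictionary<int, LinkedListNode<KeyValuePair<int, byte[]>>> inMemory =
        new Dictionary<int, LinkedListNode<KeyValuePair<int, byte[]>>>();

    public TwoTierPageCache(int memoryCapacity, string directory)
    {
        if (memoryCapacity <= 0) throw new ArgumentOutOfRangeException("memoryCapacity");
        this.memoryCapacity = memoryCapacity;
        this.directory = directory;
        Directory.CreateDirectory(directory);
    }

    public int InMemoryCount { get { return inMemory.Count; } }

    public void Put(int page, byte[] image)
    {
        LinkedListNode<KeyValuePair<int, byte[]>> node;
        if (inMemory.TryGetValue(page, out node))
        {
            order.Remove(node);
        }
        inMemory[page] = order.AddFirst(new KeyValuePair<int, byte[]>(page, image));
        if (inMemory.Count > memoryCapacity)
        {
            // Spill the least recently used page to disk instead of discarding it.
            var oldest = order.Last.Value;
            File.WriteAllBytes(PathFor(oldest.Key), oldest.Value);
            order.RemoveLast();
            inMemory.Remove(oldest.Key);
        }
    }

    public bool TryGet(int page, out byte[] image)
    {
        LinkedListNode<KeyValuePair<int, byte[]>> node;
        if (inMemory.TryGetValue(page, out node))
        {
            image = node.Value.Value;
            order.Remove(node);
            order.AddFirst(node); // mark as most recently used
            return true;
        }
        string path = PathFor(page);
        if (File.Exists(path))
        {
            image = File.ReadAllBytes(path);
            Put(page, image); // promote back into memory
            return true;
        }
        image = null;
        return false;
    }

    private string PathFor(int page)
    {
        return Path.Combine(directory, "page-" + page + ".bin");
    }
}
```

In a real viewer the disk writes would probably belong on a background thread, and the files would need cleaning up when the document is closed; both are left out here to keep the sketch short.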