Memory usage of Dictionaries in C#

I have some code to which I added a nested dictionary of the following type:

Dictionary<string, Dictionary<string, Dictionary<string, float>>>

After doing so I noticed that the memory usage of my application shot up significantly. These dictionaries are keyed on strings that are often repeated, and there are many of these dictionaries, on the order of tens of thousands.

To address this problem I hypothesized that the repeated strings were eating up a significant amount of memory. My solution was to hash the strings and use an integer instead (I would keep one copy of a rainbow table so I could reverse the hash when necessary):

Dictionary<int, Dictionary<int, Dictionary<int, float>>>

So I went to a memory profiler to see what kind of size reduction I could get. To my shock, I found that the string-keyed storage was actually smaller in size (both normal and inclusive).

This doesn't make intuitive sense to me. Even if the compiler were smart enough to store only one copy of each string and use a reference, I would think that reference would be a pointer, which is double the size of an int. I also didn't use any String.Intern methods, so I don't know how this would have been accomplished (also, is String.Intern the right method here?).

I'm very confused as to what's happening under the hood; any help would be appreciated.

If your keys and values are objects, there's approximately 20 bytes of overhead for each element of a dictionary, plus several more bytes per dictionary. This is in addition to the space consumed by the keys and values themselves. If you have value types as keys and values, then it's 12 bytes plus the space consumed by the key and value for each item in the dictionary. These figures apply when the number of elements equals the internal dictionary capacity. Typically there is more capacity than elements, so some space is wasted.
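The per-element figures above can be reproduced with a little arithmetic. The sketch below assumes the commonly described layout of Dictionary<TKey, TValue>: one 4-byte bucket index per slot, plus an entry holding a 4-byte hash code, a 4-byte next index, the key, and the value. The helper name EntryOverhead and the reference sizes passed to it are illustrative assumptions; padding and differences between runtime versions are ignored.

```csharp
using System;

// Hypothetical helper: bytes per dictionary slot, assuming a 4-byte bucket index
// plus an entry of { int hashCode; int next; TKey key; TValue value; }.
static int EntryOverhead(int keySize, int valueSize)
{
    const int bucketSlot = 4; // one int per slot in the buckets array
    const int hashCode = 4;   // cached hash code stored in the entry
    const int next = 4;       // index of the next entry in the collision chain
    return bucketSlot + hashCode + next + keySize + valueSize;
}

Console.WriteLine(EntryOverhead(4, 4)); // 20: reference key and value, 4-byte references
Console.WriteLine(EntryOverhead(8, 8)); // 28: reference key and value, 8-byte references
Console.WriteLine(EntryOverhead(0, 0)); // 12: fixed part only, before adding value-type key and value sizes
```

Under these assumptions the 20-byte figure corresponds to 4-byte references, so a 64-bit process would likely pay somewhat more per element.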

The wasted space will generally be a higher relative percentage if you have lots of dictionaries with a small number of elements than if you had one dictionary with many elements. If I go by your comment, your dictionaries with 8 elements will have a capacity of 11, those with 2 elements will have a capacity of 3, and those with 10 will have a capacity of 11.
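The capacities quoted above follow from the dictionary rounding a requested size up to a prime number. Here is a minimal sketch, assuming each dictionary is created with an initial capacity equal to its element count, and assuming the runtime's internal prime table begins 3, 7, 11, 17, 23. Both the table and the helper RoundUpCapacity are illustrative, not a public API. A dictionary grown one Add at a time may end up with a larger capacity than shown here.

```csharp
using System;

// Assumed prefix of the runtime's internal prime table (not a public API).
int[] primes = { 3, 7, 11, 17, 23, 29, 37 };

// Hypothetical helper: smallest table prime that is >= the requested size.
int RoundUpCapacity(int requested)
{
    foreach (int p in primes)
    {
        if (p >= requested) return p;
    }
    throw new ArgumentOutOfRangeException(nameof(requested), "beyond this sketch's table");
}

// Element counts taken from the answer: 8, 2, and 10.
foreach (int count in new[] { 8, 2, 10 })
{
    Console.WriteLine($"{count} elements -> capacity {RoundUpCapacity(count)}");
}
```

With the assumed table this prints capacities of 11, 3, and 11, matching the numbers in the answer.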

If I understand your nesting counts, then a single top-level dictionary will represent 184 dictionary elements. If we count unused capacity, it's closer to 200 as far as space consumption goes: 200 * 20 = 4000 bytes for each top-level dictionary. How many of those do you have? You say tens of thousands of them in thousands of objects. Every 10,000 is going to consume about 38 MB of dictionary overhead. Add to that the objects stored in the dictionary.
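The arithmetic behind these numbers can be written out as a short sketch. The nesting counts (8 outer entries, 2 per middle dictionary, 10 per inner dictionary) are an assumption inferred from the figures in the answer, and the variable names are illustrative.

```csharp
using System;
using System.Globalization;

// Assumed nesting counts, inferred from the figures in the answer.
const int outer = 8, middle = 2, inner = 10;

// Elements actually stored across all three levels of one top-level dictionary.
int elements = outer + outer * middle + outer * middle * inner;

// The answer's rounded slot count (including unused capacity) and per-slot overhead.
const int slotsWithSlack = 200;
const int bytesPerSlot = 20;
long bytesPerTopLevel = (long)slotsWithSlack * bytesPerSlot;

// Overhead for 10,000 top-level dictionaries, expressed in MiB (1024 * 1024 bytes).
double mib = 10_000 * bytesPerTopLevel / (1024.0 * 1024.0);

Console.WriteLine(elements);                                          // 184
Console.WriteLine(bytesPerTopLevel);                                  // 4000
Console.WriteLine(mib.ToString("F1", CultureInfo.InvariantCulture)); // 38.1
```

So the "about 38 MB" figure is 40,000,000 bytes expressed in units of 1024 * 1024 bytes.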

A possible explanation for why your attempt to make it smaller by managing the hash codes did not help is that there are not many duplicated references to your keys. Replacing an object reference key with an int key doesn't change the dictionary overhead amount, and you're adding the storage of your new collection of hash codes.
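Whether equal keys are separate string objects or references to one shared instance decides how much the strings themselves cost. The sketch below uses String.Intern (mentioned in the question) purely to illustrate that difference. It does not establish that interning is the right choice for this workload, and the key text is made up.

```csharp
using System;

// Build two equal strings at run time so they are not compile-time literals.
string a = new string(new[] { 'k', 'e', 'y' });
string b = new string(new[] { 'k', 'e', 'y' });

// Equal contents, but two separate objects on the heap.
Console.WriteLine(a == b);                       // True
Console.WriteLine(object.ReferenceEquals(a, b)); // False

// After interning, both lookups return the same shared instance.
string ia = string.Intern(a);
string ib = string.Intern(b);
Console.WriteLine(object.ReferenceEquals(ia, ib)); // True
```

If the keys already share instances like this, each dictionary stores only a reference per key, so switching to int keys has little left to save.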
