
Memory management in C#

Good afternoon,

I have some text files containing a list of (2-gram, count) pairs collected by analysing a corpus of newspaper articles which I need to load into memory when I start a given application I am developing. To store those pairs, I am using a structure like the following one:

private static Dictionary<String, Int64>[] ListaDigramas = new Dictionary<String, Int64>[27];

The idea of having an array of dictionaries is for efficiency reasons, since I read somewhere that a very large dictionary has a negative impact on performance. That said, every 2-gram goes into the dictionary that corresponds to its first character's ASCII code minus 97 (or 26 if the first character is not in the range from 'a' to 'z').
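The bucket rule described above could be sketched as follows. The class name `DigramBuckets` and its helper methods are made up for illustration, and sending an empty string to bucket 26 is an assumption, since the question does not say how that case is handled.

```csharp
using System;
using System.Collections.Generic;

static class DigramBuckets
{
    // 26 buckets for 'a'..'z' plus one catch-all bucket at index 26.
    public const int BucketCount = 27;

    // First character's code minus 97 ('a'), or 26 for anything outside 'a'..'z'.
    public static int BucketIndex(string digram)
    {
        // Assumption: null or empty keys go to the catch-all bucket.
        if (string.IsNullOrEmpty(digram)) return 26;
        char first = digram[0];
        return (first >= 'a' && first <= 'z') ? first - 'a' : 26;
    }

    // Allocates the array and one empty dictionary per bucket.
    public static Dictionary<string, long>[] CreateBuckets()
    {
        var buckets = new Dictionary<string, long>[BucketCount];
        for (int i = 0; i < buckets.Length; i++)
            buckets[i] = new Dictionary<string, long>();
        return buckets;
    }

    // Stores (or overwrites) the count for a 2-gram in its bucket.
    public static void Add(Dictionary<string, long>[] buckets, string digram, long count)
    {
        buckets[BucketIndex(digram)][digram] = count;
    }
}
```

Note that `Dictionary` lookups are hash-based, so a single large dictionary would probably perform about as well; measuring both is the only way to be sure.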

When I load the (2-gram, count) pairs into memory, the application takes about 800 MB of RAM overall, and stays like this until I use a program called Memory Cleaner to free up memory. After this, the memory taken by the program goes down to somewhere between 7 MB and 100 MB, without losing functionality (I think).

Is there any way I can free up memory this way but without using an external application? I tried to use GC.Collect() but it doesn't work in this case.

Thank you very much.

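You are using a static field, so chances are that once it is loaded it will never be garbage collected. Unless you call the .Clear() method on this dictionary, it probably won't be garbage collected.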
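A minimal sketch of what releasing the data could look like, assuming the application reaches a point where the 2-grams are no longer needed. The class `DigramStore` and its `Load`, `Release` and `IsLoaded` members are hypothetical; only the field name and shape come from the question.

```csharp
using System;
using System.Collections.Generic;

static class DigramStore
{
    // Same shape as the static field in the question.
    private static Dictionary<string, long>[] ListaDigramas;

    public static bool IsLoaded
    {
        get { return ListaDigramas != null; }
    }

    // Hypothetical loader: only allocates empty buckets here.
    public static void Load()
    {
        ListaDigramas = new Dictionary<string, long>[27];
        for (int i = 0; i < ListaDigramas.Length; i++)
            ListaDigramas[i] = new Dictionary<string, long>();
    }

    // Drops the only reference so the array and its dictionaries become unreachable.
    public static void Release()
    {
        ListaDigramas = null;

        // Forcing a collection is normally unnecessary; it is done here only
        // to make the effect observable right away.
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
```

Setting the field to null goes a step further than `.Clear()`: a cleared `Dictionary` keeps its internal arrays allocated, while an unreachable one can be reclaimed entirely. This only helps if the data really is no longer needed.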

It is fairly mysterious to me how utilities like that ever make it onto somebody's machine. All they do is call EmptyWorkingSet(). Maybe it looks good in Taskmgr.exe, but it is otherwise just a way to keep the hard drive busy unnecessarily. You'll get the exact same thing by minimizing the main window of your app.
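For reference, this is roughly what such a utility does, sketched as a P/Invoke call from inside the application itself. `EmptyWorkingSet` is the real Win32 function exported by psapi.dll; the wrapper class `WorkingSetTrimmer` is made up for illustration, and the call only makes sense on Windows. It trims the working set shown in Task Manager; it does not free any managed objects.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

static class WorkingSetTrimmer
{
    // Win32: BOOL EmptyWorkingSet(HANDLE hProcess);
    [DllImport("psapi.dll", SetLastError = true)]
    private static extern bool EmptyWorkingSet(IntPtr hProcess);

    // Asks Windows to remove as many pages as possible from this process's
    // working set. Returns false on other platforms or if the call fails.
    public static bool TrimCurrentProcess()
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
            return false;

        using (Process self = Process.GetCurrentProcess())
        {
            return EmptyWorkingSet(self.Handle);
        }
    }
}
```

The pages are faulted back in as soon as the dictionaries are touched again, so the lower number is cosmetic.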

I don't know the details of how Memory Cleaner works, but given that it's unlikely to know the inner workings of a program's memory allocations, the best it can probably do is cause pages to be swapped out to disk, reducing the apparent memory usage of the program.

Garbage collection won't help unless you actually have objects you aren't using any more. Because your dictionaries are reachable from a static field, the GC considers them in use, so all the objects in them are considered live and must stay in the program's memory. There's no way around this.

What you are seeing is the total usage of the application. This is 800MB and will stay that way. As the comments say, memory cleaner makes it look like the application uses less memory. What you can try to do is access all values in the dictionary after you've run the memory cleaner. You'll see that the memory usage goes up again (it's read from swap).
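One way to see the difference between what Task Manager reports and what the program actually holds is to print both figures before and after running the cleaner. This is a sketch; the class name `MemoryReport` is made up, while `Process.WorkingSet64`, `Process.PrivateMemorySize64` and `GC.GetTotalMemory` are standard .NET members.

```csharp
using System;
using System.Diagnostics;

static class MemoryReport
{
    private const long BytesPerMegabyte = 1024 * 1024;

    public static string Describe()
    {
        using (Process self = Process.GetCurrentProcess())
        {
            // Physical RAM currently mapped to the process; this is the
            // number that drops after the working set is trimmed.
            long workingSet = self.WorkingSet64;

            // Memory committed privately to the process, whether it is in RAM or paged out.
            long privateBytes = self.PrivateMemorySize64;

            // Bytes currently thought to be allocated on the managed heap.
            long managedHeap = GC.GetTotalMemory(false);

            return string.Format(
                "working set: {0} MB, private bytes: {1} MB, managed heap: {2} MB",
                workingSet / BytesPerMegabyte,
                privateBytes / BytesPerMegabyte,
                managedHeap / BytesPerMegabyte);
        }
    }
}
```

If only the working set shrinks while the managed heap figure stays where it was, the data is still allocated and has merely been paged out.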

What you probably want is to not load all this data into memory. Is there a way you can get the same results using an algorithm?

Alternatively, and this would probably be the best option if you are actually storing information here, you could use a database. If it's cumbersome to use a full database like SQL Server Express, you could always go for SQLite.

Thank you very much for all the answers. The data actually needs to be loaded during the whole running time of the application, so based on your answers I think there is nothing better to do... I could perhaps try an external database, but since I already need to deal with two other databases at the same time, I think it is not a good idea.

Do you think it is possible to deal with three databases at the same time without losing performance?

If you are disposing of your application's resources correctly, then the memory actually in use may not be what you are seeing (if you are checking through Task Manager).

The Garbage Collector will free up the unused memory at the best possible time. It usually isn't really a good idea to force collection either...see this post

"data actually needs to be loaded during the whole running time of the application" - why?

About the only other idea I could come up with, if you really want to keep your memory usage down, would be to store the dictionary in a stream and compress it. Factors to consider would be how often you're accessing/inflating this data, and how compressible the data is. Text from newspaper articles would compress extremely well, and the performance hit might be less than you'd think.

Using an open-source library like SharpZipLib ( http://www.icsharpcode.net/opensource/sharpziplib/ ), your code would look something like:

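// Needs System.IO, System.Runtime.Serialization.Formatters.Binary and ICSharpCode.SharpZipLib.Zip.Compression.Streams.
MemoryStream stream = new MemoryStream();
BinaryFormatter formatter = new BinaryFormatter();

formatter.Serialize(stream, ListaDigramas);
byte[] dictBytes = stream.ToArray();

// Keep a reference to the target stream so the compressed bytes can be read back.
MemoryStream compressed = new MemoryStream();
using (Stream zipStream = new DeflaterOutputStream(compressed))
{
    zipStream.Write(dictBytes, 0, dictBytes.Length);
}   // disposing the deflater writes out the remaining compressed data
byte[] zippedBytes = compressed.ToArray();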

Inflating requires an InflaterInputStream and a loop to inflate the stream in chunks, but is fairly straightforward.
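The chunked inflate loop could look something like the sketch below. To keep it self-contained it swaps in the framework's built-in `System.IO.Compression.DeflateStream` instead of SharpZipLib's `InflaterInputStream`; the loop has the same shape with either class. The class name `DeflateRoundTrip` and the 4096-byte buffer size are arbitrary choices for illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class DeflateRoundTrip
{
    // Compresses a byte array with raw DEFLATE.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen keeps 'output' usable after the deflate stream is disposed.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Inflates in fixed-size chunks until the stream reports no more data.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var inflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = inflate.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
            }
            return output.ToArray();
        }
    }
}
```

The bytes returned by `Decompress` would then be handed back to whatever deserializer produced them in the first place.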

You'd have to play with the app to see if performance was acceptable. Keep in mind, of course, that you'll still need enough memory to hold the dictionary when you inflate it for use (unless someone has a clever idea to work with the object in its compressed state).

Honestly, though, keeping it as-is in memory and letting Windows swap it to the page file is probably your best/fastest option.

Edit
I've never tried it, but you might be able to serialize directly to the compression stream, meaning the compression overhead is minimal (you'd still have the serialization overhead):

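// Serialize straight into the compressor; no intermediate byte array is needed.
MemoryStream compressed = new MemoryStream();
BinaryFormatter formatter = new BinaryFormatter();

using (Stream zipStream = new DeflaterOutputStream(compressed))
{
    formatter.Serialize(zipStream, ListaDigramas);
}
byte[] zippedBytes = compressed.ToArray();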

