I'm playing around with the C# String.Intern method and have one question. Suppose I have a program that reads a text file line by line and adds these lines to a list of strings. Let's assume the file consists of thousands of lines of the same string. If the text file is big enough, I can see that my program consumes a considerable amount of RAM. If I then use String.Intern when adding lines to my list, memory consumption drops significantly, which suggests that string interning works. Next I want to check how many strings my dotnet process holds, using ProcessHacker. But whether I use String.Intern or not, ProcessHacker shows the same huge number of duplicate strings. I expected it to show only one instance of the string, since I use String.Intern.
What am I missing?
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        List<string> list = new List<string>();
        string filePath = @"C:\Users\User\Desktop\1.txt";
        using (var fileStream = File.OpenRead(filePath))
        {
            using (var streamReader = new StreamReader(fileStream, Encoding.UTF8))
            {
                String line;
                // ReadLine returns null at the end of the file
                while ((line = streamReader.ReadLine()) != null)
                {
                    list.Add(line);
                    //list.Add(String.Intern(line));
                }
            }
        }
    }
}
Every call to streamReader.ReadLine() creates a new string. That string becomes eligible for garbage collection once nothing references it, but it stays in memory until the GC actually runs. Your memory consumption drops because String.Intern returns the system's reference to the string if it is already interned; otherwise it adds the string to the intern pool and returns a reference to it. Your list therefore ends up holding references to the same interned instance, and the strings created by streamReader.ReadLine() become available for GC. Until they are collected, those duplicates still exist in the process memory, which is likely why ProcessHacker still reports them.
var str = "Test"; // a compile-time constant is interned by default
var str1 = new string(str.ToCharArray()); // simulate reading a string: a new instance with the same value
Console.WriteLine(object.ReferenceEquals(str1, string.Intern(str1))); // prints False
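To see the same effect with a reading loop like the one in the question, here is a minimal sketch. It reads from an in-memory StringReader instead of a file so it can run anywhere; the names InternDemo, ReadLines and AllSameInstance are made up for this illustration and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class InternDemo
{
    // Reads every line from the reader, optionally interning each one
    // before it is added to the list.
    public static List<string> ReadLines(TextReader reader, bool intern)
    {
        var list = new List<string>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            list.Add(intern ? string.Intern(line) : line);
        }
        return list;
    }

    // True when every element of the list is the very same object
    // as the first element (reference comparison, not value comparison).
    public static bool AllSameInstance(List<string> list)
    {
        return list.All(s => object.ReferenceEquals(s, list[0]));
    }

    static void Main()
    {
        // 1000 identical lines, standing in for the text file.
        string text = string.Join("\n", Enumerable.Repeat("the same line", 1000));

        var plain = ReadLines(new StringReader(text), intern: false);
        var interned = ReadLines(new StringReader(text), intern: true);

        Console.WriteLine(AllSameInstance(plain));    // False: one object per line
        Console.WriteLine(AllSameInstance(interned)); // True: one shared object
    }
}
```

In both cases ReadLine allocates 1000 strings. With interning, the list keeps only the pooled instance alive, so the rest can be reclaimed at the next collection, but nothing removes them from memory before that.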