
TextWriter/StreamWriter high memory usage

I have a console app that reads a large text file with 40k+ lines. Each line is a key that I use in a search, and the results are written to an output file. The issue is that I leave this console app running for a while until it suddenly closes, and I noticed that the process memory usage was really high: it was sitting at 1.6 GB the last time I saw it crash.

I looked around and didn't find many answers. I did try gcAllowVeryLargeObjects, but that seems like I'm just dodging the problem.

Below is a snippet from my main() where I write out to the file. I can't understand why the memory usage gets so high. I flush the writer after every write (could it be because I'm keeping the file open for such a long period of time?).

// 'using' closes the file even if an exception is thrown inside the loop.
using (TextWriter writer = new StreamWriter("output.csv", false))
{
    foreach (var item in list)
    {
        Console.WriteLine("{0}/{1}", count, numofitem);
        // Look up the current key (the original snippet referenced an undefined 'p').
        var result = TableServiceContext.Read(item.id);
        if (result != null)
        {
            writer.WriteLine(String.Join(",", result.id, result.code, result.hash));
        }
        count++;
        writer.Flush();
    }
}

Edit: I have 32 GB of RAM on my computer, so I am sure the crash is not caused by the machine itself running out of RAM.

Edit2: changed the name of the repository as that was misleading.

If the average line length is 1 KB, then 40K lines is 40 MB, which is nothing. That's why I'm pretty sure the problem is in your repository class. If it is an EF repository, try recreating the DbContext for each line.

If you want to tune your program, you can use the following method: add timestamps to the Console output (you can use the Stopwatch class) and recreate your repository every 10, 100, or N lines. Then, by looking at the timestamps, you can find the optimal N to use.

var timer = Stopwatch.StartNew();
...
Console.WriteLine(timer.ElapsedMilliseconds);
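
As a rough sketch of the "recreate every N lines" idea, something like the following could work. FakeRepository, BatchedExport, and batchSize are hypothetical names made up for illustration; the real repository from the question is not shown, so this assumes it can be disposed and rebuilt cheaply.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Hypothetical stand-in for the real repository; not from the original post.
public sealed class FakeRepository : IDisposable
{
    public static int Created;              // counts how many instances were built
    public FakeRepository() { Created++; }
    public string Read(string key) => key + ",code,hash";
    public void Dispose() { /* a real repository would release its context here */ }
}

public static class BatchedExport
{
    // Writes one line per key and replaces the repository every batchSize keys,
    // printing the elapsed time at each swap so different values of N can be compared.
    public static void Run(IEnumerable<string> keys, TextWriter writer, int batchSize)
    {
        var timer = Stopwatch.StartNew();
        var repo = new FakeRepository();
        int count = 0;
        try
        {
            foreach (var key in keys)
            {
                if (count > 0 && count % batchSize == 0)
                {
                    repo.Dispose();
                    repo = new FakeRepository();
                    Console.WriteLine("{0} lines, {1} ms", count, timer.ElapsedMilliseconds);
                }
                writer.WriteLine(repo.Read(key));
                count++;
            }
        }
        finally
        {
            repo.Dispose();   // dispose the last instance even if a read throws
        }
    }
}
```

Running it with a few different batch sizes (10, 100, 1000) and comparing the printed timestamps, alongside the process memory, should show which N is a reasonable trade-off.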

From looking at the code, I think the problem isn't the StreamWriter but a memory leak in your repository. Suggestions to check:

  • Replace the repository with a dummy, e.g. a class dummy_repository with just the three properties id, value, hash.
  • Likewise, create a long "list", e.g. 40k small entries.
  • Run your program and see if it still consumes memory (I am pretty sure it will not).
  • Then add back your original parts step by step, and see which step causes the memory leak.
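
The steps above could be sketched roughly as follows. DummyResult, DummyRepository, and LeakCheck are hypothetical names invented for this illustration, and the heap measurement via GC.GetTotalMemory only covers managed memory, so treat the number as a hint rather than a precise reading.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Dummy record with only the three properties mentioned above (hypothetical type).
public sealed class DummyResult
{
    public string Id = "";
    public string Value = "";
    public string Hash = "";
}

// Dummy repository: no database and no caching, just a fresh object per call.
public sealed class DummyRepository
{
    public DummyResult Read(string id) =>
        new DummyResult { Id = id, Value = "v" + id, Hash = "h" + id };
}

public static class LeakCheck
{
    // Runs the same write loop against the dummy repository and returns the
    // change in managed heap size (bytes), measured after forced collections.
    public static long Run(int itemCount, TextWriter writer)
    {
        var list = new List<string>(itemCount);
        for (int i = 0; i < itemCount; i++) list.Add(i.ToString());

        var repo = new DummyRepository();
        long before = GC.GetTotalMemory(true);
        foreach (var id in list)
        {
            var result = repo.Read(id);
            writer.WriteLine(string.Join(",", result.Id, result.Value, result.Hash));
            writer.Flush();   // mirrors the per-line flush in the question
        }
        return GC.GetTotalMemory(true) - before;
    }
}
```

For example, calling LeakCheck.Run(40000, writer) with a StreamWriter on "output.csv" should report only a small change if the writer is not the culprit. The real repository can then be swapped back in, one piece at a time, to see where the growth starts.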
