
Dataflow Pipeline holding on to memory

I've created a Dataflow pipeline consisting of 4 blocks (one of which is optional). It receives a query object from my application over HTTP, retrieves information from a database, optionally transforms that data, and then writes the information back in the HTTP response.

In some testing I've been pulling down a significant amount of data from the database (570 thousand rows). The rows are stored in a List object and passed between the different blocks, and it seems that even after the final block has completed, the memory isn't being released. RAM usage in Task Manager spikes to over 2 GB, and I can observe several large spikes as the List hits each block.

The signatures for my blocks look like this:

private TransformBlock<HttpListenerContext, Tuple<HttpListenerContext, QueryObject>> m_ParseHttpRequest;
private TransformBlock<Tuple<HttpListenerContext, QueryObject>, Tuple<HttpListenerContext, QueryObject, List<string>>> m_RetrieveDatabaseResults;
private TransformBlock<Tuple<HttpListenerContext, QueryObject, List<string>>, Tuple<HttpListenerContext, QueryObject, List<string>>> m_ConvertResults;
private ActionBlock<Tuple<HttpListenerContext, QueryObject, List<string>>> m_ReturnHttpResponse;

They are linked as follows:

m_ParseHttpRequest.LinkTo(m_RetrieveDatabaseResults);
m_RetrieveDatabaseResults.LinkTo(m_ConvertResults, tuple => tuple.Item2 is QueryObjectA);
m_RetrieveDatabaseResults.LinkTo(m_ReturnHttpResponse, tuple => tuple.Item2 is QueryObjectB);
m_ConvertResults.LinkTo(m_ReturnHttpResponse);

Is it possible to set up the pipeline so that each block no longer holds on to the list once it is done with it, and so that the memory is released once the entire pipeline has completed?

  1. There is no reason why dataflow blocks should hold references to already processed messages, so I believe they don't. If they actually do, then I think that's a bug that you should report.

  2. High RAM usage reported in Task Manager does not necessarily mean the objects weren't collected. It could also mean that the framework hasn't yet returned the garbage-collected memory to the OS. I believe that when you have enough free memory, the framework is quite lazy about returning memory, because there is no good reason to do so.

    What you should do instead is use more appropriate tools. For example, you can use .NET performance counters to monitor the GC quite closely. You could also use a memory profiler to see more detailed information about memory usage and, if you actually have a memory leak, what causes it.
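
As a quick check before reaching for a profiler, the managed heap size can be compared with the process working set from inside the application. The following is only a minimal sketch under stated assumptions: MemoryProbe, BuildRows and Report are hypothetical names that do not come from the question, and the list of strings is just a stand-in for the 570 thousand database rows. It uses GC.GetTotalMemory and Process.WorkingSet64; the working set is roughly the kind of figure Task Manager displays.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper, not part of the original pipeline.
public static class MemoryProbe
{
    // Bytes the GC currently considers allocated on the managed heap.
    // Passing true asks the runtime to collect first, so garbage that is
    // merely waiting for a collection is not counted.
    public static long ManagedBytes()
    {
        return GC.GetTotalMemory(true);
    }

    // Physical memory currently mapped to this process.
    public static long WorkingSetBytes()
    {
        using (Process process = Process.GetCurrentProcess())
        {
            process.Refresh();
            return process.WorkingSet64;
        }
    }

    // Stand-in for the large query result (assumption: plain strings).
    public static List<string> BuildRows(int count)
    {
        var rows = new List<string>(count);
        for (int i = 0; i < count; i++)
        {
            rows.Add("row " + i);
        }
        return rows;
    }

    private static void Report(string label)
    {
        Console.WriteLine("{0,-14} managed: {1,12:N0} B   working set: {2,12:N0} B",
            label, ManagedBytes(), WorkingSetBytes());
    }

    public static void Main()
    {
        Report("before");

        List<string> rows = BuildRows(570000);
        Report("list alive");
        GC.KeepAlive(rows); // keep the list reachable until this point

        rows = null;        // drop the only reference
        Report("list dropped");
    }
}
```

If the managed figure falls back after the reference is dropped while the working set stays high, that would be consistent with point 2: the objects were collected, but the runtime has not yet handed the memory back to the OS. If the managed figure stays high in the real application, something is still referencing the data, and a memory profiler can show what.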
