
Optimizing for garbage collection of large collections

I am reading from database a large collections of this type List<Rows<Long,String,ByteBuffer>> 我正在从数据库中读取此类List<Rows<Long,String,ByteBuffer>>的大量集合

I then read the data from this list of rows one by one and copy it into container objects. Should I set each individual row in the list to null as I finish reading it, or should I de-reference the whole list only at the end so that the rows can be garbage collected?

Since each row is quite big, consisting of large strings, blobs, text content, etc., I am trying to optimize for garbage collection. I hope this is not what is called premature optimization!?
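To make the two options concrete, here is a rough sketch of the loop in question. Rows comes from the database client; Container and toContainer() are placeholders for the actual container class and conversion logic, and the list is assumed to be modifiable:

// Minimal sketch of the two alternatives; Container and toContainer() are
// placeholders, not real types from the question.
List<Container> convert(List<Rows<Long, String, ByteBuffer>> rows) {
    List<Container> containers = new ArrayList<>(rows.size());

    // Option A: null out each row as soon as it has been consumed, so the
    // large row object becomes unreachable before the rest of the loop runs.
    for (int i = 0; i < rows.size(); i++) {
        containers.add(toContainer(rows.get(i)));
        rows.set(i, null);
    }

    // Option B: do nothing special per row; once the caller drops its
    // reference to the list, all rows become collectable at the same time.
    return containers;
}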

If you haven't measured your program's performance, then it's a premature optimization.

(Not every optimization performed before measuring is premature, but this kind of micro-optimization is.)

I would suggest dereferencing them. This is not premature optimization because, unlike time, the amount of memory available to your program for accomplishing its task is not as much under your control.

As larsmans said, this is the very definition of premature optimization. However, questions like these often pop up, and rather than forgetting about them I like to add profiling points right away (wrapped by an on/off switch, like Logger.isEnabled()) and then move on. Look at http://netbeans.org/features/java/profiler.html for an easy profiling tool/setup.
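For illustration, a switch-guarded profiling point along those lines might look roughly like this; the PROFILING flag and the system property name are illustrative assumptions, not part of the original answer:

import java.util.concurrent.TimeUnit;

public class RowLoader {

    // Hypothetical on/off switch read from a system property; a logger's
    // isDebugEnabled()-style check would serve the same purpose.
    private static final boolean PROFILING = Boolean.getBoolean("app.profiling");

    void loadRows() {
        long start = PROFILING ? System.nanoTime() : 0L;

        // ... read the rows and fill the container objects ...

        if (PROFILING) {
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println("loadRows took " + elapsedMs + " ms");
        }
    }
}

Because the timing code sits behind a constant flag, it costs essentially nothing when the switch is off and can be left in place permanently.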

As larsmans has mentioned, there is the disadvantage of complexity.

But there may also be a performance disadvantage. Nulling a reference involves writing to memory, and in a modern garbage-collected environment, writing to memory is not necessarily a simple store. There may also be some book-keeping for the benefit of the collector; look up 'write barrier' and 'card marking' in the context of garbage collection. Writing also has effects on processor caches: on a multiprocessor system, it will cause cache-coherency traffic between processors, which consumes bandwidth.

Now, I don't think any of these effects are huge. But you should be aware that writes to memory are not always as cheap as you might think. That's why you have to profile before you optimise, and then profile again afterwards!
