What are some best practices to build memory-efficient Java applications?
Java programs can be very memory hungry. For example, a Double object takes 24 bytes: 8 bytes of data and 16 bytes of JVM-imposed overhead. In general, the objects that represent the primitive types are very expensive.
The same happens for any collection in the Java Standard Library. There are even some counterintuitive facts, such as a HashSet being more memory hungry than a HashMap, since a HashSet contains a HashMap inside ( http://docs.oracle.com/javase/7/docs/api/java/util/HashSet.html ).
Could you come up with some advice on modeling data and delegating objects in high performance settings so that these "weaknesses" of Java are mitigated?
Depends on the application, but generally speaking:
Lay out data structures in (parallel) arrays of primitives
Try to make big "flat" objects, inlining otherwise sensible sub-structures
Specialize collections for primitives
Reuse objects, use object pools, ThreadLocals
Go off-heap
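As a rough illustration of the first item, here is a minimal sketch that stores 2D points in two parallel double arrays instead of one object per point. The class name PointColumns and its methods are hypothetical, and growth and bounds checks are left out for brevity.

```java
// Hypothetical example: points kept as parallel primitive arrays,
// so each point costs 16 bytes of data and no per-object header.
final class PointColumns {
    private final double[] xs;
    private final double[] ys;
    private int size;

    PointColumns(int capacity) {
        xs = new double[capacity];
        ys = new double[capacity];
    }

    // Appends a point and returns its index; fixed capacity for brevity.
    int add(double x, double y) {
        if (size == xs.length) {
            throw new IllegalStateException("capacity exhausted");
        }
        xs[size] = x;
        ys[size] = y;
        return size++;
    }

    double x(int i) { return xs[i]; }
    double y(int i) { return ys[i]; }
    int size() { return size; }

    // Iterates over one column only, which is also cache friendly.
    double sumX() {
        double sum = 0;
        for (int i = 0; i < size; i++) {
            sum += xs[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        PointColumns points = new PointColumns(4);
        points.add(1.5, 2.5);
        points.add(3.0, 4.0);
        System.out.println(points.size() + " points, sumX = " + points.sumX());
    }
}
```

The trade-off is the one described in this answer: a "point" is now just an index, so the code that uses it is less object-oriented.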
I cannot call these practices "best", because they unfortunately make you suffer, defeat the point of using Java, and reduce the flexibility, supportability, reliability, testability and other "good" properties of the codebase.
But they certainly lower the memory footprint and GC pressure.
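For the "go off-heap" item, one option available in the standard library is a direct ByteBuffer. The sketch below is only an illustration under that assumption; the class name OffHeapLongs is hypothetical, and real off-heap designs usually also need a plan for releasing and bounding that memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical example: a fixed-size array of longs stored in a direct
// (off-heap) buffer, so the data itself is not scanned by the GC.
final class OffHeapLongs {
    private final ByteBuffer buffer;

    OffHeapLongs(int count) {
        buffer = ByteBuffer
                .allocateDirect(Math.multiplyExact(count, Long.BYTES))
                .order(ByteOrder.nativeOrder());
    }

    // Absolute put/get: the buffer's position is not used or changed.
    void set(int index, long value) {
        buffer.putLong(index * Long.BYTES, value);
    }

    long get(int index) {
        return buffer.getLong(index * Long.BYTES);
    }

    int capacity() {
        return buffer.capacity() / Long.BYTES;
    }

    public static void main(String[] args) {
        OffHeapLongs longs = new OffHeapLongs(8);
        longs.set(3, 42L);
        System.out.println("slot 3 = " + longs.get(3));
    }
}
```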
One of the memory problems that is easy to overlook in Java is memory leakage. Nicholas Greene already pointed you to memory profiling.
Many people assume that Java's garbage collection prevents memory leaks, but that is not actually true: all it takes is one forgotten reference somewhere to keep an object around in perpetuity. Paradoxically, trying to optimize your program may introduce more opportunities for memory leaks, because you end up with more complex data structures.
One example of a memory leak, if you are implementing, for instance, a stack:
Integer[] stack = new Integer[10];
int stackPtr = 0;
// a few push operations on our stack
stack[stackPtr++] = new Integer(5);
stack[stackPtr++] = new Integer(3);
// and pop from the stack again
--stackPtr;
--stackPtr;
// at this point, the stack is logically empty, but
// the Integer objects are still referenced by the array,
// and are basically leaked.
The correct solution would have been to clear the slot on each pop:
stack[--stackPtr] = null;
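Put together, a stack that does not retain popped elements could look like the sketch below. The class name IntegerStack and the slotIsCleared helper are hypothetical and exist only to make the idea checkable.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Hypothetical example: an array-backed stack that clears each slot on pop,
// so popped objects become eligible for garbage collection.
final class IntegerStack {
    private Integer[] elements = new Integer[10];
    private int size;

    void push(Integer value) {
        if (size == elements.length) {
            elements = Arrays.copyOf(elements, size * 2);
        }
        elements[size++] = value;
    }

    Integer pop() {
        if (size == 0) {
            throw new NoSuchElementException("stack is empty");
        }
        Integer value = elements[--size];
        elements[size] = null; // drop the array's reference to the popped object
        return value;
    }

    int size() { return size; }

    // Demonstration helper: true if the backing array holds no reference here.
    boolean slotIsCleared(int index) {
        return elements[index] == null;
    }

    public static void main(String[] args) {
        IntegerStack stack = new IntegerStack();
        stack.push(5);
        stack.push(3);
        System.out.println(stack.pop() + ", " + stack.pop());
    }
}
```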
Some techniques I use to reduce memory:
Use new String to dispose of the old big string.
Linear arrays are smaller than multidimensional arrays, and computing the index is fastest if the size of every dimension except the last is a power of two: array[x|y<<4] for a 16xN array.
Initialize collections and StringBuilder with an initial capacity chosen such that it prevents internal reallocation in a typical circumstance.
Use StringBuilder instead of string concatenation, because the compiled class files use new StringBuilder() without initial capacity to concatenate strings.
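The linear-array and initial-capacity tips can be sketched as follows. This is an illustration only: the class FlatGrid, its constants and the labels helper are hypothetical names, and the index formula assumes x is always in the range 0 to 15.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: a 16xN grid stored in one linear int array,
// indexed with a shift and an OR instead of a multiplication.
final class FlatGrid {
    static final int WIDTH_SHIFT = 4;          // 16 columns == 1 << 4
    static final int WIDTH = 1 << WIDTH_SHIFT;
    private final int[] cells;

    FlatGrid(int rows) {
        cells = new int[rows << WIDTH_SHIFT];
    }

    // Same as array[x|y<<4]; only valid while 0 <= x < 16.
    static int index(int x, int y) {
        return x | (y << WIDTH_SHIFT);
    }

    void set(int x, int y, int value) { cells[index(x, y)] = value; }
    int get(int x, int y) { return cells[index(x, y)]; }

    // Pre-sized list: no internal reallocation while filling it.
    static List<String> labels(int rows) {
        List<String> result = new ArrayList<>(rows);
        for (int y = 0; y < rows; y++) {
            result.add("row" + y);
        }
        return result;
    }

    public static void main(String[] args) {
        FlatGrid grid = new FlatGrid(8);
        grid.set(3, 2, 99);
        System.out.println(grid.get(3, 2) + " at index " + index(3, 2));
    }
}
```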
If you have high performance constraints and need to use collections for simple types, you might take a look at some implementations of Primitive Collections for Java.
Some are:
Also, as a reference, take a look at this question: Why can Java Collections not directly store Primitives types?
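To show what such a library buys you, here is a minimal hand-rolled sketch of a primitive int list. It is not the API of any particular library; the class name IntArrayList and its methods are made up for illustration, and real primitive-collection libraries are far more complete.

```java
import java.util.Arrays;

// Hypothetical example: a growable list of int backed by int[],
// avoiding one Integer object (plus a reference) per element.
final class IntArrayList {
    private int[] data;
    private int size;

    IntArrayList(int initialCapacity) {
        data = new int[Math.max(1, initialCapacity)];
    }

    void add(int value) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return data[index];
    }

    int size() { return size; }

    long sum() {
        long total = 0;
        for (int i = 0; i < size; i++) {
            total += data[i];
        }
        return total;
    }

    public static void main(String[] args) {
        IntArrayList list = new IntArrayList(2);
        for (int i = 0; i < 100; i++) {
            list.add(i);
        }
        System.out.println(list.size() + " ints, sum = " + list.sum());
    }
}
```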
Luís Bianchin already gave you a few libraries which implement optimal collections in Java. Nevertheless, it seems that you are especially concerned about the memory allocation of Java collections. In that case, there are a few alternatives which are quite straightforward.
You could use a cache to limit the memory the collection (the cache) can allocate. By doing that, you only load the most frequently used entries into main memory, and you don't need to load the whole data set from disk/network/whatever. I highly recommend Guava Cache, as it's very well documented and pretty mature.
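To illustrate the idea of a size-bounded cache without pulling in a dependency, here is a minimal sketch built on the JDK's LinkedHashMap. It is a stand-in, not Guava's API: the class name LruCache is hypothetical, it evicts the least recently used entry, and it is not thread-safe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical example: a least-recently-used cache with a hard entry limit,
// so the memory it can hold is bounded by maxEntries.
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the eldest (least recently accessed) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" is now the most recently used entry
        cache.put("c", 3); // evicts "b"
        System.out.println(cache.keySet());
    }
}
```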
Sometimes a cache is not a solution for your problem. For example, in an ETL solution, you might know you will only load each entry once. For this scenario I recommend going for persistent collections. These are disk-stored collections that are way faster than traditional databases but have nice Java APIs. MapDB and PCollections are for me the best libraries. Note that only MapDB is disk-backed; PCollections is "persistent" in the immutable, structure-sharing sense.
On top of that, if you really want to know the actual state of your program's memory allocation, I highly recommend using a profiler. This way you will not only know how much memory your collections occupy, but also how the GC behaves over time.
In fact, you should only try an alternative to Java's collections and data structures if there is an actual memory problem, and that is something a profiler can tell you.
The JDK has a profiler called VisualVM which does a great job. Nevertheless, I recommend a commercial profiler if you can afford it. Commercial profilers usually have a lower impact on the application's performance than VisualVM.
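A profiler gives the real picture, but a very rough first reading can be taken from inside the program with the standard Runtime API. The sketch below is only that; the class name HeapSnapshot is hypothetical, and the numbers it prints vary with the JVM, its flags and GC timing.

```java
// Hypothetical example: a coarse "used heap" reading before and after
// an allocation. This is an approximation, not a substitute for a profiler.
final class HeapSnapshot {
    static long usedHeapBytes() {
        Runtime runtime = Runtime.getRuntime();
        return runtime.totalMemory() - runtime.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        double[] payload = new double[1_000_000]; // about 8 MB of data
        long after = usedHeapBytes();
        System.out.println("elements allocated: " + payload.length);
        System.out.println("approximate growth in bytes: " + (after - before));
    }
}
```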
Finally, this is not strictly related to your question, but it is closely connected. In case you want to serialize your Java objects into an optimal binary representation, I recommend Google Protocol Buffers in Java. Protocol buffers are ideal for transferring data structures through the network using the least bandwidth possible, with really fast encoding and decoding.
Well, there are a lot of things you can do.
Here are a few problems and solutions:
When you change the value of a string in Java, the string is not actually overwritten. Instead, a new string is created to replace the old one. However, the old string still exists until the garbage collector reclaims it. This can be a problem when using RAM efficiently is a concern. Here are some solutions to this problem:
Writer and reader objects such as FileWriters and FileReaders also take up RAM. If you have a lot of them, this can also cause problems. Here are some solutions:
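One widely used option is try-with-resources, which closes a reader or writer as soon as the block exits, so that it and its buffers can be reclaimed. A minimal sketch follows; the class name LineCounter is hypothetical, and a StringReader stands in for a file so the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical example: the reader is closed automatically when the
// try block ends, whether it finishes normally or with an exception.
final class LineCounter {
    static int countLines(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countLines(new StringReader("a\nb\nc")));
    }
}
```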
Every object in Java takes up memory. When you have an object that you won't use anymore, it's not very convenient to keep it around.
Beware of early optimisation. See When is optimisation premature?
While I don't know the exact requirements of your application or runtime environment, in my experience Java was able to handle anything I threw at it. Doing some profiling on your demo/proof of concept app might be time well spent if performance or garbage collection (you tagged memory leaks) is an issue.