JVM space complexity details: Singly Linked Lists vs Doubly Linked Lists

I have come across a curious situation where students (I'm a teaching assistant in this context) have to implement their own version of a Singly Linked List (SLL) and compare it empirically with the Java standard library implementation of a Doubly Linked List (DLL).

And that's where it gets weird: I've seen multiple students note that the DLL profiles at roughly 0.5% extra space utilization compared to an SLL containing an equal number of elements of the same type. Meanwhile, basic analysis of the data structures tells me that an SLL has 2 references per node (1 to the next element and 1 to the contained value), whereas a DLL has 3 (one additional reference to the previous element). In other words, that is 50% more space usage per node (disregarding the size of the contained value).

The contained values are mostly Integer value objects, so I don't think the size of the contained values matters too much here.

What causes this 2 orders of magnitude difference? I'm not entirely sure that "JVM/collections libraries optimization" can cover the whole difference; otherwise it'd have to be one hell of a JVM/Java standard library optimization.

The space used should be the same on the Oracle JVM/OpenJDK for a 64-bit JVM with 32-bit references (compressed oops).

For a Node with two references:

header: 12 bytes
two references: 8 bytes
alignment padding: 4 bytes

The total is 24 bytes per node, as all objects are aligned to 8-byte boundaries by default.

For a Node with three references:

header: 12 bytes
three references: 12 bytes
alignment padding: 0 bytes

The total is 24 bytes again.
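The arithmetic above can be sketched as a small calculation. This is only an illustration that assumes the figures stated in this answer (12-byte header, 4-byte compressed references, 8-byte alignment); the class and method names are hypothetical, and the result is an estimate, not something read from the JVM.

```java
// Sketch of the object-size arithmetic, assuming a 64-bit HotSpot JVM with
// compressed oops: 12-byte header, 4-byte references, 8-byte alignment.
// All names here are illustrative, not part of any library.
final class NodeSizeSketch {
    static final int HEADER_BYTES = 12;
    static final int REFERENCE_BYTES = 4;
    static final int ALIGNMENT_BYTES = 8;

    /** Rounds a raw size up to the next multiple of the alignment. */
    static int alignUp(int rawBytes) {
        return (rawBytes + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    /** Estimated size of an object whose only fields are references. */
    static int nodeSize(int referenceCount) {
        return alignUp(HEADER_BYTES + referenceCount * REFERENCE_BYTES);
    }

    public static void main(String[] args) {
        System.out.println("2 references: " + nodeSize(2) + " bytes");
        System.out.println("3 references: " + nodeSize(3) + " bytes");
    }
}
```

Under these assumptions the third reference simply fills the 4 bytes that would otherwise be padding; a fourth reference would be the first to push the node to the next 8-byte step.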

The real question is why you saw any difference at all. This is most likely due to inaccurate memory accounting.

The JVM uses TLABs (Thread Local Allocation Buffers). These allow the threads in a JVM to grab chunks of memory and allocate concurrently from those chunks. On the downside, you only see how much memory has been taken from the common Eden space, i.e. you have no idea how much of each chunk is actually used.

A simple way around this is to turn off the TLAB, which gives you byte-by-byte memory accounting (at the cost of some performance).

E.g. try -XX:-UseTLAB on the command line to disable the TLAB and you will see the size of every object allocated.
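A rough way to try this is sketched below. The node classes are hypothetical stand-ins for the students' SLL node and a DLL-style node, and the measurement relies on `Runtime.totalMemory()` and `Runtime.freeMemory()`, which are only approximate; a garbage collection during the run will skew the figures, so treat the output as indicative at best.

```java
// Rough heap-usage comparison of two hypothetical node shapes.
// Intended to be run as: java -XX:-UseTLAB MemoryPerNodeSketch
final class MemoryPerNodeSketch {
    /** Hypothetical SLL node: next + value. */
    static final class SinglyNode { SinglyNode next; Object value; }

    /** Hypothetical DLL node: prev + next + value. */
    static final class DoublyNode { DoublyNode prev; DoublyNode next; Object value; }

    /** Approximate number of heap bytes currently in use. */
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Builds a chain of count singly linked nodes that all share one value. */
    static SinglyNode buildSingly(int count, Object value) {
        SinglyNode head = null;
        for (int i = 0; i < count; i++) {
            SinglyNode node = new SinglyNode();
            node.value = value;
            node.next = head;
            head = node;
        }
        return head;
    }

    /** Builds a chain of count doubly linked nodes that all share one value. */
    static DoublyNode buildDoubly(int count, Object value) {
        DoublyNode head = null;
        for (int i = 0; i < count; i++) {
            DoublyNode node = new DoublyNode();
            node.value = value;
            node.next = head;
            if (head != null) head.prev = node;
            head = node;
        }
        return head;
    }

    static int length(SinglyNode head) {
        int n = 0;
        for (SinglyNode cur = head; cur != null; cur = cur.next) n++;
        return n;
    }

    static int length(DoublyNode head) {
        int n = 0;
        for (DoublyNode cur = head; cur != null; cur = cur.next) n++;
        return n;
    }

    public static void main(String[] args) {
        int count = 1_000_000;
        Object shared = new Object(); // one shared value, so only the nodes are measured

        long before = usedHeap();
        SinglyNode singly = buildSingly(count, shared);
        long afterSingly = usedHeap();
        DoublyNode doubly = buildDoubly(count, shared);
        long afterDoubly = usedHeap();

        System.out.println("singly: ~" + (afterSingly - before) / (double) count + " bytes/node");
        System.out.println("doubly: ~" + (afterDoubly - afterSingly) / (double) count + " bytes/node");
        // Walk both lists after measuring so they stay reachable until this point.
        System.out.println("nodes kept alive: " + (length(singly) + length(doubly)));
    }
}
```

Sharing a single value object keeps the contained values out of the measurement, so any difference that shows up comes from the node layout alone.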

It's hard to see why there's any difference at all.

First note that a Java object has significant overhead in the form of its header. That reduces your 50% expectation right there.

Next, when you consider that references are typically 4 bytes wide (given compressed OOPs on a 64-bit HotSpot), but that memory is always allocated in blocks whose size is divisible by 8, you can see that what was left as 4 unused bytes at the end of one structure becomes your third reference in the DLL example.

In addition to what Marko said about the memory overhead of each linked-list node object, your "Integer value objects" stored in these nodes might not be as small as you think. The element type of Java's DLL is a generic parameter, and generic parameters in Java are always objects (never primitives), so even though you may be adding ints to Java's DLL, they get converted to objects (see boxing/unboxing) and stored as objects.

If your students' SLLs store actual primitive ints, then I would actually expect their classes to occupy considerably less space than Java's DLL. If your students store Integer objects, then you should consider the fact that the space occupied by these objects further dilutes whatever difference exists between the expected space occupied by the two classes.
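To put rough numbers on this, here is a small estimate of the per-element footprint in each case. It assumes the layout figures given earlier in this thread (12-byte header, 4-byte compressed references, 8-byte alignment), a 4-byte int field, and one freshly boxed Integer per element (small values served from the Integer cache would be shared instead). The names are hypothetical and the figures are estimates only.

```java
// Per-element footprint estimate under the assumptions stated above.
// All names are illustrative; nothing here queries the JVM.
final class ElementFootprintSketch {
    static final int HEADER_BYTES = 12;
    static final int REFERENCE_BYTES = 4;
    static final int INT_BYTES = 4;
    static final int ALIGNMENT_BYTES = 8;

    /** Rounds a raw size up to the next multiple of the alignment. */
    static int alignUp(int rawBytes) {
        return (rawBytes + ALIGNMENT_BYTES - 1) / ALIGNMENT_BYTES * ALIGNMENT_BYTES;
    }

    /** Estimated size of an object with the given number of reference and int fields. */
    static int objectSize(int referenceFields, int intFields) {
        return alignUp(HEADER_BYTES + referenceFields * REFERENCE_BYTES + intFields * INT_BYTES);
    }

    public static void main(String[] args) {
        int boxedInteger = objectSize(0, 1);             // an Integer holds a single int field
        int sllPrimitive = objectSize(1, 1);             // next + int value stored in the node
        int sllBoxed = objectSize(2, 0) + boxedInteger;  // next + value reference, plus the box
        int dllBoxed = objectSize(3, 0) + boxedInteger;  // prev + next + value reference, plus the box

        System.out.println("boxed Integer:          " + boxedInteger + " bytes");
        System.out.println("SLL with primitive int: " + sllPrimitive + " bytes/element");
        System.out.println("SLL with Integer:       " + sllBoxed + " bytes/element");
        System.out.println("DLL with Integer:       " + dllBoxed + " bytes/element");
    }
}
```

Under these assumptions the boxed SLL and the boxed DLL come out equal per element, while a node that stores a primitive int directly is noticeably smaller than either.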
