I wanted to use a HashSet<Long> to store a large list of unique numbers in memory. I calculated the approximate memory it would consume (with 64-bit pointers):
A Long would take 16 bytes of space, so initially I multiplied the number of entries by 16 to get the memory. But in reality, the memory used was much more than 16 bytes per entry. After that I studied the HashSet implementation. In short, the underlying implementation stores an extra dummy object (12 bytes) with each entry of the HashSet, and a pointer (8 bytes) to the next entry, thus adding an extra 12 + 8 bytes per entry.
So the total memory per entry is 16 + 12 + 8 = 36 bytes. But when I ran the code and checked the memory, it was still much more than 36 bytes per entry.
My question (in short): how much memory does a HashSet take (for instance, on a 64-bit machine)?
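You can measure this size exactly with the following test:
// free heap before the allocation
long m1 = Runtime.getRuntime().freeMemory();
// create object(s) here
// free heap after the allocation
long m2 = Runtime.getRuntime().freeMemory();
// the difference is the number of bytes allocated in between
System.out.println(m1 - m2);
Run it with the -XX:-UseTLAB option.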
On my 64-bit HotSpot, an empty HashSet takes 480 bytes.
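To apply the same idea to the HashSet<Long> from the question, here is a minimal sketch that estimates the retained bytes per entry. The class name, the helper names and the entry count are made up for illustration, and because System.gc() is only a hint the result is an approximation, not an exact figure.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper class; the name and the entry count are illustrative.
class HashSetFootprint {

    // Heap currently in use, after asking the JVM to collect garbage first.
    static long usedMemory() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // only a hint, so the result stays approximate
        return rt.totalMemory() - rt.freeMemory();
    }

    // Approximate retained bytes per entry for a HashSet<Long> of n values.
    static double bytesPerEntry(int n) {
        long before = usedMemory();
        Set<Long> set = new HashSet<>();
        for (long i = 0; i < n; i++) {
            set.add(i + 1_000L); // offset avoids the small cached Long values
        }
        long after = usedMemory();
        if (set.size() != n) {   // also keeps the set reachable until here
            throw new IllegalStateException("unexpected size");
        }
        return (after - before) / (double) n;
    }

    public static void main(String[] args) {
        System.out.println(bytesPerEntry(1_000_000) + " bytes per entry (approx.)");
    }
}
```

The number printed depends on the JVM, its version and its flags (for example whether compressed references are enabled), so treat it as a measurement of one setup only.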
Why so much? Because HashSet has a complex structure (by the way, an IDE in debug mode helps you see the actual fields). It is based on HashMap (Adapter pattern), so HashSet itself contains a reference to a HashMap. HashMap contains 8 fields. The actual data is in an array of Nodes. A Node has: int hash; K key; V value; Node next. HashSet uses only the keys and puts a dummy object in the values.
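The adapter structure described above can be sketched roughly as follows. This is an illustration, not the JDK source: SketchSet is a made-up name, and only three operations are shown. It does mirror one detail of java.util.HashSet, namely that the dummy value is a single shared static object, so each entry pays for a reference to it rather than for a separate dummy object.

```java
import java.util.HashMap;

// Illustrative adapter, not the JDK source: a set built on top of a HashMap.
class SketchSet<E> {
    // One shared dummy value; every entry's value field points at this object.
    private static final Object PRESENT = new Object();

    private final HashMap<E, Object> map = new HashMap<>();

    // put returns the previous value, so null means the key was new.
    boolean add(E e) { return map.put(e, PRESENT) == null; }

    boolean contains(E e) { return map.containsKey(e); }

    int size() { return map.size(); }

    public static void main(String[] args) {
        SketchSet<Long> set = new SketchSet<>();
        System.out.println(set.add(42L)); // true: newly added
        System.out.println(set.add(42L)); // false: already present
        System.out.println(set.size());   // 1
    }
}
```

Every element therefore costs one map node (hash, key, value, next) plus a slot in the bucket array, on top of the boxed Long itself.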
The size of objects is an implementation detail. There is no guarantee that if it's x bytes on one platform, on another it's also x bytes.
Long is boxed as you know, but 16 bytes is wrong. The primitive long takes 8 bytes, but the size of the box around the long is implementation dependent. According to this HotSpot-related answer, overhead words and padding mean a boxed 4-byte int can come to 24 bytes!
The byte alignment and padding mentioned in that (HotSpot-specific) answer would also apply to the Entry objects, which would push the consumption up as well.
The memory used is 32 * SIZE + 4 * CAPACITY + (16 * SIZE), where SIZE is the number of elements.
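As a worked example, the sketch below simply plugs numbers into the formula as stated. The class and method names are hypothetical, and the capacity of 2,097,152 used for one million elements is an assumption (a power-of-two table large enough for that many entries), not a measured value.

```java
// Hypothetical helper that evaluates the formula from the answer above.
class HashSetEstimate {

    // 32 * SIZE + 4 * CAPACITY + 16 * SIZE, in bytes.
    static long estimateBytes(long size, long capacity) {
        return 32 * size + 4 * capacity + 16 * size;
    }

    public static void main(String[] args) {
        // 17 elements in a 32-slot table: 544 + 128 + 272
        System.out.println(estimateBytes(17, 32));               // 944
        // one million elements, assuming a table of 2^21 slots
        System.out.println(estimateBytes(1_000_000, 2_097_152)); // 56388608
    }
}
```

Under these assumptions the estimate comes to roughly 56 bytes per element for the million-element case, well above the 36 bytes computed in the question.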
The default HashMap size is 16 HashMapEntry entries. Every HashMapEntry has four fields (int keyHash, Object next, Object key, Object value), so wrapping the elements introduces overhead, and so do the empty entries. Additionally, a HashMap grows by a factor of 2, so for 17 elements you'll have 32 entries with at least 15 of them empty.
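The doubling described above can be sketched as a small calculation. It assumes the java.util.HashMap defaults (initial capacity 16, load factor 0.75) and elements added one at a time to a map created with the no-argument constructor; the class and method names are made up, and the sketch ignores HashMap's maximum capacity. Note that with a 0.75 load factor the table already doubles at the 13th element, not the 17th.

```java
// Hypothetical helper: table capacity after adding n elements one by one,
// assuming an initial capacity of 16 and a load factor of 0.75.
class TableCapacity {

    static long capacityFor(int n) {
        long capacity = 16;
        // The table doubles once the element count exceeds 0.75 * capacity.
        while (n > capacity / 4 * 3) {
            capacity *= 2;
        }
        return capacity;
    }

    public static void main(String[] args) {
        System.out.println(capacityFor(12)); // 16
        System.out.println(capacityFor(13)); // 32
        System.out.println(capacityFor(17)); // 32
        System.out.println(capacityFor(25)); // 64
    }
}
```

Right after a doubling, well over half of the bucket array is unused, which is part of why the per-element cost varies with the exact number of elements.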
An easier way is to check a heap dump with a memory analyzer.
A HashSet is a complicated beast. Off the top of my head and after reviewing some of the comments, here are some items consuming memory that you have not accounted for:
The long primitive gets boxed into a java.lang.Long object and a reference to it is added to the HashSet. Somebody mentioned that a Long object will be 24 bytes. Plus the reference, which is 8 bytes.
I don't know whether the buckets are an ArrayList, or LinkedList, etc., but because hashing algorithms could produce collisions, the elements of the HashSet must be put into collections, which are organized by hash code. Best case is an ArrayList with just 1 element: your Long object. The default backing array size for ArrayList is 10, so you have 10 object references within the object, so at least 80 bytes now per Long.
Since Long is an integer, I suspect the hashing algorithm does a good job spreading things out. I'm not sure what would happen to a long whose value exceeded Integer.MAX_VALUE. Some of those values would have to collide, due to the pigeonhole principle: there are more long values than int hash codes.
HashSet is basically a HashMap where the value is not interesting. Under the hood, it creates a HashMap, which has an array of buckets in it to represent the hash table. The array size is based on the capacity, which is not obvious from the number of elements you added.
Long story short, hash tables are a memory-intensive data structure. It's the space/time trade-off: assuming a good hash distribution, you get constant-time look-ups at the cost of extra memory usage.
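On the open point about long values above Integer.MAX_VALUE: Long.hashCode is documented as the XOR of the upper and lower 32 bits of the value, so large values are folded into the int range and some of them necessarily share a hash code. The short sketch below demonstrates this; the class name and the fold helper are illustrative only.

```java
// Illustrative demo of how Long values are hashed; names are hypothetical.
class LongHashDemo {

    // The folding that Long.hashCode documents: upper 32 bits XOR lower 32 bits.
    static int fold(long v) {
        return (int) (v ^ (v >>> 32));
    }

    public static void main(String[] args) {
        long a = 1L;
        long b = 1L << 32; // 4294967296, well above Integer.MAX_VALUE

        System.out.println(Long.hashCode(a));            // 1
        System.out.println(Long.hashCode(b));            // 1
        System.out.println(fold(b) == Long.hashCode(b)); // true
    }
}
```

Colliding hash codes do not break the set. They only mean that both entries land in the same bucket and are told apart with equals.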