
What is the overhead of using java.util.List for a single element list?

I have an in-memory key-value store (probably up to 1GB in size) that maps a String to a String . So far it's been implemented as Map<String, String> .

However, there is a rare case in which I'll need to map to a list of Strings, so I'll need to change that to Map<String, List<String>> .

Since this is not a common case (probably less than 1%), I'm debating whether to separate these use cases into two different maps.

Does anyone know what overhead (memory footprint and CPU) I should expect from having all lists in the map hold only one element, versus storing String objects directly?

Thanks!

Possibilities (in order of increasing memory footprint):

Map<String, String> map = new HashMap<>(); // Concatenated string values
List<String> get(String key) {
    return Arrays.asList(map.getOrDefault(key, "").split("\f"));
}

Map<String, String[]> map = new HashMap<>();
private static final String[] EMPTY = new String[0];
List<String> get(String key) {
    return Arrays.asList(map.getOrDefault(key, EMPTY));
}

Map<String, List<String>> map = new HashMap<>(); // values are List instances, e.g. LinkedList
List<String> get(String key) {
    return map.get(key);
}

(Just sample code. I did not deal well with empty strings.)
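To illustrate the empty-string caveat for the concatenated variant, here is a minimal sketch. The class name ConcatenatedStore and its add method are hypothetical, and it assumes that values never contain the form-feed separator. Passing a limit of -1 to split keeps trailing empty strings, and a missing key is distinguished from a stored empty string.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store: all values of a key are joined into one String. */
class ConcatenatedStore {
    private static final String SEP = "\f"; // assumed never to occur inside a value
    private final Map<String, String> map = new HashMap<>();

    /** Appends a value to the key's concatenated string. */
    void add(String key, String value) {
        if (value.contains(SEP)) {
            throw new IllegalArgumentException("value must not contain the separator");
        }
        map.merge(key, value, (old, added) -> old + SEP + added);
    }

    /** Returns all values for the key, or an empty list if the key is absent. */
    List<String> get(String key) {
        String joined = map.get(key);
        if (joined == null) {
            return Collections.emptyList();
        }
        // Limit -1 keeps trailing empty strings that the default split would drop.
        return Arrays.asList(joined.split(SEP, -1));
    }
}
```

With this sketch, a key that was never inserted yields an empty list, while a key holding a single empty string yields a list with one empty element.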

As said by others, measure space and speed . Also consider whether Set<String> is a better-suited data structure than List . Consider Collections.singletonList("...") and Collections.emptyList() .
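A minimal sketch of the singletonList idea, assuming a hypothetical wrapper class CompactListStore: the common single-value case is stored as an immutable singleton list, and a growable ArrayList is only allocated once a second value arrives for the same key.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store that keeps single values in compact immutable lists. */
class CompactListStore {
    private final Map<String, List<String>> map = new HashMap<>();

    /** Adds a value; switches to an ArrayList only when a second value arrives. */
    void add(String key, String value) {
        List<String> current = map.get(key);
        if (current == null) {
            // Common case: one small immutable wrapper per value.
            map.put(key, Collections.singletonList(value));
        } else if (current.size() == 1) {
            // The singleton list is immutable, so replace it with a growable list.
            List<String> grown = new ArrayList<>(2);
            grown.add(current.get(0));
            grown.add(value);
            map.put(key, grown);
        } else {
            // Already an ArrayList (it always holds at least two elements here).
            current.add(value);
        }
    }

    /** Returns the values for the key, or a shared empty list if absent. */
    List<String> get(String key) {
        return map.getOrDefault(key, Collections.emptyList());
    }
}
```

The returned singleton and empty lists are immutable, so callers should treat the result of get as read-only.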

If the strings are mostly Latin-1, note that Java 9 and later store such strings in more compact byte arrays (unlike Java 8).

With large strings you could compress to byte[] using a GZIPOutputStream .
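As a rough sketch of the compression idea, assuming UTF-8 encoding and a hypothetical helper class StringCompressor, values could be stored as byte[] and decompressed on access. Whether this pays off depends on string length and content, since GZIP adds a fixed header and trailer to every value.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper that GZIP-compresses strings to byte arrays and back. */
class StringCompressor {

    static byte[] compress(String value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Closing the GZIP stream writes the trailer, so read the buffer afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    static String decompress(byte[] compressed) {
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For short strings the compressed form can be larger than the original, so it may be worth compressing only values above some measured size threshold.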

And the final alternative, if you exhaust java -Xmx and physical memory: use a database .

As others already suggested, you'll get a definite answer (for a given machine / JVM combination) only by measuring. But it's possible to predict at least some results.

Adding to Joop's suggestions, I can imagine a few different approaches:

  • Use the straightforward Map<String, List<String>> , using ArrayList or a similar general-purpose List, then you get one additional (rather fat) wrapper object including a string array (maybe 128 bytes) per map entry. Implementation out-of-the-box, but wastes quite some memory.

  • Use a Map<String, List<String>> , and make sure to wrap the single-string values in Collections.singletonList() or a similar compact construct. Then you get one additional wrapper object (16 to 32 bytes) per single string. Smaller overhead, but needs special treatment when inserting single strings.

  • Use two maps, one Map<String, String> for the single strings and one Map<String, List<String>> for the multi-string case. Virtually no overhead, but needs special treatment both when inserting entries as well as when querying / iterating the map.

  • Joop's concatenated-strings solution collapses two or more String instances into one longer String , thus eliminating their individual overheads. This even results in a "negative" overhead, but needs special treatment both when inserting entries as well as when querying / iterating the map. The String splitting will consume a tiny bit of extra runtime when retrieving entries, even for the single-string case. [Although String.split() is based on regular expressions, which are very slow in the general case, Joop's solution matches a "fast path" in the String.split() implementation - kudos to Joop!]
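The two-map approach from the list above could look roughly like the following sketch. The class TwoMapStore and its method names are hypothetical, and it assumes that null values are never stored. A key lives in exactly one of the two maps and is moved to the multi-value map when its second value arrives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical store: single values and multi-value lists live in separate maps. */
class TwoMapStore {
    private final Map<String, String> singles = new HashMap<>();
    private final Map<String, List<String>> multis = new HashMap<>();

    /** Adds a value, moving the key to the multi-value map on its second value. */
    void add(String key, String value) {
        List<String> list = multis.get(key);
        if (list != null) {
            list.add(value);
            return;
        }
        String previous = singles.remove(key);
        if (previous == null) {
            // First value for this key: no wrapper object at all.
            singles.put(key, value);
        } else {
            // Second value: promote the key to the rare multi-value map.
            list = new ArrayList<>(2);
            list.add(previous);
            list.add(value);
            multis.put(key, list);
        }
    }

    /** Looks in the common single-value map first, then in the multi-value map. */
    List<String> get(String key) {
        String single = singles.get(key);
        if (single != null) {
            // The wrapper is short-lived and only created on lookup.
            return Collections.singletonList(single);
        }
        return multis.getOrDefault(key, Collections.emptyList());
    }
}
```

This keeps the special treatment inside one class, so callers see a single List-returning interface while the stored data for the common case stays a plain Map<String, String>.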

Now, the choice is yours.
