Java Multimap<String,String> with Trove

Question

I have a pretty large Guava Multimap<String,String> and am looking into ways to reduce its memory usage. In all of the examples I can find, people are doing something like:

Multimaps.newSetMultimap(
    TDecorators.wrap(new TIntObjectHashMap<Collection<Integer>>()),
    new Supplier<Set<Integer>>() {
        public Set<Integer> get() {
            return TDecorators.wrap(new TIntHashSet());
        }
    });

This works for a Multimap<Integer,Integer>. Is it possible to use Trove to back a Multimap<String,String>?

In case anyone is interested in the future: I went with http://code.google.com/p/jdbm2/ to write the hash map to the filesystem.

Answer 1

Guava's Multimaps are backed by standard JDK collections, which aren't optimized for memory usage. For example, ArrayListMultimap<K, V> is backed by HashMap<K, ArrayList<V>> and HashMultimap<K, V> is backed by HashMap<K, HashSet<V>>.

Eclipse Collections (formerly GS Collections) has Multimaps backed by its own container types, UnifiedMap and UnifiedSet. UnifiedMap uses half the memory of HashMap, and UnifiedSet uses a quarter of the memory of HashSet. The memory benefits you'll see will depend on whether you use a FastListMultimap or a UnifiedSetMultimap.

More detailed memory comparisons are available here.

Note: I am a committer for Eclipse Collections.

Answer 2

You could look at memory-efficient variants of hash maps, such as this one: https://code.google.com/p/sparsehash/

If your value strings are long enough, compression could be an option. You could also look into disk-backed solutions such as Ehcache, depending on your access statistics.
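To make the compression idea concrete, here is a minimal sketch using only the JDK's java.util.zip classes. The class name CompressedValues and its two helpers are hypothetical names chosen for illustration, and gzip is just one possible codec; whether it pays off depends on how long and how repetitive your values are.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper: stores long string values as gzip-compressed byte arrays. */
final class CompressedValues {
    private CompressedValues() {}

    /** Encodes the value as UTF-8 and gzips it. */
    static byte[] compress(String value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    /** Reverses compress(): gunzips the bytes and decodes them as UTF-8. */
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200; i++) {
            sb.append("memory-efficient multimap value ");
        }
        String original = sb.toString();
        byte[] packed = compress(original);
        System.out.println(original.length() + " chars -> " + packed.length + " bytes");
        System.out.println("round trip ok: " + decompress(packed).equals(original));
    }
}
```

Note that byte[] does not implement value-based equals or hashCode, so compressed values fit a list-style bucket better than a HashSet. Short strings can also come out larger than they went in, because of the fixed gzip header and trailer, so this is only worth trying on long values.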

Answer 3

Trove4j doesn't contain a hash map for String-to-String.

See http://trove4j.sourceforge.net/javadocs/gnu/trove/map/hash/package-summary.html

Answer 4

An approach I use is a Map<String,Collection<String>> where the values start out as ArrayList<String> and get promoted to HashSet<String> when the bucket hits some threshold, say 32 elements.

I have found this saves a lot of memory for small buckets.
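A rough sketch of that promotion scheme, using only JDK collections, might look like the following. The class name PromotingMultimap, the isPromoted helper, and the choice to reject duplicate values even while a bucket is still a list are assumptions made for illustration, not details taken from the answer above; the threshold of 32 is the example figure it mentions.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

/** Hypothetical sketch: small buckets stay ArrayLists, large ones become HashSets. */
final class PromotingMultimap {
    // Assumed threshold, taken from the "say 32 elements" figure above.
    static final int THRESHOLD = 32;

    private final Map<String, Collection<String>> map = new HashMap<>();

    /** Adds a key/value pair; returns false if the pair was already present. */
    boolean put(String key, String value) {
        Collection<String> bucket = map.get(key);
        if (bucket == null) {
            bucket = new ArrayList<>(1);
            map.put(key, bucket);
        }
        // Linear scan on small lists, hash lookup on promoted sets.
        if (bucket.contains(value)) {
            return false;
        }
        bucket.add(value);
        // Promote the list to a set once it reaches the threshold.
        if (bucket instanceof ArrayList && bucket.size() >= THRESHOLD) {
            map.put(key, new HashSet<>(bucket));
        }
        return true;
    }

    /** Returns a read-only view of the bucket, or an empty collection. */
    Collection<String> get(String key) {
        Collection<String> bucket = map.get(key);
        return bucket == null
                ? Collections.<String>emptyList()
                : Collections.unmodifiableCollection(bucket);
    }

    /** True once the bucket for this key has been converted to a HashSet. */
    boolean isPromoted(String key) {
        return map.get(key) instanceof HashSet;
    }

    public static void main(String[] args) {
        PromotingMultimap m = new PromotingMultimap();
        for (int i = 0; i < 40; i++) {
            m.put("big", "v" + i);
        }
        m.put("small", "only");
        System.out.println("big promoted: " + m.isPromoted("big"));
        System.out.println("small promoted: " + m.isPromoted("small"));
    }
}
```

The saving comes from the fact that an ArrayList holds its elements in one backing array, whereas a HashSet allocates an entry object per element on top of its table. The cost is a linear scan on lookups, which stays cheap as long as the threshold is small.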

4 answers

Answer 1: score 6, posted 2013-09-03 17:00:44
Answer 2: score 3, posted 2013-03-22 20:38:11
Answer 3: score 0, accepted, posted 2013-03-22 20:27:48
Answer 4: score 0, posted 2013-06-26 11:23:41
