Java Multimap<String,String> with Trove

I have a pretty large Google Guava Multimap<String,String> and was looking into ways to reduce its memory usage. In all of the examples I can find, people are doing something like:

Multimaps.newSetMultimap(
    TDecorators.wrap(new TIntObjectHashMap<Collection<Integer>>()),
    new Supplier<Set<Integer>>() {
        public Set<Integer> get() {
            return TDecorators.wrap(new TIntHashSet());
        }
    });

which works for a Multimap<Integer,Integer>. Is it possible to use Trove to wrap a Multimap<String,String>?

In case anyone is interested in the future, I went with http://code.google.com/p/jdbm2/ to write the hash map to the filesystem.

Guava's Multimaps are backed by standard JDK collections, which aren't optimized for memory usage. For example, ArrayListMultimap<K, V> is backed by HashMap<K, ArrayList<V>> and HashMultimap<K, V> is backed by HashMap<K, HashSet<V>>.

Eclipse Collections (formerly GS Collections) has Multimaps backed by its own container types, UnifiedMap and UnifiedSet. UnifiedMap uses half the memory of HashMap, and UnifiedSet uses a quarter of the memory of HashSet. The memory benefits you'll see will depend on whether you use a FastListMultimap or a UnifiedSetMultimap.

More detailed memory comparisons are available here.

Note: I am a committer for Eclipse Collections.

You could look at a memory-efficient variant of hash maps, such as this one: https://code.google.com/p/sparsehash/

If your value strings are long enough, compression could be an option. You could also look into disk-backed solutions such as Ehcache, depending on your access patterns.
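As a rough illustration of the compression idea, here is a minimal sketch that stores a value as GZIP-compressed UTF-8 bytes using only the JDK. The class name StringCompressor and its two methods are hypothetical helpers made up for this example, and whether compression pays off depends on how long and how repetitive your strings actually are (short strings usually get bigger, not smaller).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper for illustration; not part of Guava, Trove, or Ehcache.
final class StringCompressor {

    private StringCompressor() {
    }

    // Encodes the string as UTF-8 and GZIP-compresses the bytes.
    static byte[] compress(String value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(value.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The GZIP stream is closed at this point, so the trailer has been written.
        return buffer.toByteArray();
    }

    // Reverses compress(): inflates the bytes and decodes them as UTF-8.
    static String decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] chunk = new byte[4096];
            int read;
            while ((read = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Keep in mind that byte[] uses identity-based equals and hashCode, so compressed values would need a wrapper with content-based equality before they could go into a set-based multimap.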

Trove4j doesn't contain a hash map for String-to-String.

See http://trove4j.sourceforge.net/javadocs/gnu/trove/map/hash/package-summary.html

An approach I use is a Map<String,Collection<String>> where the values start out as ArrayList<String> and get promoted to HashSet<String> when the bucket hits some threshold, say 32 elements.

I have found this saves a lot of memory for small buckets.
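A minimal sketch of that promotion scheme, using only JDK collections, might look like the following. The class name PromotingMultimap, its methods, and the choice to reject duplicates while the bucket is still a list are all assumptions made for this example; the approach described above does not say how duplicates are handled.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;

// Hypothetical helper for illustration; not part of Guava or Trove.
final class PromotingMultimap {

    private final Map<String, Collection<String>> buckets = new HashMap<>();
    private final int threshold;

    // threshold: bucket size at which an ArrayList is swapped for a HashSet.
    PromotingMultimap(int threshold) {
        this.threshold = threshold;
    }

    // Adds a value with set semantics (an assumption of this sketch).
    // Returns false if the key already had this value.
    boolean put(String key, String value) {
        Collection<String> bucket = buckets.get(key);
        if (bucket == null) {
            // Start tiny: most buckets are expected to stay small.
            bucket = new ArrayList<>(1);
            buckets.put(key, bucket);
        }
        if (bucket instanceof HashSet) {
            return bucket.add(value);
        }
        // Still a list: a linear scan is cheap below the threshold.
        if (bucket.contains(value)) {
            return false;
        }
        bucket.add(value);
        if (bucket.size() >= threshold) {
            // Promote so lookups stay fast once the bucket grows.
            buckets.put(key, new HashSet<>(bucket));
        }
        return true;
    }

    // Returns a read-only view of the values for a key (empty if absent).
    Collection<String> get(String key) {
        Collection<String> bucket = buckets.get(key);
        if (bucket == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableCollection(bucket);
    }

    // True once the bucket for this key has been promoted to a HashSet.
    boolean isPromoted(String key) {
        return buckets.get(key) instanceof HashSet;
    }
}
```

Something like new PromotingMultimap(32) would match the threshold mentioned above. The saving comes from small buckets avoiding the per-entry overhead of a HashSet; the price is a linear scan on insert until the bucket is promoted.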
