
Difference between two instances of HashMap<String, String> and an instance of HashMap<String,Object> where Object contains the two Strings

For simplicity's sake, let's say I have two instances of HashMap<String, String> that share the same keys. What I want to know is whether there is a performance or memory difference between that and putting the two String values in an Object and storing those in a single HashMap<String, Object>.

My actual problem uses one instance of HashMap<String, HashSet<String>> and two instances of HashMap<String, Double>. I was hoping that consolidating them would save memory somehow, but I wasn't sure whether there would be a performance impact from using a self-defined Object as the value instead of a built-in class like HashSet or Double.

The hashes are calculated from the String keys, so there won't be a speed impact. The space impact (a very slight increase) will be negligible in the long run. If it makes the code more legible, do it and screw the performance (as long as we're not talking about huge performance bottlenecks, it doesn't matter).

Speed

For the speed impact, remember that in a HashMap<String, ?>, the String is what gets hashed. You may actually see a small increase in speed, since you'll only have to do one lookup to find your custom object instead of 3 lookups, one per map.
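A minimal sketch of the lookup difference, using the value types from the question. The names Stats, tags, score, and weight are hypothetical, invented here for illustration, and both methods assume the key is present in every map.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical value class bundling the three per-key values from the question.
final class Stats {
    final Set<String> tags = new HashSet<>();
    double score;
    double weight;
}

class LookupDemo {
    // Three separate maps: each get() is its own table lookup.
    // Assumes the key exists in all three maps (unboxing null would throw).
    static double separate(Map<String, Set<String>> tags,
                           Map<String, Double> score,
                           Map<String, Double> weight,
                           String key) {
        Set<String> t = tags.get(key);   // lookup 1
        double s = score.get(key);       // lookup 2
        double w = weight.get(key);      // lookup 3
        return t.size() * s * w;
    }

    // One combined map: a single get(), then plain field reads.
    static double combined(Map<String, Stats> byKey, String key) {
        Stats st = byKey.get(key);       // the only lookup
        return st.tags.size() * st.score * st.weight;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> tags = new HashMap<>();
        Map<String, Double> score = new HashMap<>();
        Map<String, Double> weight = new HashMap<>();
        Set<String> t = new HashSet<>();
        t.add("x");
        t.add("y");
        tags.put("k", t);
        score.put("k", 1.5);
        weight.put("k", 2.0);

        Stats st = new Stats();
        st.tags.add("x");
        st.tags.add("y");
        st.score = 1.5;
        st.weight = 2.0;
        Map<String, Stats> byKey = new HashMap<>();
        byKey.put("k", st);

        // Both calls compute the same value from the same data.
        System.out.println(separate(tags, score, weight, "k"));
        System.out.println(combined(byKey, "k"));
    }
}
```

Both calls return the same number; the only difference is how many times the key is looked up.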

Space

For the space impact, remember that HashMap uses an inner array whose size is a power of 2. If you're using a vanilla HashMap with no special settings like a custom load factor, then you may see a slight space increase, because right now you have (roughly; this is simplified):

HashSet<String>[]
Double[]
Double[]

And after combining them, you'll have

CustomObject[]
    HashSet<String>
    Double
    Double

This ignores constant-size information that doesn't grow with the map. Objects take up more space than just the references to their fields, but not very much.

Legibility

The custom object option wins this one hands-down. It's way cleaner and is very OOP, fitting into Java very, very well. Regardless of the performance, you should do this anyway. It'll look better in the long run and be more maintainable.

For example, if you wanted to add fields to the custom object, that's easy. But having separate maps means creating another map for every new variable, which is dirty. I say go the OOP way.

If you want to combine them, create a class to represent them and have a map of those:

public class Stuff {
    String a;
    String b;
    // other fields - maybe the double you mentioned
}

HashMap<String, Stuff> map;

That will definitely save memory because you'll have fewer map entries.

In any case, it's the right way to do it. Design time is not the time to worry about tiny performance and memory impacts. Make your code easy to read and use and your life (and code) will be better.
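The class-plus-map suggestion above could be fleshed out roughly as follows for the value types in the question. This is only a sketch: the field names members, first, and second and the StuffRegistry wrapper are hypothetical, since the question gives only the value types.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical field names; the question only specifies the types.
class Stuff {
    final Set<String> members = new HashSet<>();
    double first;
    double second;
}

// Hypothetical wrapper around the single combined map.
class StuffRegistry {
    private final Map<String, Stuff> map = new HashMap<>();

    // Fetch the entry for a key, creating an empty one on first use.
    Stuff entry(String key) {
        return map.computeIfAbsent(key, k -> new Stuff());
    }

    int size() {
        return map.size();
    }

    public static void main(String[] args) {
        StuffRegistry reg = new StuffRegistry();
        reg.entry("alice").members.add("x");
        reg.entry("alice").first = 0.25;
        reg.entry("alice").second = 4.0;
        reg.entry("bob").members.add("y");
        // Two distinct keys, so one map holds two entries in total.
        System.out.println(reg.size());
    }
}
```

Adding another per-key value later means adding one field to Stuff rather than introducing and populating another map.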

Unless you are dealing with a very large number of entries, the difference (if any) is likely to be unimportant. It will also be intimately tied to the platform (JVM version, Java library version, etc.), so the only useful answer will come from a profiler run against each way of doing it.
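As a starting point before reaching for a real profiler, a crude timing loop like the one below can show whether the difference is even visible on your platform. This is not a profiler and not a proper benchmark (a harness such as JMH would be far more trustworthy); the Pair class, the key format, and the entry count are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class TimingSketch {
    // Hypothetical holder for the two Double values from the question.
    static final class Pair {
        final double a;
        final double b;
        Pair(double a, double b) { this.a = a; this.b = b; }
    }

    // Two lookups per key, one in each map.
    static double sumSeparate(Map<String, Double> m1, Map<String, Double> m2, String[] keys) {
        double sum = 0;
        for (String k : keys) {
            sum += m1.get(k) + m2.get(k);
        }
        return sum;
    }

    // One lookup per key, then two field reads.
    static double sumCombined(Map<String, Pair> m, String[] keys) {
        double sum = 0;
        for (String k : keys) {
            Pair p = m.get(k);
            sum += p.a + p.b;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 200_000; // arbitrary size, chosen only for the sketch
        String[] keys = new String[n];
        Map<String, Double> m1 = new HashMap<>();
        Map<String, Double> m2 = new HashMap<>();
        Map<String, Pair> m = new HashMap<>();
        for (int i = 0; i < n; i++) {
            keys[i] = "key" + i;
            m1.put(keys[i], (double) i);
            m2.put(keys[i], 2.0 * i);
            m.put(keys[i], new Pair(i, 2.0 * i));
        }
        // Repeat a few rounds so the JIT can warm up; early rounds are noise.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            double s1 = sumSeparate(m1, m2, keys);
            long t1 = System.nanoTime();
            double s2 = sumCombined(m, keys);
            long t2 = System.nanoTime();
            System.out.printf("separate=%d us, combined=%d us, same result=%b%n",
                    (t1 - t0) / 1000, (t2 - t1) / 1000, s1 == s2);
        }
    }
}
```

The timings will vary from run to run and from JVM to JVM, which is exactly the point made above: measure on the platform you actually deploy to.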

You might consider looking at the Guava Multimap. It might be a cleaner solution to your problem. Failing that, I would use the custom object. Code clarity wins against premature optimization every time.

You're probably better off in many ways by combining the tables. Speedwise, you only have to compute the hash and walk the hash chains once instead of twice. For space, you need only one hash table instead of two (saving about 8 bytes per entry if your load factor is around 50%) and one hash chain instead of two (saving a hash chain object at about 16 bytes per entry). You pay the cost of your pair object, but that's at most 16 bytes.
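Putting the rough figures above together gives a back-of-the-envelope estimate of the net saving when two maps are merged into one. The constants below are just the approximate numbers quoted in this answer, not measured values; actual object sizes depend on the JVM and its settings. The class and method names are hypothetical.

```java
class SpaceEstimate {
    // Approximate per-entry figures taken from the answer above (assumptions, not measurements).
    static final int TABLE_SLOT_SAVED = 8;   // one hash table instead of two
    static final int CHAIN_NODE_SAVED = 16;  // one hash chain object instead of two
    static final int PAIR_OBJECT_COST = 16;  // upper bound quoted for the pair object

    // Net bytes saved for a given number of entries under those figures.
    static long netBytesSaved(long entries) {
        return entries * (TABLE_SLOT_SAVED + CHAIN_NODE_SAVED - PAIR_OBJECT_COST);
    }

    public static void main(String[] args) {
        // 8 + 16 - 16 = 8 bytes per entry, so a million entries saves about 8,000,000 bytes.
        System.out.println(netBytesSaved(1_000_000));
    }
}
```

Since the pair object costs at most 16 bytes, 8 bytes per entry is a lower bound on the saving under these assumptions.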
