
Which data structure for this hashmap scenario?

I have a scenario where I store values in a hashmap.

Keys are strings like

fruits
fruits_citrus_orange
fruits_citrus_lemon
fruits_fleshly_apple
fruits_fleshly
fruits_dry

and so on.

Values are some objects. Now, for a given input, say fruits_fleshly, I need to retrieve all entries whose key starts with "fruits_fleshly". In the above case I need to fetch

fruits_fleshly_apple
fruits_fleshly

One way to do this is by calling String.indexOf on all the keys. Is there any more efficient way to do this, instead of iterating over all the keys in the map?

Iterating over the map seems a quite simple and straightforward way of doing this. However, since you don't want to iterate over the keys yourself, you can use Guava's Maps#filterEntries, if you are OK with using a third-party library.

Here's how it would work:

// Guava's Maps.filterEntries returns a filtered view of yourMap
Map<String, Object> filtered = Maps.filterEntries(
                   yourMap,
                   entry -> entry.getKey().startsWith("fruits_fleshly"));

But that would also iterate over the map behind the scenes. So the iteration is still there, if you are concerned about efficiency.

Though these are strings, to me they look like categories and sub-categories, such as fruit, fruit-fleshly, fruit-citrus, etc.

If that is the case, you can instead implement a tree data structure. This would be the most effective for search operations.

Since a tree has a parent-child structure, there is a root node and child nodes. You can have a structure like this:

(0)   (1)        (2)
fruit
|_____citrus
|          |_____lemon
|          |_____orange
|
|_____fleshly
           |_____apple
           |_____

In this structure, if you want to search for citrus fruits, you can just go to citrus and list all its children. Finally, you can construct the full name by concatenating the names along the path from the root to the leaf.
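A minimal sketch of such a tree, assuming keys are split on "_" into one node per segment. The class name CategoryNode and its methods are made up for illustration, and matching is per whole segment (so "fruits_fl" would find nothing):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical node type: one node per "_"-separated segment of a key.
class CategoryNode {
    final Map<String, CategoryNode> children = new TreeMap<>();
    Object value;   // payload stored at this node, if any
    boolean isKey;  // true when a complete key ends at this node

    // Inserts a key such as "fruits_citrus_orange", creating nodes as needed.
    void put(String key, Object value) {
        CategoryNode node = this;
        for (String part : key.split("_")) {
            node = node.children.computeIfAbsent(part, p -> new CategoryNode());
        }
        node.value = value;
        node.isKey = true;
    }

    // Lists every stored key at or below the node reached by the prefix.
    List<String> keysUnder(String prefix) {
        CategoryNode node = this;
        for (String part : prefix.split("_")) {
            node = node.children.get(part);
            if (node == null) {
                return new ArrayList<>();
            }
        }
        List<String> out = new ArrayList<>();
        node.collect(prefix, out);
        return out;
    }

    // Depth-first walk that rebuilds full names from the path.
    private void collect(String path, List<String> out) {
        if (isKey) {
            out.add(path);
        }
        for (Map.Entry<String, CategoryNode> e : children.entrySet()) {
            e.getValue().collect(path + "_" + e.getKey(), out);
        }
    }

    public static void main(String[] args) {
        CategoryNode root = new CategoryNode();
        String[] keys = {"fruits", "fruits_citrus_orange", "fruits_citrus_lemon",
                         "fruits_fleshly_apple", "fruits_fleshly", "fruits_dry"};
        for (String k : keys) {
            root.put(k, k.length()); // placeholder values
        }
        // prints [fruits_fleshly, fruits_fleshly_apple]
        System.out.println(root.keysUnder("fruits_fleshly"));
    }
}
```

The lookup walks one node per segment of the search key and then visits only the matching subtree, so unrelated keys are never touched.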

Since HashMap doesn't maintain any order for its keys, it's not a very good choice for this problem. A better choice is TreeMap: it has methods for retrieving a sub-map for a range of keys. These methods locate the range in O(log n) time (n being the number of entries), so it's better than iterating over all the keys.

// myMap must be a TreeMap (or another NavigableMap) with String keys
NavigableMap<String, Object> subMap = myMap.subMap("fruits_fleshly", true, "fruits_fleshly\uffff", true);
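For reference, a self-contained sketch of the same idea using only the JDK. The helper name withPrefix is made up here, and it assumes no key contains the character '\uffff', which serves as the upper bound of the range:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class TreeMapPrefix {
    // Returns a live view of all entries whose key starts with the prefix.
    // Assumption: keys never contain '\uffff', the upper bound used below.
    static <V> NavigableMap<String, V> withPrefix(NavigableMap<String, V> map, String prefix) {
        return map.subMap(prefix, true, prefix + '\uffff', true);
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> map = new TreeMap<>();
        String[] keys = {"fruits", "fruits_citrus_orange", "fruits_citrus_lemon",
                         "fruits_fleshly_apple", "fruits_fleshly", "fruits_dry"};
        for (String k : keys) {
            map.put(k, k.length()); // placeholder values
        }
        // prints [fruits_fleshly, fruits_fleshly_apple]
        System.out.println(withPrefix(map, "fruits_fleshly").keySet());
    }
}
```

Because the result is a view backed by the original map, later changes to the map show up in it. Copy it into a new TreeMap if a snapshot is needed.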

The nature of a hashmap means that there's no way to do a "like" comparison on keys: you have to iterate over them all to find where key.startsWith(input).

I suppose you could nest hashmaps and split up your keys. E.g.,
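As a baseline, the full scan could look like the sketch below. The class and method names are illustrative only, and the cost grows with the total number of keys regardless of how many match:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

class PrefixScan {
    // Linear scan: tests every key in the map against the prefix.
    static <V> Map<String, V> withPrefix(Map<String, V> map, String prefix) {
        Map<String, V> matches = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : map.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                matches.put(e.getKey(), e.getValue());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        String[] keys = {"fruits", "fruits_citrus_orange", "fruits_citrus_lemon",
                         "fruits_fleshly_apple", "fruits_fleshly", "fruits_dry"};
        for (String k : keys) {
            map.put(k, k.length()); // placeholder values
        }
        // Prints the two matching keys; HashMap does not guarantee their order.
        System.out.println(withPrefix(map, "fruits_fleshly").keySet());
    }
}
```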

{
  "fruits":{
    "citrus":{
      "orange":(value), 
      "lemon":(value)
    }, 
    "fleshly":{
      "apple":(value), 
      "":(value)
    }
  }
}

...etc.

The performance implications are probably horrific on a small scale, though that may not matter in a homework context, and they may not be so bad if you're dealing with a lot of data and only a couple of layers of nesting.

Alternatively, create a Category object with a List of Categories (sub-categories) and a List of entries.

I believe a Radix Trie is what you are looking for. It is a similar idea to @ay89's solution.

You can just use this open source library (Radix Trie example). It performs better than O(log(N)). With a decent implementation of Radix Trie, you will be able to find the hashmap assigned to a key in constant time on average with respect to the number of entries (the cost depends on the number of underscores in your search key string).

fruits
fruits_citrus_orange
fruits_citrus_lemon
fruits_fleshly_apple
fruits_fleshly
fruits_dry

Trie<String, Map> trie = new PatriciaTrie<>();
trie.put("fruits", hashmap1);
trie.put("fruits_citrus_orange", hashmap2);
trie.put("fruits_citrus_lemon", hashmap3);
trie.put("fruits_fleshly_apple", hashmap4);
trie.put("fruits_fleshly", hashmap5);

Map.Entry<String, Map> entry = trie.select("fruits_fleshly");

If you just want one hashmap to be returned by select, you might be able to get slightly better performance if you implement your own Radix Trie.


 