
Creating an ordered HashMap sorted by create/update time in Java

I am working with data tuples in the following format: [IP, number-of-bytes-served, time]. I created a HashMap keyed on IP to count the number of bytes served for each IP. Then I realised that I need to remove some least recently used key-value pairs to free up space. I want to set a time limit, say 1 hour, and remove the key-value pairs that have seen no activity in that period. So I need to save the update time for each pair. In fact, for good performance, keeping the pairs sorted by timestamp seems reasonable.

Thus, what I would like to do is maintain a sorted list based on the creation or update time of the key-value pairs. I need to know these creation and update times explicitly. I came up with two different ideas, but I am not exactly sure which one to use or how. Here are my two ideas:

  • I need a LinkedList whose head points to the timestamp of the most recently updated key-value pair, and have the key-value pairs point to the list nodes.
  • I need to maintain the HashMap in sorted order based on creation/update time. Maybe I need to change the value from an integer to an Object holding the integer value and a long timestamp.

And the question is: how do I implement these in Java for efficient add/delete/get performance? Or which libraries can I use to get a HashMap sorted by creation/update time?

Offering my two cents... I recently did something similar (I didn't know about LinkedHashMap at the time) where I needed to keep track of some session info.

I ended up using a ConcurrentHashMap, because multiple user sessions can be active at a time, and running a cleanup every 30 minutes to clear stale session data. My thought process was that the app needs snappy lookups when dealing with session data, so I kept the session id as the key. When I need to clear the oldest data, I just grab a list of the values and sort it (provided that the class implements Comparable, or you write a Comparator for it), because this is not done "that often".
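A minimal sketch of that approach might look like the following. The SessionInfo and SessionStore classes, their fields, and the explicit nowMillis parameters are all assumptions made for illustration; the original answer does not show its code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical immutable session record (names are assumptions).
final class SessionInfo {
    final String sessionId;
    final long lastAccessMillis;

    SessionInfo(String sessionId, long lastAccessMillis) {
        this.sessionId = sessionId;
        this.lastAccessMillis = lastAccessMillis;
    }
}

final class SessionStore {
    // Keyed by session id so lookups on the hot path stay cheap.
    private final Map<String, SessionInfo> sessions = new ConcurrentHashMap<>();

    // Create the session, or replace it with a record carrying a fresh timestamp.
    void touch(String sessionId, long nowMillis) {
        sessions.put(sessionId, new SessionInfo(sessionId, nowMillis));
    }

    // Copy the values and sort the copy, oldest first. This is O(n log n),
    // which is acceptable only because it runs rarely.
    List<SessionInfo> oldestFirst() {
        List<SessionInfo> copy = new ArrayList<>(sessions.values());
        copy.sort(Comparator.comparingLong((SessionInfo s) -> s.lastAccessMillis));
        return copy;
    }

    // Drop every session that has been idle for longer than maxIdleMillis.
    void removeStale(long nowMillis, long maxIdleMillis) {
        sessions.values().removeIf(s -> nowMillis - s.lastAccessMillis > maxIdleMillis);
    }

    int size() {
        return sessions.size();
    }
}
```

The periodic cleanup could be driven by a ScheduledExecutorService calling removeStale at a fixed rate (every 30 minutes in the answer); that wiring is left out here to keep the sketch short.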

Hope that helps.

PS. I'm curious how this compares to the LinkedHashMap implementation?

This is a case where LinkedHashMap is appropriate, overriding the removeEldestEntry() method. The key of this map will be the IP address; the value will be a tuple of (bytes, last_update).

First, you need to create your map with "access order": this means that any access to a map entry will move that entry to the end of the list (MRU). Then, do the following when you get a new record:

  • If the map does not contain an entry for the IP, create a new one with the number of bytes.
  • If the map does contain an entry, get the number of bytes from it, and create a new entry by adding the new bytes to the old. Then put() the new entry into the map, replacing the old one.

Your tuple should automatically set the time field to the current time. But the thing to understand is that you don't really care about the time; it's simply an attribute used to remove items from the list.

Override removeEldestEntry(), and return true if the time is outside your bounds.
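The steps above could be sketched roughly as follows. The Usage and ExpiringUsageMap classes and the record() helper are hypothetical names, and the sketch takes the timestamp from the incoming record rather than the system clock so that its behaviour is deterministic; both choices are assumptions, not part of the answer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical value tuple: (bytes, last_update).
final class Usage {
    final long bytes;
    final long lastUpdateMillis;

    Usage(long bytes, long lastUpdateMillis) {
        this.bytes = bytes;
        this.lastUpdateMillis = lastUpdateMillis;
    }
}

final class ExpiringUsageMap extends LinkedHashMap<String, Usage> {
    private final long maxAgeMillis;
    private long nowMillis; // time of the record currently being processed

    ExpiringUsageMap(long maxAgeMillis) {
        super(16, 0.75f, true); // third argument true selects access order
        this.maxAgeMillis = maxAgeMillis;
    }

    // Add the bytes of one [IP, bytes, time] record to the running total.
    void record(String ip, long bytes, long timeMillis) {
        nowMillis = timeMillis;
        Usage old = get(ip); // for an existing key this also moves it to the MRU end
        long total = (old == null) ? bytes : old.bytes + bytes;
        put(ip, new Usage(total, timeMillis));
    }

    // Evict the least recently used entry if it is older than the limit.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, Usage> eldest) {
        return nowMillis - eldest.getValue().lastUpdateMillis > maxAgeMillis;
    }
}
```

One caveat with this design: LinkedHashMap invokes removeEldestEntry() only after put or putAll inserts a new key, and it removes at most one entry per call. Stale entries therefore drain one at a time as new IPs arrive, not all at once when the hour passes.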

Although, to be honest, I think you'll be better off using a size-based eviction strategy (limit your map to a fixed number of entries). A time-based strategy opens you up to a DDoS attack, in which a large number of entries come in at once and exhaust your memory.
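For comparison, a size-based variant is even shorter. This is a sketch only; BoundedLruMap and its maxEntries parameter are illustrative names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Access-ordered map that never holds more than maxEntries entries.
final class BoundedLruMap<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedLruMap(int maxEntries) {
        super(16, 0.75f, true); // true selects access order (LRU first)
        this.maxEntries = maxEntries;
    }

    // Called after a new entry is inserted; dropping the eldest entry
    // whenever the cap is exceeded keeps memory use bounded.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a cap of 2, putting "a" and "b", reading "a", and then putting "c" evicts "b", because the read made "a" the most recently used entry.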

To sort by creation/update time, you'll need to have that time available for the comparison. That means your object will have to know when it was created/updated. This can be accomplished relatively easily by having a version field that is set by default when the object is created and set with a new Date() when the object is updated.

There are structures you can build on (TreeSet and TreeMap, in particular) that implement an order defined by the objects themselves (Comparable interface) or by a Comparator. If you store items that record the date when they were created or updated, you can implement a comparator to aid the sorting process.
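One way this might be put together is a HashMap for lookup by IP plus a TreeSet ordered by timestamp. The Stamped and TimeIndex classes below are hypothetical, and the sketch uses a long timestamp where the answer suggests a Date. The comparator breaks ties on the IP, since a TreeSet treats elements that compare as equal as duplicates.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical immutable record that knows when it was last updated.
final class Stamped {
    final String ip;
    final long bytes;
    final long updatedMillis;

    Stamped(String ip, long bytes, long updatedMillis) {
        this.ip = ip;
        this.bytes = bytes;
        this.updatedMillis = updatedMillis;
    }
}

final class TimeIndex {
    // Order by timestamp, then by IP so two IPs with the same time can coexist.
    private static final Comparator<Stamped> BY_TIME =
            Comparator.comparingLong((Stamped s) -> s.updatedMillis)
                      .thenComparing(s -> s.ip);

    private final Map<String, Stamped> byIp = new HashMap<>();
    private final TreeSet<Stamped> byTime = new TreeSet<>(BY_TIME);

    void record(String ip, long bytes, long timeMillis) {
        Stamped old = byIp.get(ip);
        long total = bytes;
        if (old != null) {
            byTime.remove(old); // remove before re-adding: the sort key changes
            total += old.bytes;
        }
        Stamped fresh = new Stamped(ip, total, timeMillis);
        byIp.put(ip, fresh);
        byTime.add(fresh);
    }

    // Remove every entry last updated before cutoffMillis, oldest first.
    void evictOlderThan(long cutoffMillis) {
        while (!byTime.isEmpty() && byTime.first().updatedMillis < cutoffMillis) {
            byIp.remove(byTime.pollFirst().ip);
        }
    }

    String oldestIp() {
        return byTime.isEmpty() ? null : byTime.first().ip;
    }

    int size() {
        return byIp.size();
    }
}
```

Updates cost O(log n) here, compared with O(1) for an access-ordered LinkedHashMap, so this layout mainly pays off when the explicit time order is needed for more than eviction.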

If you're restricted to LinkedList and HashMap, you'll have to sort the list using, for example, Collections#sort. In the case of the HashMap, you'll have to sort its entry set, but since you can't reorder it in place, you'll have to generate a new sorted map this way.
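As a rough illustration of the HashMap case, the sketch below copies the entry set into a list, sorts it with Collections.sort, and rebuilds the result as a LinkedHashMap so that iteration follows the sorted order. The SortByTimestamp class is a made-up name, and the map values are assumed to be last-update timestamps in milliseconds.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SortByTimestamp {
    // Returns a new map whose iteration order is oldest timestamp first.
    // The source map is left untouched.
    static Map<String, Long> sortedByValue(Map<String, Long> source) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(source.entrySet());
        Collections.sort(entries, Map.Entry.comparingByValue());

        Map<String, Long> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }
}
```

The result is a snapshot: later updates to the source map are not reflected, so the sort has to be repeated whenever a fresh ordering is needed.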

Still, a HashMap is a structure that has nothing to do with ordering, so you'll still have some issues when iterating through it. A LinkedHashMap could solve this, but again, it all depends on your data type restrictions.
