I need an efficient data structure to store a large number (millions) of records on a live server (up to a hundred insertions, deletions, or updates per second).
Its clients need to be able to grab a chunk of that data, sorted, beginning from some point, scroll (i.e. get records before and after the ones they initially got), and receive live updates.
Initially I considered some form of linked ordered set with an index. However, even though the records are unique in the sense that they have an ID, the values of the fields by which the set would be ordered are not. I could resolve collisions by storing more than one record in each node, but that does not seem right.
The other solution I came up with is a linked list with an index, which is kept sorted through insertions, deletions, and updates. Its complexity would be O(n) rather than O(log n), but I'm guessing that if I still have the index, it would speed up the process a lot? Or could I binary search for the place to insert? I do not think I can with a list, though.
What would be the most efficient solution, and which one is best given that I need clients to receive live updates on the state of this data structure?
The code will be in Java.
With millions of records, first estimate whether you can (and want to) hold all the data in RAM.
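As a rough illustration of such an estimate, the sketch below multiplies a record count by an assumed per-record footprint. The class and method names are made up for this example, and the 200-byte payload and 64-byte index overhead are placeholder assumptions, not measurements; real sizes depend on the record fields and the JVM.

```java
class RamEstimate {
    // Estimated heap bytes for `records` entries, each with a payload
    // plus a per-entry index overhead (both figures are assumptions).
    static long estimateBytes(long records, long payloadBytes, long overheadBytes) {
        return records * (payloadBytes + overheadBytes);
    }

    public static void main(String[] args) {
        // Hypothetical workload: 5 million records.
        long bytes = estimateBytes(5_000_000L, 200L, 64L);
        System.out.printf("approx. %.2f GiB%n", bytes / (1024.0 * 1024 * 1024));
    }
}
```

With these placeholder numbers the total is a little over one gigabyte, which would fit in memory on a typical server; measuring real records is the only way to confirm that.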
Have a look at the B-tree.
| Algorithm | Average | Worst case |
|---|---|---|
| Space | O(n) | O(n) |
| Search | O(log n) | O(log n) |
| Insert | O(log n) | O(log n) |
| Delete | O(log n) | O(log n) |
In Java, these kinds of requirements are usually solved by using a `TreeMap` like a database index. The `TreeMap` interface isn't particularly well designed for this, so there are some tricks to it:

- Have your records implement a `Key` interface or base class that just exposes the sort fields and ID. This interface should not extend `Comparable`.
- Store the records in a `TreeMap<Key,Record>`. Remember that every `put` should be of the form `put(record,record)`.
- When you create the `TreeMap`, use the constructor that takes a custom comparator. Pass a comparator that compares `Key`s using the sort fields AND the ID, so that there will be no duplicates.
- To search, you can use any object that implements the `Key` interface; you don't have to use complete records. Because a caller can't provide an ID, though, you can't use `TreeMap.get()` to find a record that matches the sort fields. Use a key with ID=0 and `TreeMap.ceilingEntry` to get the first record with >= key, and then check the sort fields to see if they match.

Note that if you need multiple orderings on different fields, you can make your records implement multiple `Key` interfaces and put them in multiple maps.
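The tricks above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: `Key` is reduced to a single `long` sort field, and `Player`, `Probe`, `RecordIndex`, and all method names are hypothetical. The probe uses `Long.MIN_VALUE` as its ID rather than 0, so it does not rely on IDs being positive. The `pageAfter`/`pageBefore` methods are an assumption about how scrolling might be layered on top, using `tailMap` and `headMap`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical key: one sort field plus the unique ID. Deliberately not Comparable.
interface Key {
    long sortValue();
    long id();
}

// Hypothetical record type; it is its own key, so puts look like put(record, record).
final class Player implements Key {
    final long id;
    final long score;
    final String name;
    Player(long id, long score, String name) { this.id = id; this.score = score; this.name = name; }
    @Override public long sortValue() { return score; }
    @Override public long id() { return id; }
}

// Lightweight search key for lookups that have no complete record.
final class Probe implements Key {
    private final long sortValue;
    private final long id;
    Probe(long sortValue, long id) { this.sortValue = sortValue; this.id = id; }
    @Override public long sortValue() { return sortValue; }
    @Override public long id() { return id; }
}

class RecordIndex {
    // Order by sort field first, then by ID, so equal sort values never collide.
    private static final Comparator<Key> ORDER =
            Comparator.comparingLong(Key::sortValue).thenComparingLong(Key::id);

    private final TreeMap<Key, Player> map = new TreeMap<>(ORDER);

    void put(Player p) { map.put(p, p); }

    void remove(Player p) { map.remove(p); }

    // A change to the sort field must remove the entry under its old key first.
    void update(Player oldVersion, Player newVersion) {
        map.remove(oldVersion);
        map.put(newVersion, newVersion);
    }

    // First record whose sort field equals sortValue, or null if there is none.
    Player findFirst(long sortValue) {
        Map.Entry<Key, Player> e = map.ceilingEntry(new Probe(sortValue, Long.MIN_VALUE));
        return (e != null && e.getKey().sortValue() == sortValue) ? e.getValue() : null;
    }

    // Up to `limit` records strictly after `cursor`, in ascending order.
    List<Player> pageAfter(Key cursor, int limit) {
        return firstN(map.tailMap(cursor, false).values(), limit);
    }

    // Up to `limit` records strictly before `cursor`, nearest first.
    List<Player> pageBefore(Key cursor, int limit) {
        return firstN(map.headMap(cursor, false).descendingMap().values(), limit);
    }

    private static List<Player> firstN(Iterable<Player> source, int limit) {
        List<Player> out = new ArrayList<>();
        for (Player p : source) {
            if (out.size() >= limit) break;
            out.add(p);
        }
        return out;
    }
}
```

A client could call `findFirst` or `pageAfter` with a `Probe` to get its starting chunk, then pass the last (or first) record it holds as the cursor to scroll. Note that `TreeMap` is not thread-safe, so on a live server access would need external synchronization; `ConcurrentSkipListMap` offers the same `NavigableMap` methods and may be worth considering instead.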