Data structure for FIFO behaviour and fast lookup by value

I am looking for a data structure that gives FIFO behaviour but also offers fast lookup by value.

In my current code I have some data duplication. I use a std::unordered_set and a std::queue to get the behaviour I want, but there is probably a better way that I'm not thinking of at the moment. When a new entry comes up, a function adds it to both the set and the queue. To check whether an entry exists in the queue, I call find() on the set. Lastly, a timer is started after each insertion into the queue. After a minute I get the entry at the front of the queue with queue.front(), use this value to erase from the set, and finally pop the queue.

This all works as expected and gives me both the FIFO behaviour and constant-time lookup, but I have data duplication. Is there a data structure (maybe something from Boost?) that does what I want without the duplication?
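For reference, the setup described in the question can be sketched roughly as follows. The class name `ExpiringEntries`, its member names, the `std::string` element type, and the choice to ignore values that are already present are all assumptions made for illustration; the timer itself is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <unordered_set>

// Hypothetical wrapper around the two containers described in the question.
// Every value is stored twice: once in the set and once in the queue.
class ExpiringEntries {
public:
    // Add a new entry to both containers.
    // Assumption: a value that is already present is ignored.
    bool add(const std::string& entry) {
        if (!lookup_.insert(entry).second) return false;
        order_.push(entry);
        return true;
    }

    // Average constant-time membership test via the set.
    bool contains(const std::string& entry) const {
        return lookup_.find(entry) != lookup_.end();
    }

    // Called when the timer fires: drop the oldest entry from both containers.
    void expire_oldest() {
        assert(!order_.empty());
        lookup_.erase(order_.front());
        order_.pop();
    }

    std::size_t size() const { return order_.size(); }

private:
    std::unordered_set<std::string> lookup_;  // fast lookup by value
    std::queue<std::string> order_;           // FIFO order (second copy)
};
```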

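A solution is to use two containers: store the elements in an unordered set for fast lookup, and upon insertion, store a pointer to the element in a queue. (A pointer is safer than an iterator here: a rehash of a std::unordered_set invalidates iterators, but pointers and references to its elements stay valid.) When you pop the queue, erase the corresponding element from the set.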
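A minimal sketch of this two-container idea, in which the set owns the elements and the queue only refers to them. The name `FifoSet` and its interface are hypothetical. The queue holds pointers to the set's elements, which remain valid across rehashes.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <unordered_set>

// Hypothetical FIFO container with hashed lookup; each value is stored once.
template <typename T>
class FifoSet {
public:
    // Insert a value; returns false if it is already present.
    bool push(const T& value) {
        auto result = items_.insert(value);
        if (result.second) order_.push(&*result.first);  // remember its address
        return result.second;
    }

    // Average constant-time lookup by value.
    bool contains(const T& value) const { return items_.count(value) != 0; }

    // Remove and return the oldest element.
    T pop() {
        assert(!order_.empty());
        const T* oldest = order_.front();
        order_.pop();
        auto it = items_.find(*oldest);
        T value = *it;     // copy out before the element is destroyed
        items_.erase(it);
        return value;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::unordered_set<T> items_;  // owns the elements
    std::queue<const T*> order_;   // insertion order, as pointers into items_
};
```

The queue still costs one pointer per element, so this removes the duplicated values rather than the second container.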

A more structured approach is to use a multi-index container. The standard library doesn't provide one, but Boost does (Boost.MultiIndex). More specifically, you could combine a hashed index with a sequenced index.

This answer mostly concerns corner cases of the problem as presented.

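If your problem is a practical one, the elements can be stored in a std::vector, and the queue holds no more than roughly 10-100 elements, then you could just use a plain vector as the queue:

std::vector<T> q; // push_back() to enqueue, erase(q.begin()) to dequeue, std::find() to look up

Note that std::queue<T, std::vector<T> > would not work here: std::queue needs pop_front() from its underlying container, which std::vector lacks, and std::queue offers no way to search its elements. With such a small number of elements (only 10-100), a linear scan over contiguous memory is cheap, and advanced lookup methods are not worth it.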
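A minimal sketch of the small-container idea, using a std::vector directly with a linear std::find for lookup. The class name `SmallFifo` and its interface are hypothetical, and `T` is assumed to be equality-comparable.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical FIFO for a handful of elements: one contiguous buffer,
// linear search for lookup.
template <typename T>
class SmallFifo {
public:
    // Enqueue at the back.
    void push(const T& value) { items_.push_back(value); }

    // Linear scan; cheap when only a few dozen elements are stored.
    bool contains(const T& value) const {
        return std::find(items_.begin(), items_.end(), value) != items_.end();
    }

    // Dequeue from the front. Erasing the first element shifts the rest,
    // which is O(n) but acceptable at this size.
    T pop() {
        assert(!items_.empty());
        T oldest = items_.front();
        items_.erase(items_.begin());
        return oldest;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<T> items_;
};
```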

You then only need to check for duplicates when you pop the queue, not on every insertion. Again, that might or might not be useful depending on your specific case. I can imagine cases where this method is superior, e.g. a web server whose hits go mostly to one or a few pages. Then it might be faster to just add, say, 100,000 elements to the vector and remove the duplicates all in one go when popping.
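One possible reading of "remove the duplicates all in one go" is a single pass that keeps the first occurrence of each value and preserves the arrival order. The sketch below assumes that reading; the function name `dedupe_keep_first` is hypothetical, and `T` is assumed to be hashable with std::hash.

```cpp
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Remove repeated values from v, keeping the first occurrence of each
// and preserving the original (FIFO) order of the survivors.
template <typename T>
void dedupe_keep_first(std::vector<T>& v) {
    std::unordered_set<T> seen;
    std::size_t out = 0;  // next write position for a surviving element
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (seen.insert(v[i]).second) {           // first time this value appears
            if (out != i) v[out] = std::move(v[i]);
            ++out;
        }
    }
    v.erase(v.begin() + out, v.end());            // drop the leftover tail
}
```

This does one hash insertion per element at pop time instead of one lookup per element at insertion time, so whether it is actually faster depends on the workload.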

How about defining your own data structure that acts both as a BST (for lookups) and as a min-heap (to impose FIFO order)?

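class node {
public:
    // Shared counter. `inline` (C++17) is needed to initialize a
    // non-const static data member inside the class.
    inline static int autoIncrement = 0;

    int order; // auto-incremented on construction to impose FIFO
    int data;

    // Children in the binary search tree, which is ordered by `data`.
    node* left_Bst = nullptr;
    node* right_Bst = nullptr;

    // Children in the min-heap, which is ordered by `order`.
    node* left_Heap = nullptr;
    node* right_Heap = nullptr;

    node() : order(autoIncrement), data(0) {
        autoIncrement++;
    }
};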

By doing this you are basically creating two data structures that share the same nodes. The BST ordering is imposed via data, and the heap ordering is maintained via the order variable.

During an insertion you can traverse via the BST pointers, insert your element if it doesn't already exist, and then update the heap pointers accordingly.
