Python PriorityQueue: items of the same priority come out in random order

I used the queue.Queue class to pass tasks from one thread to another. Later I needed priorities, so I switched to PriorityQueue, using the PrioritizedItem wrapper proposed in the queue documentation (because the tasks are dicts and cannot be compared). Then, in rare situations, tasks started to get mixed up. It took me a while to realise that items of the same priority in a PriorityQueue do not keep their insertion order or, even worse from a debugging point of view, that they usually do.
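The effect can be reproduced in a few lines. The following is a minimal sketch, assuming a PrioritizedItem wrapper like the one shown in the queue documentation; the item names "A" to "D" are only for illustration.

```python
import queue
from dataclasses import dataclass, field
from typing import Any


@dataclass(order=True)
class PrioritizedItem:
    priority: int
    item: Any = field(compare=False)  # payload is excluded from comparison


pq = queue.PriorityQueue()
inserted = ["A", "B", "C", "D"]
for name in inserted:
    # every item gets the same priority, so only the heap layout decides
    pq.put(PrioritizedItem(priority=1, item=name))

order = [pq.get().item for _ in range(len(inserted))]
print(order)
```

On CPython I would expect this to print ['A', 'C', 'B', 'D']. With only three items the insertion order happens to survive, which is probably why the problem shows up so rarely.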

I guess FIFO is a sort of default when talking about task queues. That is why Queue is not called FifoQueue, isn't it? So PriorityQueue should explicitly state that it is not FIFO for equal-priority items. Unfortunately, the Python documentation does not warn about this, and the lack of a warning caused a headache for me, and probably for others too.

I have not found any ready-made solution, but I am pretty sure others may need a PriorityQueue that keeps the insertion order for equal-priority items. Hence this ticket...

Besides hoping that the Python documentation will add such a warning in a future release, let me share how I solved the problem.

The heapq documentation (heapq is used by PriorityQueue) suggests inserting a sequence number into the compared part of the item, so that the ordering is unambiguous and no two items ever compare as equal.
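As a rough illustration of that idea with plain heapq (no threads yet), each entry can be stored as a (priority, sequence number, task) tuple. The helper names push and pop below are hypothetical, chosen only for this sketch.

```python
import heapq
import itertools

heap = []
counter = itertools.count()  # source of unique, increasing sequence numbers


def push(priority, task):
    # The sequence number breaks ties, so the task itself is never compared.
    heapq.heappush(heap, (priority, next(counter), task))


def pop():
    _priority, _seq, task = heapq.heappop(heap)
    return task


for name in ["A", "B", "C", "D"]:
    push(1, {"name": name})      # dicts are not orderable, yet this works
push(0, {"name": "urgent"})

result = [pop()["name"] for _ in range(5)]
print(result)  # ['urgent', 'A', 'B', 'C', 'D']
```

Because the tuples always differ in their second element, the comparison never reaches the dict payload, and equal-priority tasks come out in the order they were pushed.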

I also added a threading.Lock so that two items can never get the same sequence number because of a race between threads.

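import dataclasses
import itertools
import threading
import typing

class _ThreadSafeCounter:
    """Callable that returns consecutive integers, safe to call from several threads"""
    def __init__(self, start=0):
        self.countergen = itertools.count(start)
        self.lock = threading.Lock()
    def __call__(self):
        # the lock ensures that no two callers receive the same number
        with self.lock:
            return next(self.countergen)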

# create a callable that provides increasing sequence numbers
_getnextseqnum = _ThreadSafeCounter()

@dataclasses.dataclass(order=True)
class PriorityQueueItem:
    """Container for priority queue items
    
    The payload of the item is stored in the optional "data" (None by default), and
    can be of any type, even such that cannot be compared, e.g. dict.
    
    The queue priority is defined mainly by the optional "priority" argument (10 by
    default).
    If there are several items with the same "priority", their put-order is preserved,
    because of the automatically increasing sequence number, "_seqnum".
    Usage in the producer:
        pq.put(PriorityQueueItem("Best-effort-task", 100))
        pq.put(PriorityQueueItem(dict(b=2)))
        pq.put(PriorityQueueItem(priority=0))
        pq.put(PriorityQueueItem(dict(a=1)))
    The consumer gets the tasks with pq.get().getdata(), and will receive them
    in this order:
        None
        {'b': 2}
        {'a': 1}
        'Best-effort-task'
    """
    # "data" is excluded from comparison; ordering uses (priority, _seqnum)
    data: typing.Any = dataclasses.field(default=None, compare=False)
    priority: int = 10
    _seqnum: int = dataclasses.field(default_factory=_getnextseqnum, init=False)
    
    def getdata(self):
        """Get the payload of the item in the consumer thread"""
        return self.data
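An alternative that hides the sequence number inside the queue itself could look like the sketch below. This is not the approach above, only a possible variant: the class name StablePriorityQueue is hypothetical, and it relies on the _init/_put/_get hooks of CPython's queue module, which are implementation details rather than documented API. Since put() calls _put() while holding the queue's own mutex, no extra lock should be needed for the counter here.

```python
import heapq
import itertools
import queue


class StablePriorityQueue(queue.PriorityQueue):
    """Hypothetical variant: put((priority, data)), get() returns data.

    Items of equal priority are returned in put-order.
    """

    def _init(self, maxsize):
        super()._init(maxsize)             # sets up the underlying heap list
        self._counter = itertools.count()  # per-queue sequence numbers

    def _put(self, item):
        # Runs with the queue's mutex held, so next() cannot race here.
        priority, data = item
        heapq.heappush(self.queue, (priority, next(self._counter), data))

    def _get(self):
        _priority, _seq, data = heapq.heappop(self.queue)
        return data


pq = StablePriorityQueue()
pq.put((100, "Best-effort-task"))
pq.put((10, {"b": 2}))
pq.put((0, None))
pq.put((10, {"a": 1}))

received = [pq.get() for _ in range(4)]
print(received)  # [None, {'b': 2}, {'a': 1}, 'Best-effort-task']
```

The trade-off is that the producer passes a plain (priority, data) tuple instead of a wrapper object, and that the code depends on internals that may change between Python versions.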
