
ConcurrentQueue and Parallel.ForEach

I have a ConcurrentQueue holding a list of URLs whose page source I need to fetch. When I pass the ConcurrentQueue object to Parallel.ForEach as the input, the Pop method doesn't return anything (it should return a string).

I'm using Parallel with MaxDegreeOfParallelism set to four, because I really need to limit the number of concurrent threads. Is using a queue together with Parallel redundant?

Thanks in advance.

// On the main class
var items = await engine.FetchPageWithNumberItems(result);
// Enqueue List of items
itemQueue.EnqueueList(items);
var crawl = Task.Run(() => { engine.CrawlItems(itemQueue); });

// On the Engine class
public void CrawlItems(ItemQueue itemQueue)
{
    Parallel.ForEach(
        itemQueue,
        new ParallelOptions {MaxDegreeOfParallelism = 4},
        item =>
        {
            var worker = new Worker();
            // Pop doesn't return anything
            worker.Url = itemQueue.Pop();
            /* Some work */
        });
}

// Item Queue
class ItemQueue : ConcurrentQueue<string>
{
    private ConcurrentQueue<string> queue = new ConcurrentQueue<string>();

    public string Pop()
    {
        string value = String.Empty;
        if (this.queue.Count == 0)
            throw new Exception();
        this.queue.TryDequeue(out value);
        return value;
    }

    public void Push(string item)
    {
        this.queue.Enqueue(item);
    }

    public void EnqueueList(List<string> list)
    {
        list.ForEach(this.queue.Enqueue);
    }
}

You don't need ConcurrentQueue<T> if all you're going to do is first add items to it from a single thread and then iterate over it in Parallel.ForEach(). A normal List<T> would be enough for that.
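As a minimal sketch of that suggestion, the URLs can sit in a plain List<string> that is filled once and then handed to Parallel.ForEach() with the same limit of four. The names ListCrawlSketch, CrawlAll and Process are hypothetical; Process only stands in for the real per-URL work, which the question doesn't show.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ListCrawlSketch
{
    // Hypothetical placeholder for the real work done per URL.
    public static string Process(string url) => url.ToUpperInvariant();

    public static ConcurrentBag<string> CrawlAll(List<string> urls)
    {
        // Results are written from several threads, so they go into a thread-safe collection.
        var results = new ConcurrentBag<string>();

        Parallel.ForEach(
            urls, // the list is only read here, never modified
            new ParallelOptions { MaxDegreeOfParallelism = 4 },
            url => { results.Add(Process(url)); });

        return results;
    }
}
```

The list itself needs no synchronization because nothing writes to it while the loop runs; only the shared results collection has to be thread-safe.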

Also, your implementation of ItemQueue is very suspicious:

  • It inherits from ConcurrentQueue<string> and also contains another ConcurrentQueue<string>. That doesn't make much sense; it is confusing and inefficient.

  • The methods on ConcurrentQueue<T> were designed very carefully to be thread-safe. Your Pop() isn't thread-safe. What could happen is that you check Count, see that it's 1, then call TryDequeue() and not get any value (i.e. value will be null), because another thread removed the item from the queue in the time between the two calls.
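One way to address both points, sketched under the assumption that a wrapper class is still wanted at all: hold a single ConcurrentQueue<string> by composition and let the return value of TryDequeue() be the only emptiness check. SafeItemQueue and TryPop are hypothetical names, not part of the original code.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public class SafeItemQueue
{
    // Exactly one underlying queue; no inheritance from ConcurrentQueue<string>.
    private readonly ConcurrentQueue<string> queue = new ConcurrentQueue<string>();

    public void Push(string item) => queue.Enqueue(item);

    public void EnqueueList(IEnumerable<string> items)
    {
        foreach (var item in items)
            queue.Enqueue(item);
    }

    // A single atomic call: there is no separate Count check that another
    // thread could invalidate before the dequeue happens.
    public bool TryPop(out string value) => queue.TryDequeue(out value);
}
```

A caller would then write something like `if (q.TryPop(out var url)) { ... }` and treat a false result as "queue was empty" instead of relying on an exception.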

The issue is in the CrawlItems method: you shouldn't call Pop inside the action passed to ForEach. ForEach already takes each item from the source and hands it to the action, so the item is available without popping it yourself. That is why the action has an 'item' argument.

I assume you're getting null because, by the time Pop runs, the items have already been taken by the other threads or by the ForEach method itself.

Therefore, your code should look like this:

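public void CrawlItems(ItemQueue itemQueue)
{
    Parallel.ForEach(
        itemQueue,
        new ParallelOptions {MaxDegreeOfParallelism = 4},
        item =>
        {
            // One worker per item, as in the original code
            var worker = new Worker();
            // Use the item handed in by ForEach instead of calling Pop
            worker.Url = item;
            /* Some work */
        });
}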
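A small check can show that the body receives every element through the item argument without any explicit dequeue. This sketch assumes a plain ConcurrentQueue<string> as the source; as far as I know, enumerating it works on a snapshot and does not remove anything, so the queue should still hold all of its items afterwards. QueueEnumerationSketch, Run and the example.com URLs are hypothetical.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class QueueEnumerationSketch
{
    // Returns how many times the body ran and how many items are still queued.
    public static (int Seen, int Remaining) Run(int n)
    {
        var queue = new ConcurrentQueue<string>();
        for (int i = 0; i < n; i++)
            queue.Enqueue("http://example.com/" + i); // placeholder URLs

        int seen = 0;
        Parallel.ForEach(
            queue,
            new ParallelOptions { MaxDegreeOfParallelism = 4 },
            item =>
            {
                // 'item' is the current element; no Pop or TryDequeue is needed.
                Interlocked.Increment(ref seen);
            });

        return (seen, queue.Count);
    }
}
```

If the queue really has to be drained, that would be a different design: each worker looping on TryDequeue() rather than iterating the queue with ForEach.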
