lock keyword on a LINQ Parallel.ForEach<> loop

This is more of a conceptual question. I was wondering whether using a lock inside a Parallel.ForEach<> loop would take away the benefits of parallelizing a foreach loop.

Here is some sample code where I have seen it done.

Parallel.ForEach<KeyValuePair<string, XElement>>(binReferences.KeyValuePairs, reference =>
{
    lock (fileLockObject)
    {
        if (fileLocks.ContainsKey(reference.Key) == false)
        {
            fileLocks.Add(reference.Key, new object());
        }
    }

    RecursiveBinUpdate(reference.Value, testPath, reference.Key, maxRecursionCount, ref recursionCount);

    lock (fileLocks[reference.Key])
    {
        reference.Value.Document.Save(reference.Key);
    }
});

Where fileLockObject and fileLocks are as follows.

private static object fileLockObject = new object();
private static Dictionary<string, object> fileLocks = new Dictionary<string, object>();

Does this technique make the loop completely non-parallel? I would like to see your thoughts on this.

It means none of the work inside the lock can be done in parallel. Yes, this greatly harms performance here. Since the entire body is not locked (and not all of it is locked on the same object), there is still some parallelization. Whether the parallelization you do get adds enough benefit to outweigh the overhead of managing the threads and synchronizing around the locks is something you really just need to test yourself with your specific data.

That said, it looks like what you're doing (at least in the first locked block, which is the one I'd be more concerned with, as every thread is locking on the same object) is locking access to a Dictionary. You can instead use a ConcurrentDictionary, which is specifically designed to be used from multiple threads and will minimize the amount of synchronization that needs to be done.
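As a rough illustration of the ConcurrentDictionary suggestion, here is a minimal sketch, assuming the dictionary only needs to hand out one lock object per file name. The class name `FileLockRegistry` and the method `GetLock` are hypothetical and not part of the original code; `GetOrAdd` is the real ConcurrentDictionary method.

```csharp
using System;
using System.Collections.Concurrent;

public static class FileLockRegistry
{
    // One lock object per file name; safe to read and write from many threads.
    private static readonly ConcurrentDictionary<string, object> fileLocks =
        new ConcurrentDictionary<string, object>();

    // Returns the lock object for a key, creating it on first use.
    // The factory may run more than once under contention, but only one
    // value is stored, and every caller gets that same stored object.
    public static object GetLock(string key)
    {
        return fileLocks.GetOrAdd(key, k => new object());
    }
}
```

Inside the loop body, the global lock around ContainsKey/Add could then go away, and the save step would become `lock (FileLockRegistry.GetLock(reference.Key)) { ... }`.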

if I used a lock ... if that would take away the benefits of Paralleling a foreach loop.

Proportionally. When RecursiveBinUpdate() is a big chunk of work (and independent between iterations), it will still pay off. The locked part could be less than 1% of the run time, or it could be 99%. Look up Amdahl's law, which applies here.
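To put numbers on "proportionally", here is a small sketch of Amdahl's law, where the serial fraction stands for the share of run time spent inside locks. The class and method names are made up for illustration, and the formula gives an idealized upper bound that ignores thread-management overhead.

```csharp
using System;

public static class Amdahl
{
    // Upper bound on speedup when serialFraction of the work cannot be
    // parallelized and the rest is spread evenly over `workers` threads.
    public static double Speedup(double serialFraction, int workers)
    {
        return 1.0 / (serialFraction + (1.0 - serialFraction) / workers);
    }
}
```

With 8 workers, `Amdahl.Speedup(0.01, 8)` comes out at roughly 7.48, while `Amdahl.Speedup(0.99, 8)` is roughly 1.01, i.e. almost no gain.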

But worse, your code is not thread-safe. Of your two operations on fileLocks, only the first is actually inside a lock.

lock (fileLockObject)
{
    if (fileLocks.ContainsKey(reference.Key) == false)
    {
       ...
    }
}

and

lock (fileLocks[reference.Key])   // this access to fileLocks[] is not protected

Change the second part to:

lock (fileLockObject)
{
    reference.Value.Document.Save(reference.Key);
}
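Locking on fileLockObject for the save is safe, but it also serializes every save. If per-file locking is still wanted, one possible alternative (a sketch, not the answerer's code) is to fetch the per-file lock object while holding fileLockObject and then lock on that object. `PerFileLocks`, `GetLock`, and `SaveUnderLock` are hypothetical names, and the `save` delegate stands in for `reference.Value.Document.Save(reference.Key)`.

```csharp
using System;
using System.Collections.Generic;

public static class PerFileLocks
{
    private static readonly object fileLockObject = new object();
    private static readonly Dictionary<string, object> fileLocks =
        new Dictionary<string, object>();

    // Every read and write of fileLocks happens while holding fileLockObject.
    public static object GetLock(string key)
    {
        lock (fileLockObject)
        {
            object fileLock;
            if (!fileLocks.TryGetValue(key, out fileLock))
            {
                fileLock = new object();
                fileLocks.Add(key, fileLock);
            }
            return fileLock;
        }
    }

    // Runs the save while holding only that file's lock.
    public static void SaveUnderLock(string key, Action save)
    {
        object fileLock = GetLock(key); // dictionary access is protected
        lock (fileLock)                 // only writers of the same file wait here
        {
            save();
        }
    }
}
```

The global lock is then held only for the brief dictionary lookup, so saves to different files can still overlap.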

The use of ref recursionCount as a parameter looks suspicious too. It might work with Interlocked.Increment, though.
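How RecursiveBinUpdate actually uses recursionCount is not shown, so the following is only a sketch of the general pattern, assuming the value is a plain int counter shared by all iterations. `SharedCounter` and `CountInParallel` are illustrative names.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class SharedCounter
{
    // Increments one shared counter from many threads without losing updates.
    public static int CountInParallel(int iterations)
    {
        int count = 0;
        Parallel.For(0, iterations, i =>
        {
            // Atomic read-modify-write; a plain count++ here could lose updates.
            Interlocked.Increment(ref count);
        });
        return count;
    }
}
```

If the counter is also compared against maxRecursionCount, the check should use the value returned by Interlocked.Increment rather than re-reading the shared variable.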

The "locked" portion of the loop will end up running serially. If the RecursiveBinUpdate function is the bulk of the work, there may be some gain, but it would be better if you could figure out how to generate the locks in advance.
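One way to read "in advance" is to build the whole lock dictionary before the parallel loop starts, so that the loop only reads from it. This is a sketch under that assumption; `PrebuiltLocks`, `Build`, and `Process` are hypothetical names, and the `work` and `save` delegates stand in for RecursiveBinUpdate and Document.Save.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class PrebuiltLocks
{
    // Builds one lock object per distinct key before any parallel work starts.
    public static Dictionary<string, object> Build(IEnumerable<string> keys)
    {
        return keys.Distinct().ToDictionary(k => k, k => new object());
    }

    public static void Process(IList<string> keys, Action<string> work, Action<string> save)
    {
        Dictionary<string, object> fileLocks = Build(keys);

        Parallel.ForEach(keys, key =>
        {
            work(key);             // runs in parallel, no lock held
            lock (fileLocks[key])  // read-only lookup; the dictionary is never modified here
            {
                save(key);
            }
        });
    }
}
```

A Dictionary supports concurrent readers as long as nothing modifies it, so the first lock block from the question is no longer needed in this arrangement.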

When it comes to locks, PLINQ/TPL threads have to wait to gain access in the same way as any other threads. So, in your case, the lock only makes the loop non-parallel in the areas you're locking, and any work outside those locks will still execute in parallel (i.e. all the work in RecursiveBinUpdate).

Bottom line, I see nothing substantially wrong with what you're doing here.
