Using Parallel loop to give each datarow a score in C#

I have a method that iterates through each DataRow of a DataTable and sets the final "Score" column to a computed result. I am trying to figure out how to do this faster than evaluating the rows serially. I am new to parallel loops and don't know whether I am taking the wrong approach.

Simplified existing code:

foreach (DataRow dr in DateOptions.Rows)
{
    double score = evalRow(dr);
    dr["score"] = score;
}

Using the following seems to result in an error, as I am trying to modify the DataTable from several threads at once:

Parallel.ForEach(DateOptions.AsEnumerable(), dr =>
{
    double score = evalRow(dr);
    dr["score"] = score;
});

Is there some way I am not thinking of to extract the result and then apply the value to the appropriate column?

It probably depends largely on whether evalRow does anything that's not thread-safe. If the only problem comes from modifying the DataTable, then you can likely fix it by applying Command-Query Separation: use parallel processing to figure out what to do, then drop back to serial processing to actually do it.

var rowsWithScores = DateOptions.AsEnumerable().AsParallel()
    .Select(dr => new {dr, score = evalRow(dr)})
    .ToList();
foreach(var rowWithScore in rowsWithScores)
{
    rowWithScore.dr["score"] = rowWithScore.score;
}
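
A variant of the same idea, as a minimal sketch: compute the scores in parallel into a preallocated array, then write them back on a single thread. It assumes evalRow only reads from the row (DataTable is documented as safe for multithreaded reads, but writes must be synchronized). The EvalRow body, the "value" column, and the sample data are invented for illustration and are not from the question; the snippet uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Data;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's evalRow; assumed to only READ the row.
static double EvalRow(DataRow dr) => (int)dr["value"] * 1.5;

// Sample table; the "value" column is made up for this sketch.
var dateOptions = new DataTable();
dateOptions.Columns.Add("value", typeof(int));
dateOptions.Columns.Add("score", typeof(double));
for (int i = 0; i < 1000; i++) dateOptions.Rows.Add(i, 0.0);

// Snapshot the rows into an array so each one can be addressed by index.
DataRow[] rows = dateOptions.Select();
var scores = new double[rows.Length];

// Query phase (parallel): each iteration writes only to its own array slot,
// so no locking is needed and the DataTable itself is never modified here.
Parallel.For(0, rows.Length, i => { scores[i] = EvalRow(rows[i]); });

// Command phase (serial): apply the results to the table on one thread.
for (int i = 0; i < rows.Length; i++)
{
    rows[i]["score"] = scores[i];
}
```

Compared with the PLINQ version, this avoids allocating an anonymous object per row, at the cost of a little more bookkeeping. Either way, the parallel part only pays off if evalRow is expensive enough to outweigh the scheduling overhead.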

That said, in my experience problems like this are better solved with algorithmic fixes than by just throwing parallel processing at them. If you're only dealing with thousands of items and this is taking hours to complete, that tells me you're probably either using an algorithm with high complexity (which can probably be fixed with better data structures) or doing a lot of I/O (which might lend itself to concurrent asynchronous operations). In other words, there's probably another approach that will get you orders-of-magnitude better performance.
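
As a purely hypothetical illustration of the "data structures" point (the question does not show what evalRow does): suppose each row's score required counting the matching rows in a second table. Rescanning that table for every row costs O(n × m); building a dictionary once reduces each lookup to O(1). The tables, the "day" column, and the scoring rule below are all assumptions made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical table to be scored: one row per candidate day.
var options = new DataTable();
options.Columns.Add("day", typeof(int));
options.Columns.Add("score", typeof(double));
for (int d = 0; d < 5; d++) options.Rows.Add(d, 0.0);

// Hypothetical second table that the score depends on.
var bookings = new DataTable();
bookings.Columns.Add("day", typeof(int));
foreach (int d in new[] { 1, 1, 3, 3, 3 }) bookings.Rows.Add(d);

// Build the lookup once: day -> number of bookings on that day.
Dictionary<int, int> countsByDay = bookings.AsEnumerable()
    .GroupBy(b => b.Field<int>("day"))
    .ToDictionary(g => g.Key, g => g.Count());

// Each row is now scored with a constant-time lookup instead of a full scan.
foreach (DataRow dr in options.Rows)
{
    countsByDay.TryGetValue(dr.Field<int>("day"), out int count); // count is 0 if absent
    dr["score"] = (double)count;
}
```

If profiling shows the time is going to I/O rather than computation, a precomputed lookup will not help, and batching or overlapping the I/O calls is the more promising direction.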
