Most efficient way to compare data
In our application, a function is called periodically, and the results of each call need to be compared with the results of the previous call, as shown below:
using System;
using System.Collections.Generic;
using System.Linq;

public class Record
{
    public long Id { get; set; }
    public bool SwitchStatus { get; set; }
    //.....Other Fields.....
}

public class Consumer
{
    private List<Tuple<long, bool>> _sortedRecordIdAndStatus = new List<Tuple<long, bool>>();

    void IGetCalledEveryThreeSeconds(List<Record> records)
    {
        // Project every record to an (Id, SwitchStatus) pair and materialize the result.
        var currentsortedRecordIdAndStatus = records.Select(x => new Tuple<long, bool>(x.Id, x.SwitchStatus)).ToList();
        if (!currentsortedRecordIdAndStatus.SequenceEqual(_sortedRecordIdAndStatus))
        {
            DoSomething();
        }
        _sortedRecordIdAndStatus = currentsortedRecordIdAndStatus;
    }
}
The ToList() call takes a lot of time when the function is called with thousands of records. That is currently the bottleneck.
I am trying to optimize this routine. All I need is to check whether a block of data is the same as before or not.
I think I just need to create a block of data from the incoming records and compare it with the block created on the next call, and so on. All I need to know is whether the block is the same (including order). I don't even need to look into the data inside.
E.g., for a block with this content:
[[1000][true]]
[[2000][false]]
[[1500][true]]
Is there any way to optimize my code?
This comes with the standard caveat that premature optimization is the root of all evil, and I'm just going to trust your statement that this really is a bottleneck in your application's performance.
You can avoid creating a new List on every call by keeping the previous records and comparing the two sequences lazily. Something like this:
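public class Consumer
{
    private List<Record> _previousRecords = new List<Record>();

    void IGetCalledEveryThreeSeconds(List<Record> records)
    {
        // Cheap count check first; SequenceEqual then compares pair by pair
        // without materializing an intermediate list.
        if (records.Count != _previousRecords.Count
            || !records.Select(x => (x.Id, x.SwitchStatus)).SequenceEqual(
                _previousRecords.Select(x => (x.Id, x.SwitchStatus))))
        {
            // The data changed since the previous call.
            DoSomething();
        }
        // Note: this keeps a reference to the caller's list, so it assumes the caller
        // passes a fresh list of fresh Record objects on every call.
        _previousRecords = records;
    }
}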
However, considering your comments that the inputs are usually the same, I don't know if these optimizations will even be beneficial. Since you pretty much have to iterate over the entire list to verify that nothing has changed, these optimizations won't improve things by an order of magnitude. And it's hard to know whether avoiding the creation of a new List each time will offset the overhead of selecting new tuples from _previousRecords each time.
If you really need to squeeze every ounce of performance out of this, you're positive this is the bottleneck, and you can't come up with a broader architectural solution that avoids it in the first place, your last, best option is probably to avoid LINQ and go with a for loop. But the improvements probably won't be significant enough to make a business-level difference.
public class Consumer
{
    private List<Record> _previousRecords = new List<Record>();

    void IGetCalledEveryThreeSeconds(List<Record> records)
    {
        if (HasChanged(records))
        {
            DoSomething();
        }
        // Always remember the latest records, whether or not they changed.
        _previousRecords = records;
    }

    private bool HasChanged(List<Record> records)
    {
        var length = records.Count;
        if (length != _previousRecords.Count)
        {
            return true;
        }
        for (int i = 0; i < length; i++)
        {
            var record1 = records[i];
            var record2 = _previousRecords[i];
            if (record1.Id != record2.Id || record1.SwitchStatus != record2.SwitchStatus)
            {
                return true; // stop at the first difference
            }
        }
        return false;
    }
}
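One caveat with keeping the incoming list itself: if the caller reuses the same List or mutates the same Record objects between calls, the previous and current data are the same objects and will always compare equal. A possible variation, closer to the "block of data" idea from the question, is to copy each record into a flat long[] snapshot and compare two snapshots with a span SequenceEqual. This is only a rough sketch and has not been measured. It assumes every Id is non-negative and below 2^62 so that SwitchStatus can be folded into the lowest bit, it assumes a runtime where AsSpan is available on arrays (.NET Core 2.1 or later, or the System.Memory package), and the ChangeCount property is a hypothetical stand-in for whatever DoSomething() really does.

```csharp
using System;
using System.Collections.Generic;

public class Record
{
    public long Id { get; set; }
    public bool SwitchStatus { get; set; }
}

public class Consumer
{
    // Two buffers that swap roles on every call, so a steady-state call allocates nothing.
    private long[] _previous = Array.Empty<long>();
    private long[] _scratch = Array.Empty<long>();
    private int _previousCount;

    // Hypothetical: only here so the effect of DoSomething() can be observed.
    public int ChangeCount { get; private set; }

    private void DoSomething() => ChangeCount++;

    public void IGetCalledEveryThreeSeconds(List<Record> records)
    {
        int count = records.Count;
        if (_scratch.Length < count)
        {
            _scratch = new long[count];
        }

        for (int i = 0; i < count; i++)
        {
            Record r = records[i];
            // Assumption: 0 <= Id < 2^62, so shifting left by one bit loses nothing.
            _scratch[i] = (r.Id << 1) | (r.SwitchStatus ? 1L : 0L);
        }

        ReadOnlySpan<long> current = _scratch.AsSpan(0, count);
        ReadOnlySpan<long> previous = _previous.AsSpan(0, _previousCount);

        // SequenceEqual returns false when the lengths differ, then compares in order.
        if (!current.SequenceEqual(previous))
        {
            DoSomething();
        }

        // The snapshot just built becomes "previous"; the old one is reused next time.
        (_previous, _scratch) = (_scratch, _previous);
        _previousCount = count;
    }
}
```

Because the snapshot is a copy, it stays valid even if the caller later changes the records in place. Whether this is actually faster than the plain for loop would have to be profiled on real data.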