Linq on generic List takes a lot of time

I have a database table that gets 7 different values from 6 different measuring stations (that's 42 values) every second. The data isn't stored with a timestamp, just a "TickNumber"; the time is then calculated from the TickNumber and the time when the measurement was started.
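To make the TickNumber idea concrete, here is a minimal sketch of such a conversion. TickToTime is a hypothetical helper, not something from the question, and it assumes one tick per second with tick 0 falling exactly on the measurement start; the real mapping may well differ.

```vbnet
Imports System

Module TickTimeSketch
    ' Hypothetical helper: converts a tick number to a timestamp.
    ' Assumption: one tick per second, tick 0 = measurement start.
    Function TickToTime(measurementStart As Date, tickNumber As Integer) As Date
        Return measurementStart.AddSeconds(tickNumber)
    End Function

    Sub Main()
        Dim started As New Date(2024, 1, 1, 7, 0, 0)
        ' 3600 ticks after a 07:00 start is one hour later under the assumption above
        Console.WriteLine(TickToTime(started, 3600).ToString("yyyy-MM-dd HH:mm:ss"))
    End Sub
End Module
```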

I have no control over this.

However, to speed things up, I download the data, analyze it and store it in another database with an ASP.NET frontend. This system works great and everybody's happy.

However, the analysis part is taking forever, and after spending some time with the Performance Analyzer I've found the problem.

This fetches the data and returns a List:

Public Shared Function GetMeasuredValues(ByVal _startdate As Date, ByVal _enddate As Date) As List(Of MeasuredValues)
    Dim _db As New Quickview

    Dim functions() As Integer = System.Enum.GetValues(GetType(Enums.MeasuredValueTypes))
    Dim total_values As New List(Of MeasuredValues)

    'Finding max and min row values
    Dim stations() As Integer = {1, 2, 3, 4, 6, 16}
    For Each i In stations
        Dim station As Integer = i
        Dim local_start As Integer = DB.DateToPeriodNo(station, _startdate)
        Dim local_end As Integer = DB.DateToPeriodNo(station, _enddate)

        If local_start > 0 Then
            Dim all_values = (From vls In _db.MeasuredValues
                              Where vls.MeasValueId = station _
                              And functions.Contains(vls.FuncId) _
                              And vls.PeriodNo >= local_start And vls.PeriodNo <= local_end _
                              ).ToList
            Console.WriteLine("Data count for station " & i & ": " & all_values.Count)
            total_values.AddRange(all_values)
        End If
    Next

    Dim sorted_values = (From vls In total_values
                         Order By vls.Time Ascending, vls.MeasValueId Ascending).ToList

    Return sorted_values
End Function

This works OK. There's a lot of data, and transferring it takes up most of the time consumed at this step.

This data is then filtered to give me the values from one hour (07:00 to 07:59, etc.). I use those values to calculate the averages and sums needed for that hour. Sadly, much of this is logarithmic, so I can't use .Sum, etc.
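As an illustration of why a plain .Sum or .Average is not enough for logarithmic quantities, here is a sketch of an energy-based mean. It assumes the values are decibel-like levels, which the question does not actually state, and LogAverage is a hypothetical helper name.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module LogAverageSketch
    ' Hypothetical helper: energy-based mean of decibel-like levels.
    ' Assumption: each value L stands for a linear quantity 10^(L/10).
    ' Throws InvalidOperationException if the sequence is empty.
    Function LogAverage(levels As IEnumerable(Of Double)) As Double
        ' Average in the linear domain, then convert back to the log scale
        Dim meanEnergy = levels.Average(Function(l) Math.Pow(10.0, l / 10.0))
        Return 10.0 * Math.Log10(meanEnergy)
    End Function

    Sub Main()
        Dim levels As New List(Of Double) From {60.0, 70.0}
        Console.WriteLine(LogAverage(levels))
    End Sub
End Module
```

For the two sample levels 60 and 70, this gives roughly 67.4, whereas the arithmetic mean would be 65.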

Then I do:

Dim all_values = DB.GetMeasuredValues(date_start, date_end)

.... which just gives me a list of all the values I need.

Here's the problem: this query seems to take forever.

''' [in for-loop going through each hour between date_start and date_end, typically 24 hours]
Dim values_hour = (From vls In all_values
                   Where vls.MeasValueId = station _
                   And vls.FuncId = Func _
                   And vls.Time >= time_start And vls.Time < time_end).ToList

If I am to trust the Performance Analyzer, this simple query takes 97% of the resources. My calculations don't seem to have any impact at all (<0.2%).

I'm sure I'm doing something wrong, but what?

The call to ToList is what will cause a lot of overhead here, because it needs to allocate the memory for the list every time and then fill that list.
Have you tried removing that?
On top of that, I would use total_values.Concat(all_values) instead of total_values.AddRange(all_values), and then just call ToList once at the very end, like you already do. Note that Concat does not modify total_values; it returns a new, lazily evaluated sequence, so the result has to be kept in an IEnumerable variable.
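A minimal sketch of the Concat idea, using an in-memory list in place of the database table. The Measured class and the sample data are stand-ins invented for illustration, and the sort uses PeriodNo where the question sorts by Time. With a real database query, deferring execution like this also means the context must still be alive when ToList finally runs.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module ConcatSketch
    ' Stand-in for the MeasuredValues entity; only the fields used here
    Class Measured
        Public Property MeasValueId As Integer
        Public Property PeriodNo As Integer
    End Class

    Sub Main()
        ' Hypothetical in-memory source standing in for _db.MeasuredValues
        Dim source As New List(Of Measured) From {
            New Measured With {.MeasValueId = 2, .PeriodNo = 1},
            New Measured With {.MeasValueId = 1, .PeriodNo = 2},
            New Measured With {.MeasValueId = 1, .PeriodNo = 1}
        }

        Dim combined As IEnumerable(Of Measured) = Enumerable.Empty(Of Measured)()
        For Each i In {1, 2}
            Dim station As Integer = i ' copy the loop variable before capturing it
            Dim perStation = source.Where(Function(v) v.MeasValueId = station)
            ' Nothing is enumerated yet; the per-station queries are only chained
            combined = combined.Concat(perStation)
        Next

        ' The only materialization: enumerate, sort and copy once
        Dim sorted = combined _
                     .OrderBy(Function(v) v.PeriodNo) _
                     .ThenBy(Function(v) v.MeasValueId) _
                     .ToList()

        For Each v In sorted
            Console.WriteLine(v.PeriodNo & " / " & v.MeasValueId)
        Next
    End Sub
End Module
```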

And I would only write the data count in debug mode or something similar, so that you don't lose time there when performance is important.
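One way to do that, as a sketch: Debug.WriteLine from System.Diagnostics is marked with Conditional("DEBUG"), so the call and the evaluation of its arguments are left out of Release builds. The list and station number below are placeholder data.

```vbnet
Imports System.Collections.Generic
Imports System.Diagnostics

Module DebugOutputSketch
    Sub Main()
        ' Placeholder data standing in for one station's query result
        Dim all_values As New List(Of Integer) From {1, 2, 3}
        Dim station As Integer = 1

        ' Compiled only when the DEBUG constant is defined (the default for Debug builds)
        Debug.WriteLine("Data count for station " & station & ": " & all_values.Count)
    End Sub
End Module
```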

As you have the data sorted by time, you could use SkipWhile/TakeWhile to get a time chunk out of it and then apply the other filters. That way you enumerate the bulk data once to get the required times, and only apply the filters to this subset of the data:

Dim slice = all_values _
            .SkipWhile(Function(vls) vls.Time < time_start) _
            .TakeWhile(Function(vls) vls.Time < time_end)

and then filter by Func and station.
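Put together, the slicing and the follow-up filter might look like the sketch below. The Measured class, the sample rows and the name funcId (used in place of the question's Func variable) are assumptions made so the example is self-contained. The approach relies on the list already being sorted ascending by Time.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SliceSketch
    ' Stand-in for the MeasuredValues entity; only the fields used here
    Class Measured
        Public Property MeasValueId As Integer
        Public Property FuncId As Integer
        Public Property Time As Date
    End Class

    Sub Main()
        Dim day As New Date(2024, 1, 1)
        ' Sample data, already sorted ascending by Time (required by SkipWhile/TakeWhile)
        Dim all_values As New List(Of Measured) From {
            New Measured With {.MeasValueId = 1, .FuncId = 5, .Time = day.AddHours(6.5)},
            New Measured With {.MeasValueId = 1, .FuncId = 5, .Time = day.AddHours(7)},
            New Measured With {.MeasValueId = 2, .FuncId = 5, .Time = day.AddHours(7.5)},
            New Measured With {.MeasValueId = 1, .FuncId = 5, .Time = day.AddHours(8)}
        }
        Dim time_start = day.AddHours(7)
        Dim time_end = day.AddHours(8)
        Dim station = 1
        Dim funcId = 5

        ' Cut out the hour once and materialize it, so every
        ' station/function combination reuses the same small list
        Dim slice = all_values _
                    .SkipWhile(Function(vls) vls.Time < time_start) _
                    .TakeWhile(Function(vls) vls.Time < time_end) _
                    .ToList()

        ' AndAlso short-circuits, unlike And, so FuncId is only checked when the station matches
        Dim values_hour = slice _
                          .Where(Function(vls) vls.MeasValueId = station AndAlso vls.FuncId = funcId) _
                          .ToList()

        Console.WriteLine(values_hour.Count) ' prints 1 for the sample data
    End Sub
End Module
```

Note that SkipWhile still walks the list from the beginning for every hour; it only avoids evaluating the station and function tests on rows outside the hour.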
