简体   繁体   中英

How to make LINQ to EF query more efficient

I have the following query that finds price reductions for specific listings in our real estate database. I want to know if this query can be done more efficiently. The end goal is to get the last 2 price changes for listings where the most recent price change is less than the one before it. Here is my query:

var PriceDrops = idxContext.ListingPriceChanges
     .Where(a => a.DateAdded >= LastRunTime && ListingIDsChunk.Contains(a.ListingID))
     .GroupBy(g => g.ListingID)
     .Where(g => g.Count() > 1 && g.OrderByDescending(a => a.DateAdded).FirstOrDefault().ListPrice < g.OrderByDescending(a => a.DateAdded).Skip(1).FirstOrDefault().ListPrice)
     .SelectMany(g => g.OrderByDescending(a => a.DateAdded).Take(2)).ToList();

This query works, I just want to know if it can be done more efficiently.

Here is some more info as requested:

ListingIDChunk size is 2000 id's at a time

The time it takes to execute: 20.4172256 seconds (based off stopwatch)

The SQL it generates is:

SELECT 
[Limit3].[ListingPriceChangeID] AS [ListingPriceChangeID], 
[Limit3].[ListingID] AS [ListingID], 
[Limit3].[ListPrice] AS [ListPrice], 
[Limit3].[PriceChangeDate] AS [PriceChangeDate], 
[Limit3].[DateAdded] AS [DateAdded]
FROM   (SELECT [Project7].[ListingID] AS [ListingID]
    FROM ( SELECT 
        [Project6].[ListingID] AS [ListingID], 
        [Project6].[ListPrice] AS [ListPrice], 
        [Project6].[ListPrice1] AS [ListPrice1], 
        (SELECT 
            COUNT(1) AS [A1]
            FROM [dbo].[ListingPriceChanges] AS [Extent4]
            WHERE ([Extent4].[DateAdded] >= @p__linq__0) AND ([Extent4].[ListingID] IN (REMOVED-2000-IDs)) AND ([Project6].[ListingID] = [Extent4].[ListingID])) AS [C1]
        FROM ( SELECT 
            [Project4].[ListingID] AS [ListingID], 
            [Project4].[ListPrice] AS [ListPrice], 
            [Limit2].[ListPrice] AS [ListPrice1]
            FROM   (SELECT 
                [Project2].[ListingID] AS [ListingID], 
                [Limit1].[ListPrice] AS [ListPrice]
                FROM   (SELECT 
                    @p__linq__0 AS [p__linq__0], 
                    [Distinct1].[ListingID] AS [ListingID]
                    FROM ( SELECT DISTINCT 
                        [Extent1].[ListingID] AS [ListingID]
                        FROM [dbo].[ListingPriceChanges] AS [Extent1]
                        WHERE ([Extent1].[DateAdded] >= @p__linq__0) AND ([Extent1].[ListingID] IN (REMOVED-2000-IDs))
                    )  AS [Distinct1] ) AS [Project2]
                OUTER APPLY  (SELECT TOP (1) [Project3].[ListPrice] AS [ListPrice]
                    FROM ( SELECT 
                        [Extent2].[ListPrice] AS [ListPrice], 
                        [Extent2].[DateAdded] AS [DateAdded]
                        FROM [dbo].[ListingPriceChanges] AS [Extent2]
                        WHERE ([Extent2].[DateAdded] >= @p__linq__0) AND ([Extent2].[ListingID] IN (REMOVED-2000-IDs)) AND ([Project2].[ListingID] = [Extent2].[ListingID])
                    )  AS [Project3]
                    ORDER BY [Project3].[DateAdded] DESC ) AS [Limit1] ) AS [Project4]
            OUTER APPLY  (SELECT TOP (1) [Project5].[ListPrice] AS [ListPrice], [Project5].[DateAdded] AS [DateAdded]
                FROM ( SELECT [Project5].[ListPrice] AS [ListPrice], [Project5].[DateAdded] AS [DateAdded], row_number() OVER (ORDER BY [Project5].[DateAdded] DESC) AS [row_number]
                    FROM ( SELECT 
                        [Extent3].[ListPrice] AS [ListPrice], 
                        [Extent3].[DateAdded] AS [DateAdded]
                        FROM [dbo].[ListingPriceChanges] AS [Extent3]
                        WHERE ([Extent3].[DateAdded] >= @p__linq__0) AND ([Extent3].[ListingID] IN (REMOVED-2000-IDs)) AND ([Project4].[ListingID] = [Extent3].[ListingID])
                    )  AS [Project5]
                )  AS [Project5]
                WHERE [Project5].[row_number] > 1
                ORDER BY [Project5].[DateAdded] DESC ) AS [Limit2]
        )  AS [Project6]
    )  AS [Project7]
    WHERE ([Project7].[C1] > 1) AND ([Project7].[ListPrice] < [Project7].[ListPrice1]) ) AS [Filter5]
CROSS APPLY  (SELECT TOP (2) [Project8].[ListingPriceChangeID] AS [ListingPriceChangeID], [Project8].[ListingID] AS [ListingID], [Project8].[ListPrice] AS [ListPrice], [Project8].[PriceChangeDate] AS [PriceChangeDate], [Project8].[DateAdded] AS [DateAdded]
    FROM ( SELECT 
        [Extent5].[ListingPriceChangeID] AS [ListingPriceChangeID], 
        [Extent5].[ListingID] AS [ListingID], 
        [Extent5].[ListPrice] AS [ListPrice], 
        [Extent5].[PriceChangeDate] AS [PriceChangeDate], 
        [Extent5].[DateAdded] AS [DateAdded]
        FROM [dbo].[ListingPriceChanges] AS [Extent5]
        WHERE ([Extent5].[DateAdded] >= @p__linq__0) AND ([Extent5].[ListingID] IN (REMOVED-2000-IDs)) AND ([Filter5].[ListingID] = [Extent5].[ListingID])
    )  AS [Project8]
    ORDER BY [Project8].[DateAdded] DESC ) AS [Limit3]

There are currently 3,239,193 records in the ListingPriceChanges table.

If you need more info, please let me know. In the query above, I replaced the 2000 id's with the text REMOVED-2000-IDs

I'm using EF 5.0 and .NET 4.5

The ListingPriceChanges table is defined as:

[ListingPriceChangeID] [int] IDENTITY(1,1) NOT NULL,
[ListingID] [int] NOT NULL,
[ListPrice] [money] NOT NULL,
[PriceChangeDate] [datetime2](7) NULL,
[DateAdded] [datetime2](7) NOT NULL

The field ListingID is a FK reference to the Listings table. There is also an index for the table on the ListingID field that includes PriceChangeDate .

I can't test at the moment, but I believe you could do the following:

var PriceDrops = idxContext.ListingPriceChanges
    .Where(a => a.DateAdded >= LastRunTime && ListingIDsChunk.Contains(a.ListingID))
    .GroupBy(g => g.ListingID)
    .Where(g => g.Count() > 1)
    .Select(g => g.OrderByDescending(a => a.DateAdded).Take(2))
    .Where(g => g.First().ListPrice < g.Skip(1).First().ListPrice)
    .SelectMany(g => g)
    .ToList();

This should evaluate to fewer subqueries, which I believe may help your overall performance. It's also a bit simpler to follow.

Using a modification to Reed's answer, I was able to drop the average execution time from 20 seconds to 13 seconds on a 2000 listing batch. I used the solution below and dropped the average execution time to around 3 seconds. I needed to call .ToList() after I selected the initial collection but before the ListPrice comparison.

var PriceDrops = idxContext.ListingPriceChanges
        .Where(a => a.DateAdded >= LastRunTime && ListingIDsChunk.Contains(a.ListingID))
    .GroupBy(g => g.ListingID)
    .Where(g => g.Count() > 1)
    .Select(g => g.OrderByDescending(a => a.DateAdded).Take(2))
    .ToList()
    .Where(g => g.First().ListPrice < g.Last().ListPrice)
    .SelectMany(g => g)
    .ToList();

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM