Entity Framework query caching

This MSDN article lists a whole bunch of ways to improve your Entity Framework performance:

https://msdn.microsoft.com/en-us/data/hh949853.aspx

One of its suggestions (4.3) is to copy the property of a non-mapped object into a local variable before using it in a query, so that EF can cache its internal query plan.
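To make the difference between the two forms concrete, the following sketch inspects the expression trees the C# compiler builds for each predicate. It uses only System.Linq.Expressions; the Quote and EFQuote classes are simplified stand-ins, and MemberDepth is a hypothetical helper written for this illustration. It shows only the shape of the tree that is handed to EF, not what EF does with it internally.

```csharp
using System;
using System.Linq.Expressions;

public class Quote { public int QuoteId { get; set; } }   // stand-in for the non-mapped object
public class EFQuote { public int Id { get; set; } }      // stand-in for the mapped entity

public static class ExpressionShapes
{
    // Counts the chained member accesses on the right-hand side of "x.Id == <rhs>".
    public static int MemberDepth(Expression<Func<EFQuote, bool>> predicate)
    {
        var comparison = (BinaryExpression)predicate.Body;
        var depth = 0;
        var node = comparison.Right;
        while (node is MemberExpression member)
        {
            depth++;
            node = member.Expression;   // walk towards the captured closure object
        }
        return depth;
    }

    public static void Demo()
    {
        var quote = new Quote { QuoteId = 42 };
        var quoteId = quote.QuoteId;

        // Indirect: closure field "quote", then property "QuoteId".
        Console.WriteLine(MemberDepth(x => x.Id == quote.QuoteId));   // prints 2
        // Direct: closure field "quoteId" only.
        Console.WriteLine(MemberDepth(x => x.Id == quoteId));         // prints 1
    }
}
```

The indirect form reaches the value through two member accesses, while the local variable needs only one. According to the MSDN article, it is the first shape that prevents the query plan from being cached.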

Sounds like a great idea. So I put it to the test with a simple query, comparing an indirect property reference against a local variable over 10,000 iterations. Like so:

[Fact]
public void TestQueryCaching()
{
    const int iterations = 1000;

    var quote = new Quote();
    using (var ctx = new CoreContext())
    {
        quote.QuoteId = ctx.Quotes.First().Id;
    }

    double indirect = 0;
    double direct = 0;

    10.Times(it =>
    {
        indirect += PerformCoreDbTest(iterations, "IndirectValue", (ctx, i) =>
           {
               var dbQuote = ctx.Quotes.First(x => x.Id == quote.QuoteId);
           }).TotalSeconds;
        direct += PerformCoreDbTest(iterations, "DirectValue", (ctx, i) =>
            {
                var quoteId = quote.QuoteId;
                var dbQuote = ctx.Quotes.First(x => x.Id == quoteId);
            }).TotalSeconds;
    });

    _logger.Debug($"Indirect seconds: {indirect:0.00}, direct seconds:{direct:0.00}");
}

protected TimeSpan PerformCoreDbTest(int iterations, string descriptor, Action<ICoreContext, int> testAction)
{
    var sw = new Stopwatch();
    sw.Start();
    for (var i = 0; i < iterations; i++)
    {
        using (var ctx = new CoreContext())
        {
            testAction(ctx, i);
        }
    }
    sw.Stop();
    _logger.DebugFormat("{0}: Took {1} milliseconds for {2} iterations",
        descriptor, sw.Elapsed.TotalMilliseconds, iterations);
    return sw.Elapsed;
}

But I'm not seeing any real performance benefit. Here are the results of five runs on two different machines:

Machine1 - Indirect seconds: 9.06, direct seconds:9.36
Machine1 - Indirect seconds: 9.98, direct seconds:9.84
Machine2 - Indirect seconds: 22.41, direct seconds:20.38
Machine2 - Indirect seconds: 17.27, direct seconds:16.93
Machine2 - Indirect seconds: 16.35, direct seconds:16.32

Using a local variable (the "direct" approach that the MSDN article recommends) is maybe the tiniest bit faster in four of the five runs, but not consistently, and not by much.

Am I doing something wrong in my testing? Or is the effect really so slight that it doesn't make much difference? Or is the MSDN article basically wrong, and that way of referring to objects doesn't make any difference to query caching?

** Edits 10/9/16 ** I modified the query to (a) make it more complex, and (b) pass in a different quoteId each time. I suspect the latter is important, as otherwise the query does in fact get cached, since there aren't any parameters. See the answer from @raderick below.

Here's the more complex test:

[Fact]
public void TestQueryCaching()
{
    const int iterations = 1000;

    List<EFQuote> quotes;
    using (var ctx = new CoreContext())
    {
        quotes = ctx.Quotes.Take(iterations).ToList();
    }

    double indirect = 0;
    double direct = 0;
    double iqueryable = 0;

    10.Times(it =>
    {
        indirect += PerformCoreDbTest(iterations, "IndirectValue", (ctx, i) =>
        {
            var quote = quotes[i];
            var dbQuote = ctx.Quotes
             .Include(x => x.QuoteGroup.QuoteGroupElements.Select(e => e.DefaultElement.DefaultChoices))
             .Include(x => x.QuoteElements.Select(e => e.DefaultElement.DefaultChoices))
             .Include(x => x.QuotePackage)
             .Include(x => x.QuoteDefinition)
             .Include(x => x.QuoteLines)
             .First(x => x.Id == quote.Id);
        }).TotalSeconds;
        direct += PerformCoreDbTest(iterations, "DirectValue", (ctx, i) =>
        {
            var quoteId = quotes[i].Id;
            var dbQuote = ctx.Quotes
                .Include(x => x.QuoteGroup.QuoteGroupElements.Select(e => e.DefaultElement.DefaultChoices))
                .Include(x => x.QuoteElements.Select(e => e.DefaultElement.DefaultChoices))
                .Include(x => x.QuotePackage)
                .Include(x => x.QuoteDefinition)
                .Include(x => x.QuoteLines)
                .First(x => x.Id == quoteId);
        }).TotalSeconds;
        iqueryable += PerformCoreDbTest(iterations, "IQueryable", (ctx, i) =>
        {
            var quoteId = quotes[i].Id;
            var dbQuote = ctx.Quotes
                    .Include(x => x.QuoteGroup.QuoteGroupElements.Select(e => e.DefaultElement.DefaultChoices))
                    .Include(x => x.QuoteElements.Select(e => e.DefaultElement.DefaultChoices))
                    .Include(x => x.QuotePackage)
                    .Include(x => x.QuoteDefinition)
                    .Include(x => x.QuoteLines)
                    .Where(x => x.Id == quoteId).First();
        }).TotalSeconds;
    });

    _logger.Debug($"Indirect seconds: {indirect:0.00}, direct seconds:{direct:0.00}, iqueryable seconds:{iqueryable:0.00}");
}

And the results (over 10,000 total iterations) are much more like what the MSDN article above describes:

Indirect seconds: 141.32, direct seconds:91.95, iqueryable seconds:93.96

I am not 100% sure that this article describes the current behavior of Entity Framework version 6, but this should be related to how Entity Framework compiles queries into SQL statements.

When you first run a query using Entity Framework, EF has to compile it into an SQL statement: either a plain SELECT query, or a parametrized statement executed through exec sp_executesql. For example:

exec sp_executesql N'SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent1].[IssuedAt] AS [IssuedAt], 
    [Extent1].[Status] AS [Status], 
    [Extent1].[Foo_Id] AS [Foo_Id]
    FROM [dbo].[Activities] AS [Extent1]
    WHERE (N''Some Name'' = [Extent1].[Name]) AND ([Extent1].[Id] = @p__linq__0)',N'@p__linq__0 int',@p__linq__0=0

@p__linq__0 is a parameter in the query, so every time you change Id in the query code, Entity Framework picks this exact same statement from the query cache and executes it without compiling SQL for it again. On the other hand, the N''Some Name'' = [Extent1].[Name] part corresponds to the code x.Name == "Some Name". I used a constant here, so it was translated not into a query parameter but into a literal part of the query statement.

Each time you run a query, Entity Framework checks a cache of compiled SQL statements to see if there is an already compiled statement it can re-use with parameters. If that statement is not found, Entity Framework has to compile the C# query to SQL again. So if your queries are small and quick to compile, you won't notice anything, but if you have "hard-to-compile" queries with a lot of includes, conditions, transformations and built-in function usage, you can pay a heavy penalty whenever a query misses the Entity Framework compiled query cache.
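The lookup described above can be pictured with a toy cache. This is only a sketch of the idea, not Entity Framework's implementation: ToyPlanCache, the string used as the cache key, and the compilation counter are all hypothetical, and EF's real cache keys and compilation pipeline are internal and more involved.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: it illustrates why an inlined value defeats a plan cache.
public sealed class ToyPlanCache
{
    private readonly Dictionary<string, string> _plans = new Dictionary<string, string>();

    // Number of times the "expensive" compile step actually ran.
    public int Compilations { get; private set; }

    // "shape" stands in for the query with its parameter placeholders left in place.
    public string GetOrCompile(string shape)
    {
        if (!_plans.TryGetValue(shape, out var plan))
        {
            Compilations++;                  // cache miss: compile and remember the plan
            plan = "compiled:" + shape;
            _plans[shape] = plan;
        }
        return plan;
    }
}

public static class ToyPlanCacheDemo
{
    // Each distinct value becomes part of the key, so every query is a cache miss.
    public static int CompileCountWithInlinedValues(int queries)
    {
        var cache = new ToyPlanCache();
        for (var id = 0; id < queries; id++)
            cache.GetOrCompile("WHERE Id = " + id);
        return cache.Compilations;
    }

    // The value travels separately as a parameter, so the key never changes.
    public static int CompileCountWithParameter(int queries)
    {
        var cache = new ToyPlanCache();
        for (var id = 0; id < queries; id++)
            cache.GetOrCompile("WHERE Id = @p0");
        return cache.Compilations;
    }
}
```

With 1,000 distinct ids, the inlined variant compiles 1,000 times and the parametrized variant compiles once. That is the kind of gap that only becomes visible when compilation is expensive, as with the Include-heavy query in the second test.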

A similar effect shows up with paging: when Skip and Take are called with plain integer values instead of their lambda overloads, changing the page misses the compiled query cache. See: Force Entity Framework to use SQL parameterization for better SQL proc cache reuse

You can run into this effect when using constants in your code, and it is quite non-obvious. Let's compare these code samples and the SQL that Entity Framework produces (I omitted class definitions for brevity; they should be pretty obvious):

Query 1

Sample Code:

var result = context.Activities
                    .Where(x => x.IssuedAt >= DateTime.UtcNow && x.Id == iteration)    
                    .ToList(); 

Produced SQL:

exec sp_executesql N'SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent1].[IssuedAt] AS [IssuedAt], 
    [Extent1].[Status] AS [Status], 
    [Extent1].[Foo_Id] AS [Foo_Id]
    FROM [dbo].[Activities] AS [Extent1]
    WHERE ([Extent1].[IssuedAt] >= (SysUtcDateTime())) AND ([Extent1].[Id] = @p__linq__0)',N'@p__linq__0 int',@p__linq__0=0

You can see that in this case the condition x.IssuedAt >= DateTime.UtcNow is translated to the statement [Extent1].[IssuedAt] >= (SysUtcDateTime()).

Query 2

Sample Code:

var now = DateTime.UtcNow;

var result = context.Activities
                    .Where(x => x.IssuedAt >= now && x.Id == iteration)
                    .ToList();

Produced SQL:

exec sp_executesql N'SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent1].[IssuedAt] AS [IssuedAt], 
    [Extent1].[Status] AS [Status], 
    [Extent1].[Foo_Id] AS [Foo_Id]
    FROM [dbo].[Activities] AS [Extent1]
    WHERE ([Extent1].[IssuedAt] >= @p__linq__0) AND ([Extent1].[Id] = @p__linq__1)',N'@p__linq__0 datetime2(7),@p__linq__1 int',@p__linq__0='2016-10-09 15:27:37.3798971',@p__linq__1=0

In this case, you can see that the condition x.IssuedAt >= now was translated to [Extent1].[IssuedAt] >= @p__linq__0, a parametrized statement, and the DateTime value was passed as a parameter.

You can clearly see the difference from Query 1, where the condition was part of the query text without a parameter and used a built-in function to get the date and time.

These two queries hint that using constants in Entity Framework produces different queries than using fields, properties, arguments, and so on. That was a somewhat synthetic example, so let's look at something closer to a real query.

Query 3

Here I use the enum ActivityStatus. I want to query for an Activity with a specific Id, and I only want activities whose status is "Active" (whatever that means).

Sample Code:

var result = context.Activities
    .Where(x => x.Status == ActivityStatus.Active 
                && x.Id == id)
    .ToList();

Produced SQL:

exec sp_executesql N'SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent1].[IssuedAt] AS [IssuedAt], 
    [Extent1].[Status] AS [Status], 
    [Extent1].[Foo_Id] AS [Foo_Id]
    FROM [dbo].[Activities] AS [Extent1]
    WHERE (0 = [Extent1].[Status]) AND ([Extent1].[Id] = @p__linq__0)',N'@p__linq__0 int',@p__linq__0=0

You can see that the constant in the condition x.Status == ActivityStatus.Active produces the SQL 0 = [Extent1].[Status], which is correct. Status is not parametrized here, so if you run the same query somewhere else with the condition x.Status == ActivityStatus.Pending, that produces a different query, which will trigger Entity Framework query compilation the first time it is called. You can avoid this by using Query 4 for both.

Query 4

Sample Code:

var status = ActivityStatus.Active;

var result = context.Activities
                    .Where(x => x.Status == status
                                && x.Id == iteration)
                    .ToList();

Produced SQL:

exec sp_executesql N'SELECT 
    [Extent1].[Id] AS [Id], 
    [Extent1].[Name] AS [Name], 
    [Extent1].[IssuedAt] AS [IssuedAt], 
    [Extent1].[Status] AS [Status], 
    [Extent1].[Foo_Id] AS [Foo_Id]
    FROM [dbo].[Activities] AS [Extent1]
    WHERE ([Extent1].[Status] = @p__linq__0) AND ([Extent1].[Id] = @p__linq__1)',N'@p__linq__0 int,@p__linq__1 int',@p__linq__0=0,@p__linq__1=0

As you can see, this query statement is fully parametrized, so changing the status to Pending, Active, Inactive, etc. will still use the same query from the compiled query cache.
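A quick way to see why Query 3 and Query 4 end up different is to look at the expression tree that the C# compiler hands to Entity Framework. The sketch below uses only System.Linq.Expressions; the Activity and ActivityStatus definitions are minimal stand-ins for the omitted classes, and RightOperandKind is a hypothetical helper. It shows the tree shape only, not EF's translation of it.

```csharp
using System;
using System.Linq.Expressions;

public enum ActivityStatus { Active = 0, Pending = 1 }   // stand-in enum

public class Activity                                    // stand-in entity
{
    public int Id { get; set; }
    public ActivityStatus Status { get; set; }
}

public static class OperandKinds
{
    // Returns the node type of the right-hand operand of "x.Status == <rhs>",
    // skipping any Convert nodes the compiler wraps around enum operands.
    public static ExpressionType RightOperandKind(Expression<Func<Activity, bool>> predicate)
    {
        var node = ((BinaryExpression)predicate.Body).Right;
        while (node.NodeType == ExpressionType.Convert)
            node = ((UnaryExpression)node).Operand;
        return node.NodeType;
    }

    public static void Demo()
    {
        var status = ActivityStatus.Active;

        // Enum literal, as in Query 3.
        Console.WriteLine(RightOperandKind(x => x.Status == ActivityStatus.Active));   // prints Constant
        // Captured local variable, as in Query 4.
        Console.WriteLine(RightOperandKind(x => x.Status == status));                  // prints MemberAccess
    }
}
```

The literal arrives as a Constant node, which lines up with the inlined 0 in the SQL of Query 3. The local variable arrives as a member access on a compiler-generated closure object, which lines up with the @p__linq__0 parameter in Query 4.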

Depending on your coding style, you might run into this issue from time to time: two queries that differ only in a constant value will each be compiled separately. You can try the same query with booleans as constants; it should produce the same result, with the condition left unparametrized.
