
What is it that makes Enum.HasFlag so slow?

I was doing some speed tests and I noticed that Enum.HasFlag is about 16 times slower than using the bitwise operation.

Does anyone know the internals of Enum.HasFlag and why it is so slow? I mean, twice as slow wouldn't be too bad, but it makes the function unusable when it's 16 times slower.

In case anyone is wondering, here is the code I am using to test its speed.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace app
{
    public class Program
    {
        [Flags]
        public enum Test
        {
            Flag1 = 1,
            Flag2 = 2,
            Flag3 = 4,
            Flag4 = 8
        }
        static int num = 0;
        static Random rand;
        static void Main(string[] args)
        {
            int seed = (int)DateTime.UtcNow.Ticks;

            var st1 = new SpeedTest(delegate
            {
                Test t = Test.Flag1;
                t |= (Test)rand.Next(1, 9);
                if (t.HasFlag(Test.Flag4))
                    num++;
            });

            var st2 = new SpeedTest(delegate
            {
                Test t = Test.Flag1;
                t |= (Test)rand.Next(1, 9);
                if (HasFlag(t , Test.Flag4))
                    num++;
            });

            rand = new Random(seed);
            st1.Test();
            rand = new Random(seed);
            st2.Test();

            Console.WriteLine("Random to prevent optimizing out things {0}", num);
            Console.WriteLine("HasFlag: {0}ms {1}ms {2}ms", st1.Min, st1.Average, st1.Max);
            Console.WriteLine("Bitwise: {0}ms {1}ms {2}ms", st2.Min, st2.Average, st2.Max);
            Console.ReadLine();
        }
        static bool HasFlag(Test flags, Test flag)
        {
            return (flags & flag) != 0;
        }
    }
    [DebuggerDisplay("Average = {Average}")]
    class SpeedTest
    {
        public int Iterations { get; set; }

        public int Times { get; set; }

        public List<Stopwatch> Watches { get; set; }

        public Action Function { get; set; }

        public long Min { get { return Watches.Min(s => s.ElapsedMilliseconds); } }

        public long Max { get { return Watches.Max(s => s.ElapsedMilliseconds); } }

        public double Average { get { return Watches.Average(s => s.ElapsedMilliseconds); } }

        public SpeedTest(Action func)
        {
            Times = 10;
            Iterations = 100000;
            Function = func;
            Watches = new List<Stopwatch>();
        }

        public void Test()
        {
            Watches.Clear();
            for (int i = 0; i < Times; i++)
            {
                var sw = Stopwatch.StartNew();
                for (int o = 0; o < Iterations; o++)
                {
                    Function();
                }
                sw.Stop();
                Watches.Add(sw);
            }
        }
    }
}

Results:

HasFlag: 52ms 53.6ms 55ms
Bitwise: 3ms 3ms 3ms

Does anyone know the internals of Enum.HasFlag and why it is so slow?

The actual check is just a simple bit check in Enum.HasFlag - it's not the problem here. That being said, it is slower than your own bit check...

There are a couple of reasons for this slowdown:

First, Enum.HasFlag does an explicit check to make sure that the type of the enum and the type of the flag are both the same type, and from the same Enum. There is some cost in this check.

Secondly, there is an unfortunate box and unbox of the value during a conversion to UInt64 that occurs inside of HasFlag. This is, I believe, due to the requirement that Enum.HasFlag work with all enums, regardless of the underlying storage type.

That being said, there is a huge advantage to Enum.HasFlag - it's reliable, clean, and makes the code very obvious and expressive. For the most part, I feel that this makes it worth the cost - but if you're using this in a very performance critical loop, it may be worth doing your own check.
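The two costs named above can be observed from ordinary code. The following is a minimal sketch, not the framework's implementation: the `Permission` enum and the `BoxingDemo` class are hypothetical names introduced only for illustration. It shows that converting an enum value to `System.Enum` (the parameter type of `HasFlag`) produces a new boxed object each time, and that the type check rejects a flag from a different enum.

```csharp
using System;

[Flags]
enum Permission { None = 0, Read = 1, Write = 2 }   // hypothetical enum for illustration

static class BoxingDemo
{
    // System.Enum is a reference type, so converting an enum value to it boxes the value.
    // Two conversions of the same value therefore yield two distinct objects.
    public static bool BoxesAreDistinct(Permission p) =>
        !object.ReferenceEquals((Enum)p, (Enum)p);

    // HasFlag verifies that the flag has the same enum type as the instance
    // and throws ArgumentException when it does not.
    public static bool ThrowsOnMismatch()
    {
        try
        {
            Permission.Read.HasFlag(DayOfWeek.Monday);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(BoxesAreDistinct(Permission.Read)); // True
        Console.WriteLine(ThrowsOnMismatch());                // True
    }
}
```

A hand-written bitwise check on a concrete enum type avoids both steps, because the compiler already knows the types match and no conversion to `System.Enum` is needed.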

Decompiled code of Enum.HasFlag() looks like this:

public bool HasFlag(Enum flag)
{
    if (!base.GetType().IsEquivalentTo(flag.GetType()))
    {
        throw new ArgumentException(Environment.GetResourceString("Argument_EnumTypeDoesNotMatch", new object[] { flag.GetType(), base.GetType() }));
    }
    ulong num = ToUInt64(flag.GetValue());
    return ((ToUInt64(this.GetValue()) & num) == num);
}

If I were to guess, I would say that checking the type is what slows it down most.
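One detail worth noting when replacing HasFlag with a hand-written check: the decompiled body compares with `== num`, so every bit of the flag must be set, whereas the helper in the question compares with `!= 0`, which is satisfied by any single bit. For single-bit flags the two agree; for combined flags they do not. A small sketch of the difference, using a hypothetical `Access` enum and hypothetical helper names:

```csharp
using System;

[Flags]
enum Access { None = 0, Read = 1, Write = 2, ReadWrite = Read | Write }  // hypothetical enum

static class FlagChecks
{
    // Same semantics as Enum.HasFlag: all bits of 'flag' must be present in 'value'.
    public static bool HasAll(Access value, Access flag) => (value & flag) == flag;

    // Semantics of the helper in the question: at least one bit of 'flag' is present.
    public static bool HasAny(Access value, Access flag) => (value & flag) != 0;

    static void Main()
    {
        Access value = Access.Read;
        Console.WriteLine(value.HasFlag(Access.ReadWrite));  // False
        Console.WriteLine(HasAll(value, Access.ReadWrite));  // False
        Console.WriteLine(HasAny(value, Access.ReadWrite));  // True
    }
}
```

Which comparison is right depends on the intent; the point is only that a drop-in replacement for HasFlag should use the `== flag` form.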

Note that in recent versions of .NET Core, this has been improved and Enum.HasFlag compiles to the same code as using bitwise comparisons.

The performance penalty due to boxing discussed on this page also affects the public .NET functions Enum.GetValues and Enum.GetNames, which forward to (Runtime)Type.GetEnumValues and (Runtime)Type.GetEnumNames respectively.

All of these functions use a (non-generic) Array as a return type--which is not so bad for the names (since String is a reference type)--but is quite inappropriate for the ulong[] values.

Here's a peek at the offending code (.NET 4.7):

public override Array /* RuntimeType.*/ GetEnumValues()
{
    if (!this.IsEnum)
        throw new ArgumentException();

    ulong[] values = Enum.InternalGetValues(this);
    Array array = Array.UnsafeCreateInstance(this, values.Length);
    for (int i = 0; i < values.Length; i++)
    {
        var obj = Enum.ToObject(this, values[i]);   // ew. boxing.
        array.SetValue(obj, i);                     // yuck
    }
    return array;              // Array of object references, bleh.
}

We can see that prior to doing the copy, RuntimeType goes back again to System.Enum to get an internal array, a singleton which is cached, on demand, for each specific Enum. Notice also that this version of the values array does use the proper strong signature, ulong[].

Here's the .NET function (again we're back in System.Enum now). There's a similar function for getting the names (not shown).

internal static ulong[] InternalGetValues(RuntimeType enumType) => 
    GetCachedValuesAndNames(enumType, false).Values;

See the return type? This looks like a function we'd like to use... But first consider that a second reason that .NET re-copies the array each time (as you saw above) is that .NET must ensure that each caller gets an unaltered copy of the original data, given that a malevolent coder could change her copy of the returned Array, introducing a persistent corruption. Thus, the re-copying precaution is especially intended to protect the cached internal master copy.
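The defensive copying is easy to confirm through the public API alone. A minimal sketch (the `CopyDemo` class and its method names are made up for this illustration): each call to Enum.GetValues hands back a different array object, and tampering with one copy does not affect the next.

```csharp
using System;

static class CopyDemo
{
    // Each call to Enum.GetValues returns a newly allocated array.
    public static bool ReturnsFreshCopy()
    {
        Array first = Enum.GetValues(typeof(DayOfWeek));
        Array second = Enum.GetValues(typeof(DayOfWeek));
        return !object.ReferenceEquals(first, second);
    }

    // Overwriting an element of one copy leaves later copies untouched.
    public static DayOfWeek FirstValueAfterTampering()
    {
        var mine = (DayOfWeek[])Enum.GetValues(typeof(DayOfWeek));
        mine[0] = DayOfWeek.Saturday;
        var fresh = (DayOfWeek[])Enum.GetValues(typeof(DayOfWeek));
        return fresh[0];
    }

    static void Main()
    {
        Console.WriteLine(ReturnsFreshCopy());          // True
        Console.WriteLine(FirstValueAfterTampering());  // Sunday
    }
}
```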

If you aren't worried about that risk, perhaps because you feel confident you won't accidentally change the array, or maybe just to eke out a few cycles of (what's surely premature) optimization, it's simple to fetch the internal cached array copy of the names or values for any Enum:

→ The following two functions comprise the sum contribution of this article ←
→ (but see edit below for improved version) ←

static ulong[] GetEnumValues<T>() where T : struct =>
        (ulong[])typeof(System.Enum)
            .GetMethod("InternalGetValues", BindingFlags.Static | BindingFlags.NonPublic)
            .Invoke(null, new[] { typeof(T) });

static String[] GetEnumNames<T>() where T : struct =>
        (String[])typeof(System.Enum)
            .GetMethod("InternalGetNames", BindingFlags.Static | BindingFlags.NonPublic)
            .Invoke(null, new[] { typeof(T) });

Note that the generic constraint on T isn't fully sufficient for guaranteeing Enum. For simplicity, I left off checking any further beyond struct, but you might want to improve on that. Also for simplicity, this (ref-fetches and) reflects directly off the MethodInfo every time rather than trying to build and cache a Delegate. The reason for this is that creating the proper delegate with a first argument of non-public type RuntimeType is tedious. A bit more on this below.
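As for tightening the constraint, here are two possible approaches, sketched against the public Enum.GetValues rather than the internal method so that the example does not depend on framework internals. The class and method names are hypothetical. The first adds a runtime guard and works with any C# version; the second assumes C# 7.3 or later, where `System.Enum` is allowed as a generic constraint.

```csharp
using System;

static class EnumGuards
{
    // Runtime guard: rejects non-enum structs such as int when the method is called.
    public static T[] GetValuesChecked<T>() where T : struct
    {
        if (!typeof(T).IsEnum)
            throw new ArgumentException(typeof(T).Name + " is not an enum type.");
        return (T[])Enum.GetValues(typeof(T));
    }

    // Compile-time guard (assumes C# 7.3+): non-enum type arguments do not compile.
    public static T[] GetValuesConstrained<T>() where T : struct, Enum =>
        (T[])Enum.GetValues(typeof(T));

    static void Main()
    {
        Console.WriteLine(GetValuesChecked<DayOfWeek>().Length);      // 7
        Console.WriteLine(GetValuesConstrained<DayOfWeek>().Length);  // 7
        try { GetValuesChecked<int>(); }
        catch (ArgumentException e) { Console.WriteLine(e.Message); } // Int32 is not an enum type.
    }
}
```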

First, I'll wrap up with usage examples:

var values = GetEnumValues<DayOfWeek>();
var names = GetEnumNames<DayOfWeek>();

and debugger results:

'values'    ulong[7]
[0] 0
[1] 1
[2] 2
[3] 3
[4] 4
[5] 5
[6] 6

'names' string[7]
[0] "Sunday"
[1] "Monday"
[2] "Tuesday"
[3] "Wednesday"
[4] "Thursday"
[5] "Friday"
[6] "Saturday"

So I mentioned that the "first argument" of Func<RuntimeType,ulong[]> is annoying to reflect over. However, because this "problem" arg happens to be first, there's a cute workaround where you can bind each specific Enum type as a Target of its own delegate, where each is then reduced to Func<ulong[]>.

Clearly, it's pointless to make any of those delegates, since each would just be a function that always returns the same value... but the same logic seems to apply, perhaps less obviously, to the original situation as well (i.e., Func<RuntimeType,ulong[]>). Although we do get by with just one delegate here, you'd never really want to call it more than once per Enum type. Anyway, all of this leads to a much better solution, which is included in the edit below.
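For readers unfamiliar with the binding trick mentioned above, here is a rough sketch of a delegate closed over its first argument. To avoid depending on non-public members, it binds the public `Enum.GetNames(Type)` instead of the internal method discussed in this answer; that substitution, along with the `ClosedDelegateDemo` and `BindNames` names, is an assumption made for illustration.

```csharp
using System;
using System.Reflection;

static class ClosedDelegateDemo
{
    // Builds a Func<string[]> whose first (and only) argument, the enum Type,
    // is bound into the delegate itself, so callers pass no arguments.
    public static Func<string[]> BindNames(Type enumType)
    {
        MethodInfo getNames = typeof(Enum).GetMethod("GetNames", new[] { typeof(Type) });
        return (Func<string[]>)Delegate.CreateDelegate(typeof(Func<string[]>), enumType, getNames);
    }

    static void Main()
    {
        Func<string[]> dayNames = BindNames(typeof(DayOfWeek));
        Console.WriteLine(string.Join(",", dayNames()));
    }
}
```

As the answer notes, a delegate like this always returns the same data for its bound type, which is why caching the result once per type is the more natural design.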


[edit:]
Here's a slightly more elegant version of the same thing. If you will be calling the functions repeatedly for the same Enum type, the version shown here will only use reflection one time per Enum type. It saves the results in a locally-accessible cache for extremely rapid access subsequently.

static class enum_info_cache<T> where T : struct
{
    static enum_info_cache()   // a static constructor must be named after its class
    {
        values = (ulong[])typeof(System.Enum)
            .GetMethod("InternalGetValues", BindingFlags.Static | BindingFlags.NonPublic)
            .Invoke(null, new[] { typeof(T) });

        names = (String[])typeof(System.Enum)
            .GetMethod("InternalGetNames", BindingFlags.Static | BindingFlags.NonPublic)
            .Invoke(null, new[] { typeof(T) });
    }
    public static readonly ulong[] values;
    public static readonly String[] names;
};

The two functions become trivial:

static ulong[] GetEnumValues<T>() where T : struct => enum_info_cache<T>.values;
static String[] GetEnumNames<T>() where T : struct => enum_info_cache<T>.names;

The code shown here illustrates a pattern of combining three specific tricks that seem to mutually result in an unusually elegant lazy caching scheme. I've found the particular technique to have surprisingly wide application.

  1. using a generic static class to cache independent copies of the arrays for each distinct Enum. Notably, this happens automatically and on demand;

  2. related to this, the loader lock guarantees unique atomic initialization and does this without the clutter of conditional checking constructs. We can also protect static fields with readonly (which, for obvious reasons, typically can't be used with other lazy/deferred/demand methods);

  3. finally, we can capitalize on C# type inference to automatically map the generic function (entry point) into its respective generic static class, so that the demand caching is ultimately even driven implicitly (viz., the best code is the code that isn't there--since it can never have bugs)
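Points 1 and 2 can be checked with a variant that relies only on public API. This is a sketch under that assumption, not the code from the edit above: `EnumCache<T>`, `InitCounter`, and `CacheDemo` are hypothetical names, and the counter exists only to make the once-per-type initialization visible.

```csharp
using System;
using System.Threading;

static class InitCounter
{
    // Number of closed EnumCache<T> types whose static constructor has run.
    public static int Count;
}

static class EnumCache<T> where T : struct
{
    public static readonly T[] Values;

    // The runtime runs this at most once per closed type (once for DayOfWeek,
    // once for ConsoleColor, ...), on first access, without any explicit locking here.
    static EnumCache()
    {
        Interlocked.Increment(ref InitCounter.Count);
        Values = (T[])Enum.GetValues(typeof(T));
    }
}

static class CacheDemo
{
    public static T[] GetValues<T>() where T : struct => EnumCache<T>.Values;

    static void Main()
    {
        GetValues<DayOfWeek>();
        GetValues<DayOfWeek>();
        GetValues<ConsoleColor>();
        Console.WriteLine(InitCounter.Count);  // 2
        // Repeated calls return the same cached array instance.
        Console.WriteLine(object.ReferenceEquals(GetValues<DayOfWeek>(), GetValues<DayOfWeek>())); // True
    }
}
```

Note that handing out the cached array directly carries the same mutation risk the framework's copying guards against.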

You probably noticed that the particular example shown here doesn't really illustrate point (3) very well. Rather than relying on type inference, the void-taking function has to manually propagate forward the type argument T. I didn't choose to expose these simple functions such that there would be an opportunity to show how C# type inference makes the overall technique shine...

However, you can imagine that when you do combine this with a static generic function that can infer its type argument(s)--i.e., so you don't even have to provide them at the call site--then it gets quite powerful.

The key insight is that, while generic functions have the full type-inference capability, generic classes do not, that is, the compiler will never infer T if you try to call the first of the following lines. But we can still get fully inferred access to a generic class, and all the benefits that entails, by traversing into it via generic function implicit typing (last line):

int t = 4;
typed_cache<int>.MyTypedCachedFunc(t);  // no inference from 't', explicit type required

MyTypedCacheFunc<int>(t);               // ok, (but redundant)

MyTypedCacheFunc(t);                    // ok, full inference

Designed well, inferred typing can effortlessly launch you into the appropriate automatically demand-cached data and behaviors, customized for each type (recall points 1 and 2). As noted, I find the approach useful, especially considering its simplicity.
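A compact, compilable illustration of that last point follows. The names `TypedCache<T>`, `TypeName`, and `Describe` are invented for this sketch; the cached datum is just the type's name, standing in for whatever per-type data or behavior one would really cache.

```csharp
using System;

static class TypedCache<T>
{
    // Computed once per closed type, on first access.
    public static readonly string TypeName = typeof(T).Name;
}

static class InferenceDemo
{
    // The generic method lets the compiler infer T from the argument,
    // then forwards into the matching closed generic class.
    public static string Describe<T>(T value) => TypedCache<T>.TypeName;

    static void Main()
    {
        int t = 4;
        Console.WriteLine(TypedCache<int>.TypeName);  // Int32 (class access: T must be written out)
        Console.WriteLine(Describe(t));               // Int32 (T inferred from 't')
        Console.WriteLine(Describe("hello"));         // String
        Console.WriteLine(Describe<long>(t));         // Int64 (explicit T overrides inference)
    }
}
```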

The JITter ought to be inlining this as a simple bitwise operation. The JITter is aware enough to custom-handle even certain framework methods (via MethodImplOptions.InternalCall I think?) but HasFlag seems to have escaped Microsoft's serious attention.
