How to implement a C# generic class whose methods are generic on structs?

Say we have this pair of structs, part of a widely used interchange format (so we cannot modify the source - and ideally shouldn't want to: we're not trying to alter the data itself):

struct Vector2 { public int x; public int y; }
struct Vector3 { public int x; public int y; public int z; }

And a class whose core is a list of one or the other, and which contains many algorithms that are nearly identical to implement for either struct, but have to reference the extra member in the 3-element struct (the z):

public class Mesh<T>
{
    private List<T> _myVectors;
}

...how do you correctly implement the suite of methods that process them? e.g.:

public int Average()
{
    // if Vector2:
    int sum = 0, count = 0;
    foreach( var v2 in _myVectors )
    {
        sum += v2.x + v2.y;
        count++;
    }

    // if Vector3:
    int sum = 0, count = 0;
    foreach( var v3 in _myVectors )
    {
        sum += v3.x + v3.y + v3.z;
        count++;
    }

    return sum/count;
}

Noting especially: struct-specific methods already exist (millions of them), offered by APIs that lack any built-in generics. So, e.g., we can confidently write an algorithm that uses a method in FOREIGN_API, knowing that one copy of the source code (but using generics) will bind to an acceptable implementation either way:

public float FOREIGN_API_Average( Vector2 input );
public float FOREIGN_API_Average( Vector3 input );

The problems I'm trying to wrap my head around here are, approximately:

  1. It is the // if Vector2: conceptual part in the example above that I can't figure out how to do in C#.
  2. I'm simply not sure how to structure this. It feels like I have to do some mildly clever trick around stating "I have arbitrary generic args. But I have some specific versions of this class I will implement internally. (All other non-specific versions are illegal by implication, and I'll implement a set of methods that throw Exceptions)". But... I'm not sure how to pull that off.
  3. Structs cannot "extend" one another. If they were classes I would start by constraining the using-class (Mesh) with a where T: BaseVector, and go from there. But that's not possible with structs.
  4. The structs come from a (billion dollar) piece of software that I don't own; there are plenty of architectural decisions I wish they'd made differently, but TL;DR: got to work with what we've got.
  5. This problem isn't just two structs: to support 3D math, I have to re-implement everything for Vector1, Vector2, Vector3, Vector4 ... there's a lot of code I don't want to copy/pasta!
  6. ...and the typical Mesh class has 4 internal lists, each of which can be any of those 4 types. If I have to write every combination by hand, and not use generics, I will have 4^4 = 256 copies of the same code to write and maintain.

Truly makes you envy C++ and their stupid sexy templates, doesn't it?

Some assumptions first (correct them if they are wrong):

You've said that the mesh type can have four different lists, so I'll assume its signature is Mesh<T1, T2, T3, T4>. I'm also assuming you control this type, but not the VectorN types.

The issue is that you're lacking any generic support for Vectors and you cannot use polymorphism on them in any way. As you've said, wrapping them in an interface or introducing custom classes as wrappers will kill the performance.

So the thing you want to do is a variation on double-dispatch - call a different method based on the type of its arguments.

The simplest thing that comes to mind is a static wrapper for the existing FOREIGN_API calls:

public static class VectorExtensions
{
    public static int Sum<TVector>(this IEnumerable<TVector> vectors)
    {
        var type = typeof(TVector);
        if (type == typeof(Vector1))
        {
            return FOREIGN_API.Sum((IEnumerable<Vector1>)vectors);
        }
        else if (type == typeof(Vector2))
        {
            return FOREIGN_API.Sum((IEnumerable<Vector2>)vectors);
        }
        else if (...) // etc.

        throw new ArgumentException($"Invalid type of vector {typeof(TVector).Name}.");
    }
}

Then, implementing an Average on a mesh is easy (I'm assuming an average is an average of all lists combined):

public class Mesh<T1, T2, T3, T4>
{
    private List<T1> _myVectors1;
    private List<T2> _myVectors2;
    private List<T3> _myVectors3;
    private List<T4> _myVectors4;

    public float Average()
    {
        var sum1 = _myVectors1.Sum();
        var sum2 = _myVectors2.Sum();
        var sum3 = _myVectors3.Sum();
        var sum4 = _myVectors4.Sum();

        return (float)(sum1 + sum2 + sum3 + sum4) / 
            (_myVectors1.Count + _myVectors2.Count + _myVectors3.Count + _myVectors4.Count);
    }
}

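This form of type checking should be fast: for value-type arguments the runtime compiles a separate copy of a generic method per type, so the JIT can typically resolve the typeof comparisons ahead of the actual call.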

There is a simpler way of writing this that involves dynamic:

public static class VectorExtensions
{
    public static int Sum<TVector>(this IEnumerable<TVector> vectors) =>
        FOREIGN_API.Sum((dynamic)vectors);
}

The dynamic infrastructure is also faster than many expect thanks to call-site caching, so you might want to give this solution a try first and think about something else only once profiling shows that performance is an issue. As you can see, it takes a ridiculously small amount of code to try out.
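
To try the dynamic variant without access to the real API, here is a minimal self-contained sketch. ForeignApi below is a hypothetical stand-in for FOREIGN_API, and SumDynamic and the struct definitions are illustrative copies, not part of the original code. It also shows what happens for an unsupported type: the runtime binder finds no matching overload and throws a RuntimeBinderException, which is translated here into the same ArgumentException as in the typeof version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CSharp.RuntimeBinder;

public struct Vector1 { public int x; }
public struct Vector2 { public int x; public int y; }

// Hypothetical stand-in for the non-generic foreign API.
public static class ForeignApi
{
    public static int Sum(IEnumerable<Vector1> vectors) => vectors.Sum(v => v.x);
    public static int Sum(IEnumerable<Vector2> vectors) => vectors.Sum(v => v.x + v.y);
}

public static class DynamicVectorExtensions
{
    public static int SumDynamic<TVector>(this IEnumerable<TVector> vectors)
    {
        try
        {
            // The overload is chosen at run time from the actual type of 'vectors'.
            return ForeignApi.Sum((dynamic)vectors);
        }
        catch (RuntimeBinderException ex)
        {
            // No overload accepts this element type.
            throw new ArgumentException(
                $"Invalid type of vector {typeof(TVector).Name}.", ex);
        }
    }
}
```

Calling new List<Vector2> { ... }.SumDynamic() binds to the Vector2 overload, while a List<string> ends up in the catch block. The sketch assumes the collection is a public type such as List<T> or an array.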

=============================================================================

Now let's assume we're looking for the most performant way possible. I'm pretty convinced that there's no way of entirely avoiding runtime type checking. Note that in the case above there are only a handful of type checks per method invocation. Unless you're calling the Mesh<,,,> methods millions of times, that should be fine. But assuming that you might want to do that, there's a way to trick our way out of this.

The idea is to perform all the required type checks the moment you instantiate a mesh. Let us define helper types that we will call VectorOperationsN for all possible N in the VectorN types. Each will implement an interface IVectorOperations<TVector> that defines the basic vector operations you want to have. Let's go with Sum for one or many vectors for now, just as examples:

public interface IVectorOperations<TVector>
{
    // Interface members are implicitly public; an explicit modifier only compiles on C# 8 and later.
    int Sum(TVector vector);

    int Sum(IEnumerable<TVector> vectors);
}

public class VectorOperations1 : IVectorOperations<Vector1>
{
    public int Sum(Vector1 vector) => vector.x;

    public int Sum(IEnumerable<Vector1> vectors) => vectors.Sum(v => Sum(v));
}


public class VectorOperations2 : IVectorOperations<Vector2>
{
    public int Sum(Vector2 vector) => vector.x + vector.y;

    public int Sum(IEnumerable<Vector2> vectors) => vectors.Sum(v => Sum(v));
}

Now we need a way to get the appropriate implementation - this will involve the type check:

public static class VectorOperations
{
    public static IVectorOperations<TVector> GetFor<TVector>()
    {
        var type = typeof(TVector);

        if (type == typeof(Vector1))
        {
            return (IVectorOperations<TVector>)new VectorOperations1();
        }
        else if (...) // etc.

        throw new ArgumentException($"Invalid type of vector {typeof(TVector).Name}.");
    }
}

Now when we create a mesh we will get an appropriate implementation and then use it throughout our methods:

public class Mesh<T1, T2, T3, T4>
{
    private List<T1> _myVectors1;
    private List<T2> _myVectors2;
    private List<T3> _myVectors3;
    private List<T4> _myVectors4;
    private readonly IVectorOperations<T1> _operations1;
    private readonly IVectorOperations<T2> _operations2;
    private readonly IVectorOperations<T3> _operations3;
    private readonly IVectorOperations<T4> _operations4;

    public Mesh()
    {
        _operations1 = VectorOperations.GetFor<T1>();
        _operations2 = VectorOperations.GetFor<T2>();
        _operations3 = VectorOperations.GetFor<T3>();
        _operations4 = VectorOperations.GetFor<T4>();
    }

    public float Average()
    {
        var sum1 = _operations1.Sum(_myVectors1);
        var sum2 = _operations2.Sum(_myVectors2);
        var sum3 = _operations3.Sum(_myVectors3);
        var sum4 = _operations4.Sum(_myVectors4);

        return (float)(sum1 + sum2 + sum3 + sum4) / 
            (_myVectors1.Count + _myVectors2.Count + _myVectors3.Count + _myVectors4.Count);
    }
}

This works and does a type check only when instantiating the mesh. Success. But we can optimize this further using two tricks.

One, we don't need new instances of IVectorOperations<TVector> implementations. We can make them singletons and never instantiate more than one object for one type of vector. This is perfectly safe as the implementations are always stateless anyway.

public static class VectorOperations
{
    private static readonly VectorOperations1 Implementation1 = new VectorOperations1();
    private static readonly VectorOperations2 Implementation2 = new VectorOperations2();
    ... // etc.

    public static IVectorOperations<TVector> GetFor<TVector>()
    {
        var type = typeof(TVector);

        if (type == typeof(Vector1))
        {
            return (IVectorOperations<TVector>)Implementation1;
        }
        else if (...) // etc.

        throw new ArgumentException($"Invalid type of vector {typeof(TVector).Name}.");
    }
}

Two, we don't really need to type check every time we instantiate a new mesh. It's easy to see that the implementations stay the same for every object of a mesh type with equal type arguments. They are static in terms of a single closed generic type. Therefore, we really can make them static:

public class Mesh<T1, T2, T3, T4>
{
    private List<T1> _myVectors1;
    private List<T2> _myVectors2;
    private List<T3> _myVectors3;
    private List<T4> _myVectors4;
    private static readonly IVectorOperations<T1> Operations1 =
        VectorOperations.GetFor<T1>();
    private static readonly IVectorOperations<T2> Operations2 =
        VectorOperations.GetFor<T2>();
    private static readonly IVectorOperations<T3> Operations3 =
        VectorOperations.GetFor<T3>();
    private static readonly IVectorOperations<T4> Operations4 =
        VectorOperations.GetFor<T4>();

    public float Average()
    {
        var sum1 = Operations1.Sum(_myVectors1);
        var sum2 = Operations2.Sum(_myVectors2);
        var sum3 = Operations3.Sum(_myVectors3);
        var sum4 = Operations4.Sum(_myVectors4);

        return (float)(sum1 + sum2 + sum3 + sum4) / 
            (_myVectors1.Count + _myVectors2.Count + _myVectors3.Count + _myVectors4.Count);
    }
}
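
The property this relies on, that static fields belong to one closed generic type rather than to the open generic definition, can be checked in isolation. The sketch below uses hypothetical names (InitLog, PerTypeCache<T>) that are not part of the answer's code; it counts how often the static initializer runs.

```csharp
using System;

// Hypothetical helper that records how many initializers have run.
public static class InitLog
{
    public static int Count;
}

public static class PerTypeCache<T>
{
    // Evaluated once for each distinct T, no matter how often it is read.
    public static readonly string Name = Init();

    private static string Init()
    {
        InitLog.Count++;
        return typeof(T).Name;
    }
}
```

Reading PerTypeCache<int>.Name twice and PerTypeCache<long>.Name once leaves InitLog.Count at 2: one initialization per closed type. The static Operations fields in the mesh behave the same way.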

This way, if there are N different vector types, we only ever instantiate N objects implementing IVectorOperations<> and run the type-check lookups once per closed mesh type. With four type parameters there are at most N^4 such mesh types (4^4 = 256 for the four vector types here). Individual mesh objects don't take any additional memory, but there are at most N^4 * 4 static references to vector operation implementations.

This still forces you to implement all the vector operations four times for different types. But note that now you've unlocked all options - you have a generic interface that depends on the TVector type and that you control. Any tricks inside your VectorOperations implementations are allowed. You can be flexible there while being decoupled from the Mesh by the IVectorOperations<TVector> interface.
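
Putting the pieces together, here is a compact sketch that compiles on its own. It is cut down to two vector types and two lists to stay short; the struct definitions are stand-ins for the real interchange types, and AddFirst/AddSecond are hypothetical helpers added only so the mesh can be filled. The casts go through object, which the compiler always accepts for conversions involving a type parameter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public struct Vector1 { public int x; }
public struct Vector2 { public int x; public int y; }

public interface IVectorOperations<TVector>
{
    int Sum(TVector vector);
    int Sum(IEnumerable<TVector> vectors);
}

public class VectorOperations1 : IVectorOperations<Vector1>
{
    public int Sum(Vector1 vector) => vector.x;
    public int Sum(IEnumerable<Vector1> vectors) => vectors.Sum(v => Sum(v));
}

public class VectorOperations2 : IVectorOperations<Vector2>
{
    public int Sum(Vector2 vector) => vector.x + vector.y;
    public int Sum(IEnumerable<Vector2> vectors) => vectors.Sum(v => Sum(v));
}

public static class VectorOperations
{
    private static readonly VectorOperations1 Implementation1 = new VectorOperations1();
    private static readonly VectorOperations2 Implementation2 = new VectorOperations2();

    public static IVectorOperations<TVector> GetFor<TVector>()
    {
        if (typeof(TVector) == typeof(Vector1))
            return (IVectorOperations<TVector>)(object)Implementation1;
        if (typeof(TVector) == typeof(Vector2))
            return (IVectorOperations<TVector>)(object)Implementation2;

        throw new ArgumentException($"Invalid type of vector {typeof(TVector).Name}.");
    }
}

public class Mesh<T1, T2>
{
    // Resolved once per closed Mesh<T1, T2> type.
    private static readonly IVectorOperations<T1> Operations1 = VectorOperations.GetFor<T1>();
    private static readonly IVectorOperations<T2> Operations2 = VectorOperations.GetFor<T2>();

    private readonly List<T1> _myVectors1 = new List<T1>();
    private readonly List<T2> _myVectors2 = new List<T2>();

    public void AddFirst(T1 vector) => _myVectors1.Add(vector);
    public void AddSecond(T2 vector) => _myVectors2.Add(vector);

    public float Average() =>
        (float)(Operations1.Sum(_myVectors1) + Operations2.Sum(_myVectors2)) /
        (_myVectors1.Count + _myVectors2.Count);
}
```

One consequence of the static variant worth knowing: for an unsupported type argument the ArgumentException is thrown inside a static initializer, so callers should expect it to arrive wrapped in a TypeInitializationException when the static fields are first touched.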

Wow this answer is long. Thanks for coming to my TED talk!

(I don't think this works, but it's the direction I tried to go in at first - comments welcome, maybe it'll inspire someone else to provide a better answer:)

I thought I could do something like this (possible in C++, if I remember correctly; C# has no direct equivalent for the fully general case, but I figured that for simple cases like this there might be one):

public class Mesh<T1,T2>
{
 // This class is basically going to fail at runtime:
 //   it cannot/will not prevent you from instancing it
 //   as - say - a Mesh<string,int> - which simply cannot
 //   be sensibly implemented.
 //
 // So: many methods will throw Exceptions - but some can be implemented
 //   (and hence: shared amongst all the other variants of the class)

     public List<T1> internalList;
     public int CountElements<List<T1>>() { return internalList.Count; }
     public int DoSomethingToList1<T1>() { ... }
}

public class Mesh<Vector2,T2>
{
     // Now we're saying: HEY compiler! I'll manually override the
     //    generic instance of Mesh<T1,T2> in all cases where the
     //    T1 is a Vector2!

     public int DoSomethingToList1<Vector2>() { ... }
}

Or another attempt to find a syntactically valid way to do the same thing (cf. @Gserg's comment on the main question) - but obviously this fails because the C# compiler forbids arbitrary type-casting:

    private List<T1> data;
    public void Main()
    {
        if( typeof(T1) == typeof(Vector2) )
            Main( (List<Vector2>) data );
        else if( typeof(T1) == typeof(Vector3) )
            Main( (List<Vector3>) data );
    }

    public void Main( List<Vector2> dataVector2s )
    {
        ...
    }
    public void Main( List<Vector3> dataVector3s )
    {
        ...
    }
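
For what it's worth, the direct cast from List<T1> to List<Vector2> is what the compiler rejects; routing the same cast through object does compile, and it succeeds at run time whenever the typeof check has already established that T1 really is that type. A minimal sketch under that assumption follows. The names Process and Add are hypothetical (chosen instead of Main so the methods can return a value), and the method bodies are placeholders for the real algorithms.

```csharp
using System;
using System.Collections.Generic;

public struct Vector2 { public int x; public int y; }
public struct Vector3 { public int x; public int y; public int z; }

public class Mesh<T1>
{
    private readonly List<T1> data = new List<T1>();

    public void Add(T1 item) => data.Add(item);

    public int Process()
    {
        // (List<Vector2>)data does not compile; (List<Vector2>)(object)data does.
        if (typeof(T1) == typeof(Vector2))
            return Process((List<Vector2>)(object)data);
        if (typeof(T1) == typeof(Vector3))
            return Process((List<Vector3>)(object)data);

        throw new NotSupportedException($"Mesh<{typeof(T1).Name}> is not supported.");
    }

    // Placeholder algorithm: sums all components.
    private static int Process(List<Vector2> dataVector2s)
    {
        int sum = 0;
        foreach (var v in dataVector2s) sum += v.x + v.y;
        return sum;
    }

    private static int Process(List<Vector3> dataVector3s)
    {
        int sum = 0;
        foreach (var v in dataVector3s) sum += v.x + v.y + v.z;
        return sum;
    }
}
```

Because List<T> is a reference type, the detour through object is only a reference conversion and does not copy or box the list's contents.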

I'm not sure if this is what you want, but perhaps you could solve this with a little runtime compilation. For example, you could generate a delegate that sums the fields of a struct:

        public Func<T, int> Sum { get; private set; }
        public void Compile()
        {
            var parm = Expression.Parameter(typeof(T), "parm");
            Expression sum = null;

            foreach(var p in typeof(T).GetFields())
            {
                var member = Expression.MakeMemberAccess(parm, p);
                sum = sum == null ? (Expression)member : Expression.Add(sum, member);
            }
            Sum = Expression.Lambda<Func<T, int>>(sum, parm).Compile();
        }
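
As a self-contained variant of the same idea, the sketch below caches the compiled delegate in a generic static class so it is built once per struct type. FieldSummer<T> is a hypothetical name, the struct definitions are stand-ins, and the filter to public instance int fields is an assumption about what should be summed.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public struct Vector2 { public int x; public int y; }
public struct Vector3 { public int x; public int y; public int z; }

public static class FieldSummer<T> where T : struct
{
    // Compiled once per closed type T, then reused for every call.
    public static readonly Func<T, int> Sum = Compile();

    private static Func<T, int> Compile()
    {
        var parm = Expression.Parameter(typeof(T), "parm");
        Expression sum = null;

        // Only public instance fields of type int take part in the sum.
        foreach (var field in typeof(T).GetFields(BindingFlags.Public | BindingFlags.Instance))
        {
            if (field.FieldType != typeof(int)) continue;

            var member = Expression.Field(parm, field);
            sum = sum == null ? (Expression)member : Expression.Add(sum, member);
        }

        if (sum == null)
            throw new InvalidOperationException($"{typeof(T).Name} has no int fields to sum.");

        return Expression.Lambda<Func<T, int>>(sum, parm).Compile();
    }
}
```

FieldSummer<Vector3>.Sum(v) then behaves like a handwritten v.x + v.y + v.z, with the reflection cost paid only on first use.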

Or perhaps just a method that turns the struct into some other kind of enumerable that's easier to work with.

My advice would be to have Vector2 and Vector3 bring their own processing methods. Interfaces are the droid you are looking for:

  • Have them implement an interface with the functions you need
  • Use that interface as the type for anything handed into your list
  • Call the interface functions

A fitting name for the interface would be "Sumable".

Those might be the builtin implementations of the Vector struct. Of course those two cannot be inherited. But the MVVM way of doing things is: "If you can not inherit or modify it, wrap it into something you can inherit and modify."

A simple wrapper (it can be a struct or class) around one of those vectors that implements the interface is all you need.
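
A minimal sketch of such a wrapper, assuming struct wrappers and a generic constraint so that the interface call does not box. ISummable, SummableVector2, SummableVector3, Add and Total are hypothetical names, and the vector structs are stand-ins for the real ones.

```csharp
using System;
using System.Collections.Generic;

public struct Vector2 { public int x; public int y; }
public struct Vector3 { public int x; public int y; public int z; }

public interface ISummable
{
    int Sum();
}

public readonly struct SummableVector2 : ISummable
{
    private readonly Vector2 _value;
    public SummableVector2(Vector2 value) { _value = value; }
    public int Sum() => _value.x + _value.y;
}

public readonly struct SummableVector3 : ISummable
{
    private readonly Vector3 _value;
    public SummableVector3(Vector3 value) { _value = value; }
    public int Sum() => _value.x + _value.y + _value.z;
}

public class Mesh<T> where T : struct, ISummable
{
    private readonly List<T> _myVectors = new List<T>();

    public void Add(T vector) => _myVectors.Add(vector);

    public int Total()
    {
        int sum = 0;
        // T is constrained to a struct, so this interface call does not box.
        foreach (var v in _myVectors) sum += v.Sum();
        return sum;
    }
}
```

The trade-off is that the list now stores wrappers rather than the original structs, so data has to be wrapped on the way in and unwrapped before it is handed back to the foreign API.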

Another option would be to use LINQ for the processing. If it is only a one-off process, it is often a lot more lightweight than going all the way into inheritance, classes, interfaces and the like.
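
For a one-off calculation that could be as small as the following sketch. LinqAverages is a hypothetical helper and the structs are stand-ins; it still needs one overload per vector type, so it only pays off when there are few operations.

```csharp
using System.Collections.Generic;
using System.Linq;

public struct Vector2 { public int x; public int y; }
public struct Vector3 { public int x; public int y; public int z; }

public static class LinqAverages
{
    // Average of the per-vector component sums.
    public static double Average(IEnumerable<Vector2> vectors) =>
        vectors.Average(v => v.x + v.y);

    public static double Average(IEnumerable<Vector3> vectors) =>
        vectors.Average(v => v.x + v.y + v.z);
}
```

Note that Enumerable.Average throws an InvalidOperationException for an empty sequence, so an empty list needs to be handled by the caller.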
