Most efficient way to call LINQ on a list of information objects?

I'm trying to pull data from deep inside my information objects, which I have attached to a list of third-party library objects. The two solutions I have both seem very inefficient. Is there any way to reduce this to a single OfType call without the LINQ query becoming the longer variant?

using System;
using System.Collections.Generic;
using System.Linq;

namespace LinqQueries
{

    // Test the linq queries
    public class Test
    {
        public void TestIt()
        {
            List<ThirdParty> As = new List<ThirdParty>();

            // This is nearly the query I want to run, find A and C where B 
            // and C match criteria
            var cData = from a in As
                        from b in a.myObjects.OfType<MyInfo>()
                        where b.someProp == 1
                        from c in b.cs
                        where c.data == 1
                        select new {a, c};

            // This treats A and B as the same object, which is what I
            // really want, but it calls two sub-queries under the hood, 
            // which seems less efficient 
            var cDataShorter = from a in As
                               from c in a.GetCs()
                               where a.GetMyProp() == 1
                               where c.data == 1
                               select new { a, c };
        }
    }

    // library class I can't change
    public class ThirdParty
    {
        // Generic list of objects I can put my info object in
        public List<Object> myObjects;
    }

    // my info class that I add to ThirdParty
    public class MyInfo
    {
        public List<C> cs;
        public int someProp;
    }

    // My extension method for A to simplify some things.
    static public class MyExtentionOfThirdPartyClass
    {
        // Get the first MyInfo in ThirdParty
        public static MyInfo GetB(this ThirdParty a)
        {
            return (from b in a.myObjects.OfType<MyInfo>()
                    select b).FirstOrDefault();
        }

        // more hidden linq to slow things down...
        public static int GetMyProp(this ThirdParty a)
        {
            return a.GetB().someProp;
        }

        // get the list of cs with hidden linq
        public static List<C> GetCs(this ThirdParty a)
        {
            return a.GetB().cs;
        }
    }

    // fairly generic object with data in it
    public class C
    {
        public int data;
    }
}

If your cDataShorter produces the correct result, you can rewrite it like this:

var result = As
  .SelectMany(a => a.myObjects, (aa, mo) => new R {Tp = aa, Mi = mo as MyInfo})
  .Where(r => r.Mi != null && r.Mi.someProp == 1)
  // Uncomment the next line if you need only one (the first) MyInfo per ThirdParty.
  //.Distinct(new Comparer<R>((r1, r2) => r1.Tp.Equals(r2.Tp)))
  // If you're not going to use Distinct, you don't need R; just use an anonymous type.
  .SelectMany(r => r.Mi.cs, (rr, c) => new {a = rr.Tp, c})
  .Where(ao => ao.c.data == 1);

public class R {
    public ThirdParty Tp;
    public MyInfo Mi;
}

For simplicity, Comparer is from there
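The Comparer helper used in the commented-out Distinct call is not shown here. It cannot be System.Collections.Generic.Comparer<T>, which has no constructor taking a lambda and is not an IEqualityComparer<T>. Below is a minimal sketch of what such a helper might look like; the name LambdaEqualityComparer<T> and its shape are assumptions for illustration, not the original helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not the original Comparer): wraps an equality lambda
// so that it can be passed to LINQ operators such as Distinct.
public sealed class LambdaEqualityComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;

    public LambdaEqualityComparer(Func<T, T, bool> equals)
    {
        _equals = equals ?? throw new ArgumentNullException(nameof(equals));
    }

    public bool Equals(T x, T y) => _equals(x, y);

    // A constant hash code puts every item in the same bucket, so Distinct
    // has to call Equals for each candidate pair. That is always correct,
    // but it makes Distinct quadratic in the number of items.
    public int GetHashCode(T obj) => 0;
}

public static class LambdaEqualityComparerDemo
{
    // Keeps one string per distinct length.
    public static List<string> OnePerLength(IEnumerable<string> words)
    {
        var byLength = new LambdaEqualityComparer<string>((x, y) => x.Length == y.Length);
        return words.Distinct(byLength).ToList();
    }
}
```

With a helper like this, the commented-out line would read `.Distinct(new LambdaEqualityComparer<R>((r1, r2) => r1.Tp.Equals(r2.Tp)))`. The constant hash code is the price of accepting only an equality lambda; if that cost matters, a key-based comparer would be the better choice.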

Unfortunately, the answer is "it depends". I had to write the query both ways and do timing runs on them.

With 1000 ThirdParty objects, 1 MyObject each and 1000 C each, where all results match the criteria, the first query is twice as fast. If no MyObjects match the criteria, the first query is two orders of magnitude faster. However, with multiple MyObjects the efficiency reverses: with 100 ThirdParty objects, 100 MyObjects each and 100 C each, all results matching, the second query is two orders of magnitude faster than the first. With no MyObjects matching, the first comes out faster again.
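A timing run of this kind could be sketched roughly as follows. This is not the original benchmark: the QueryTiming class, its Build and Time helpers, and the generated data are assumptions for illustration, and the small stand-in types only mirror the classes from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Stand-ins mirroring the classes from the question.
public class C { public int data; }
public class MyInfo { public List<C> cs = new List<C>(); public int someProp; }
public class ThirdParty { public List<object> myObjects = new List<object>(); }

public static class QueryTiming
{
    // Builds `parents` ThirdParty objects, each with `infos` MyInfo entries
    // holding `csPerInfo` C items. Every value is 1, so everything matches.
    public static List<ThirdParty> Build(int parents, int infos, int csPerInfo)
    {
        var list = new List<ThirdParty>();
        for (int i = 0; i < parents; i++)
        {
            var tp = new ThirdParty();
            for (int j = 0; j < infos; j++)
            {
                var info = new MyInfo { someProp = 1 };
                for (int k = 0; k < csPerInfo; k++) info.cs.Add(new C { data = 1 });
                tp.myObjects.Add(info);
            }
            list.Add(tp);
        }
        return list;
    }

    // Query 1: visits every MyInfo attached to each ThirdParty.
    public static int CountAllInfos(List<ThirdParty> As) =>
        (from a in As
         from b in a.myObjects.OfType<MyInfo>()
         where b.someProp == 1
         from c in b.cs
         where c.data == 1
         select new { a, c }).Count();

    // Looks up the first MyInfo again on every call, as the extension
    // methods in the question do. Assumes at least one MyInfo is present.
    static MyInfo GetB(ThirdParty a) => a.myObjects.OfType<MyInfo>().FirstOrDefault();

    // Query 2: only uses the first MyInfo of each ThirdParty.
    public static int CountFirstInfo(List<ThirdParty> As) =>
        (from a in As
         from c in GetB(a).cs
         where GetB(a).someProp == 1
         where c.data == 1
         select new { a, c }).Count();

    // Runs the query once and returns the elapsed wall-clock milliseconds.
    public static long Time(Func<int> query)
    {
        var sw = Stopwatch.StartNew();
        query();
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

To use it, build the data once (for example `QueryTiming.Build(100, 100, 100)`), then pass each query to `QueryTiming.Time` as a lambda. Single Stopwatch runs are only a rough guide, so repeating each measurement a few times is advisable. Note also that with several MyInfo objects per ThirdParty the two queries return different numbers of rows, because the second one only looks at the first MyInfo, so their timings are not a like-for-like comparison.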

I actually ended up implementing the slower solution because it made the code cleaner, and the performance of the slower query was not all that bad.
