Most efficient way to query a collection - C#

I'm searching through a generic list (or IQueryable) which contains 3 columns. I'm trying to find the value of column 3 based on columns 1 and 2, but the search is really slow. For a single search the delay isn't noticeable, but I'm performing this search in a loop, and 700 iterations take a combined time of over 2 minutes, which is unusable. Columns 1 and 2 are int and column 3 is a double. Here is the LINQ I'm using:

public static Distance FindByStartAndEnd(int start, int end, IQueryable<Distance> distanceList)
{
    Distance item = distanceList.Where(h => h.Start == start && h.End == end).FirstOrDefault();
    return item ;
}

There could be up to 60,000 entries in the IQueryable list. I know that is quite a lot, but I didn't think it would pose any problem for searching.

So my question is: is there a better way to search through a collection when I need to match 2 columns to get the value of a third? I need all 700 searches to be almost instant, but each one takes about 300 ms, which soon mounts up.

UPDATE - Final Solution

I've now created a dictionary using a Tuple of start and end as the key. I think this could be the right solution.

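var dictionary = new Dictionary<Tuple<int, int>, double>();

// For each row from the data reader: key the entry on (start, end).
var key = new Tuple<int, int>(Convert.ToInt32(reader[0]), Convert.ToInt32(reader[1]));
var value = Convert.ToDouble(reader[2]);

// Only keep entries within the distance limit.
if (value <= distance)
{
    dictionary.Add(key, value);
}

// Later, when searching: build a key for the pair to look up.
var lookupKey = new Tuple<int, int>(5, 20);

Works fine - much faster.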

Create a dictionary where columns 1 and 2 form the key. You create the dictionary once, and then your searches will be almost instant.
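As a minimal sketch of that idea, the block below builds the dictionary once and then answers each search with a single hash lookup. The `Distance` class is a stand-in: `Start` and `End` come from the question, while the property name `Value` for column 3 and the `DistanceIndex` helper are assumptions made for illustration. It uses a value tuple as the key, which needs C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: Start and End appear in the question;
// Value is an assumed name for the third (double) column.
public class Distance
{
    public int Start { get; set; }
    public int End { get; set; }
    public double Value { get; set; }
}

public static class DistanceIndex
{
    // Build once: one pass over the whole collection.
    // Assumes every (Start, End) pair is unique; ToDictionary throws
    // ArgumentException if the same key appears twice.
    public static Dictionary<(int Start, int End), double> Build(IEnumerable<Distance> distances)
    {
        return distances.ToDictionary(d => (d.Start, d.End), d => d.Value);
    }

    // Each search is then a hash lookup instead of a scan.
    // Returns null when the pair is not present.
    public static double? Find(Dictionary<(int Start, int End), double> index, int start, int end)
    {
        return index.TryGetValue((start, end), out var value) ? value : (double?)null;
    }
}
```

Build the index before the 700-iteration loop and call `Find` inside it; rebuilding it per iteration would bring the linear cost back.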

If you have control over your collection and model classes, there is a library that lets you index the properties of a class, which can greatly speed up searching.

http://i4o.codeplex.com/

Your problem is that LINQ has to execute the expression tree every time you return an item. Instead, call this method once with multiple start and end values:

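public static IEnumerable<Distance> FindByStartAndEnd
    (IEnumerable<KeyValuePair<int, int>> startAndEnd,
    IQueryable<Distance> distanceList)
{
    // Key = start, Value = end. Start and End are matched independently,
    // so the result can include rows whose exact (Start, End) pair was not
    // requested; filter the result by pair afterwards if that matters.
    return
        from item in distanceList
        where
            startAndEnd.Select(s => s.Key).Contains(item.Start)
            && startAndEnd.Select(s => s.Value).Contains(item.End)
        select item;
}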
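The query above tests the start values and the end values separately. If only exact pairs are wanted, one possible in-memory variant is sketched below: it puts the requested pairs into a set and makes a single pass over the collection. `BatchLookup`, `FindByPairs`, and the `Value` property are hypothetical names, and the sketch assumes the data is already in memory (LINQ to Objects), not behind a database-backed IQueryable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model: Start and End appear in the question;
// Value is an assumed name for the third (double) column.
public class Distance
{
    public int Start { get; set; }
    public int End { get; set; }
    public double Value { get; set; }
}

public static class BatchLookup
{
    // Returns only the rows whose exact (Start, End) pair was requested.
    // Cost: one pass over distances, with a hash membership test per row.
    public static List<Distance> FindByPairs(
        IEnumerable<(int Start, int End)> wanted,
        IEnumerable<Distance> distances)
    {
        var wantedSet = new HashSet<(int, int)>(wanted);
        return distances
            .Where(d => wantedSet.Contains((d.Start, d.End)))
            .ToList();
    }
}
```

With 700 requested pairs and 60,000 rows this does 60,000 set probes in total, instead of 700 separate scans.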

I'd give a HashSet a try. This should speed things up ;)
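The answer does not say how the HashSet would be used, so the following is only one possible reading: store the `Distance` objects in a `HashSet<Distance>` whose comparer looks at `Start` and `End` alone, then retrieve the stored item with a probe object. `StartEndComparer`, `DistanceSet`, and the `Value` property are hypothetical names. `HashSet<T>.TryGetValue` is assumed to be available, which requires .NET Core 2.0 or .NET Framework 4.7.2 or later.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: Start and End appear in the question;
// Value is an assumed name for the third (double) column.
public class Distance
{
    public int Start { get; set; }
    public int End { get; set; }
    public double Value { get; set; }
}

// Treats two Distance objects as equal when Start and End match, ignoring Value.
public sealed class StartEndComparer : IEqualityComparer<Distance>
{
    public bool Equals(Distance a, Distance b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a is null || b is null) return false;
        return a.Start == b.Start && a.End == b.End;
    }

    public int GetHashCode(Distance d) => (d.Start, d.End).GetHashCode();
}

public static class DistanceSet
{
    // Duplicate (Start, End) pairs are silently dropped; the first one wins.
    public static HashSet<Distance> Build(IEnumerable<Distance> distances)
        => new HashSet<Distance>(distances, new StartEndComparer());

    // TryGetValue hands back the stored element that equals the probe,
    // which carries the Value we are after. Returns null on a miss.
    public static Distance Find(HashSet<Distance> set, int start, int end)
    {
        var probe = new Distance { Start = start, End = end };
        return set.TryGetValue(probe, out var actual) ? actual : null;
    }
}
```

A dictionary keyed on the pair is usually the more direct tool when a value has to be fetched; a set fits best when only membership matters.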

Create a single value out of the first two columns, for example by combining them into a long, and use that as a key in a dictionary:

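public static long Combine(int start, int end) {
  // Cast end to uint first: a negative int would otherwise be sign-extended
  // and overwrite the upper 32 bits that hold start.
  return ((long)start << 32) | (uint)end;
}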

Dictionary<long, Distance> lookup = distanceList.ToDictionary(h => Combine(h.Start, h.End));

Then you can look up the value:

public static Distance FindByStartAndEnd(int start, int end, IQueryable<Distance> distanceList) {
  Distance item;
  // The prebuilt lookup dictionary is used; distanceList itself is not scanned.
  if (!lookup.TryGetValue(Combine(start, end), out item)) {
    item = null;
  }
  return item;
}

Getting an item from a dictionary is close to an O(1) operation, which should make a dramatic difference compared with the O(n) operation of looping through the items to find one.
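To see the difference on your own machine, a rough timing sketch along these lines may help. It generates synthetic data of the size mentioned in the question (60,000 entries, 700 lookups), times a linear scan against a dictionary lookup, and checks that both return the same items. `LookupBenchmark`, the `Value` property, and the synthetic data are assumptions for illustration; the scan here is plain LINQ to Objects, so it does not reproduce any extra overhead an IQueryable provider might add.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical model: Start and End appear in the question;
// Value is an assumed name for the third (double) column.
public class Distance
{
    public int Start { get; set; }
    public int End { get; set; }
    public double Value { get; set; }
}

public static class LookupBenchmark
{
    public static (long ScanMs, long DictMs, bool SameResults) Run(int entries = 60000, int lookups = 700)
    {
        // Synthetic data with unique (Start, End) pairs.
        var list = Enumerable.Range(0, entries)
            .Select(i => new Distance { Start = i / 250, End = i % 250, Value = i * 0.5 })
            .ToList();

        // Built once, outside the timed sections.
        var dict = list.ToDictionary(d => (d.Start, d.End));

        // Pick the pairs to search for (fixed seed so runs are repeatable).
        var rng = new Random(42);
        var keys = Enumerable.Range(0, lookups)
            .Select(_ => list[rng.Next(entries)])
            .Select(d => (d.Start, d.End))
            .ToList();

        // O(n) per lookup: scan the list each time.
        var sw = Stopwatch.StartNew();
        var scanned = keys
            .Select(k => list.FirstOrDefault(d => d.Start == k.Start && d.End == k.End))
            .ToList();
        long scanMs = sw.ElapsedMilliseconds;

        // Close to O(1) per lookup: hash probe.
        sw.Restart();
        var probed = keys
            .Select(k => dict.TryGetValue(k, out var hit) ? hit : null)
            .ToList();
        long dictMs = sw.ElapsedMilliseconds;

        return (scanMs, dictMs, scanned.SequenceEqual(probed));
    }
}
```

Calling `LookupBenchmark.Run()` returns both elapsed times in milliseconds; the absolute numbers depend on hardware and runtime, so treat them as a relative comparison only.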
