How to get the closest item to my key from a SortedDictionary?

Currently I'm using a binary search over a SortedList<T,U> for a specific number, and if it doesn't exist I get the closest lower-bound key-item instead.

I noticed that it is rather slow at inserting unsorted data, which I do a lot.

Is there a way to do something similar with the SortedDictionary, or should I just stick to my SortedList?

SortedList<K, V> is really slow when inserting data because it shifts up to N elements in its internal array each time a new element is added, so the complexity of an insertion is O(N). Nevertheless, it supports binary search, which finds an exact element or its neighbors in O(log N).
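For reference, the lower-bound lookup the question describes could be written roughly as below. This is only a sketch: `SortedListFloor` and `FloorIndex` are hypothetical names, not part of .NET, and the code assumes that indexing `SortedList<TKey, TValue>.Keys` is cheap enough to binary-search over.

```csharp
using System.Collections.Generic;

public static class SortedListFloor
{
    // Returns the index of the largest key <= target,
    // or -1 if every key in the list is greater than target.
    public static int FloorIndex<TKey, TValue>(SortedList<TKey, TValue> list, TKey target)
    {
        IList<TKey> keys = list.Keys;             // sorted, indexable view of the keys
        IComparer<TKey> comparer = list.Comparer; // same ordering the list itself uses
        int lo = 0, hi = keys.Count - 1, result = -1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (comparer.Compare(keys[mid], target) <= 0)
            {
                result = mid;  // candidate; a larger one may exist to the right
                lo = mid + 1;
            }
            else
            {
                hi = mid - 1;
            }
        }
        return result;
    }
}
```

The returned index can be used with `list.Keys[index]` and `list.Values[index]`. The lookup is logarithmic, but each `Add` into the list still pays for shifting the array.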

A balanced binary tree is the best data structure to solve your problem. You'll be able to do the following operations with logarithmic complexity:

  1. Add an item in O(log N), vs. O(N) in SortedList<K, V>
  2. Remove an item in O(log N)
  3. Search for an item or its nearest neighbor in O(log N)

Looking for an element or its nearest lower bound in a binary tree is simple:

  1. Walk down the tree from the root toward the leaves in order to find your key. If key < node, go to the left child, otherwise go to the right one.
  2. If you found the key, return it.
  3. If the key is not found, the nearest left parent, i.e. the last node at which you went right, is the one you are looking for (the nearest lower bound).
  4. If there are no left parents, just take the last visited node; it is the minimal node in the tree.
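The four steps above can be sketched on a hand-rolled tree as follows. `Node<T>` and `TreeSearch` are hypothetical types written only for this illustration; a real balanced tree would also need insertion and rebalancing, which are left out here.

```csharp
using System.Collections.Generic;

// Hypothetical minimal tree node, for illustration only.
public sealed class Node<T>
{
    public T Item;
    public Node<T> Left, Right;
    public Node(T item) { Item = item; }
}

public static class TreeSearch
{
    // Returns the node holding the key; otherwise the nearest lower-bound node;
    // otherwise (key below every item) the last visited node, which is the minimum.
    // Returns null only for an empty tree.
    public static Node<T> FindLowerBound<T>(Node<T> root, T key, IComparer<T> comparer)
    {
        Node<T> lastRightTurn = null; // most recent node whose item is < key
        Node<T> lastVisited = null;
        for (Node<T> node = root; node != null; )
        {
            lastVisited = node;
            int cmp = comparer.Compare(key, node.Item);
            if (cmp == 0) return node;       // step 2: exact match
            if (cmp < 0)
            {
                node = node.Left;            // step 1: key < node, go left
            }
            else
            {
                lastRightTurn = node;        // remember the "left parent"
                node = node.Right;           // step 1: otherwise go right
            }
        }
        return lastRightTurn ?? lastVisited; // steps 3 and 4
    }
}
```

Because the loop follows a single root-to-leaf path, its cost is proportional to the tree height, which is O(log N) when the tree is balanced.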

There are many articles describing how to implement a binary tree. Nevertheless, I'm going to reuse a .NET Framework collection by means of a kind of hack :)

Now, let me present SortedSet<T>, which is itself a red-black tree. It has one drawback: it offers no direct way to find the nearest nodes quickly. But we know the algorithm for searching a tree (it's described in the steps above), and it is implemented in the SortedSet<T>.Contains method (decompiled at the bottom*). So we can capture all nodes from the root to the last visited node during the traversal by using a custom comparer. After that, we can find the nearest lower-bound node using the algorithm above:
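To make the capturing idea concrete before the full class, here is a small sketch of a comparer that only records what it is asked to compare. `RecordingComparer<T>` and `PathDemo` are hypothetical names. The sketch assumes, as the decompiled code at the bottom suggests, that `Contains` calls `Compare(item, node.Item)` with the searched key first; that is an implementation detail rather than a documented contract.

```csharp
using System.Collections.Generic;

// Hypothetical helper: forwards to an inner comparer and records the
// second argument of every comparison (assumed to be the visited node's item).
public sealed class RecordingComparer<T> : IComparer<T>
{
    private readonly IComparer<T> _inner;

    public List<T> Visited { get; } = new List<T>();

    public RecordingComparer(IComparer<T> inner) { _inner = inner; }

    public int Compare(T x, T y)
    {
        Visited.Add(y);
        return _inner.Compare(x, y);
    }
}

public static class PathDemo
{
    // Returns the items that Contains compared the key against, in visiting order.
    public static List<int> SearchPath(IEnumerable<int> items, int key)
    {
        var comparer = new RecordingComparer<int>(Comparer<int>.Default);
        var set = new SortedSet<int>(comparer);
        foreach (int item in items) set.Add(item);

        comparer.Visited.Clear(); // drop comparisons made while inserting
        set.Contains(key);        // result ignored; the call drives the comparer
        return new List<int>(comparer.Visited);
    }
}
```

The exact path depends on how the tree happens to be balanced, but for a key that is missing from the set, its nearest lower neighbor (if one exists) lies somewhere on that path, which is what the decorator relies on.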

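using System.Collections.Generic;

public class LowerBoundSortedSet<T> : SortedSet<T> {

    private readonly ComparerDecorator _comparerDecorator;

    // Wraps the real comparer and remembers the best lower-bound candidate
    // seen while SortedSet<T> walks the tree. Nested, so it reuses the outer T.
    private class ComparerDecorator : IComparer<T> {

        private readonly IComparer<T> _comparer;

        public T LowerBound { get; private set; }

        // True until the search has gone right at least once.
        private bool _reset = true;

        public void Reset()
        {
            _reset = true;
            LowerBound = default(T); // avoid returning a stale value for an empty set
        }

        public ComparerDecorator(IComparer<T> comparer)
        {
            _comparer = comparer;
        }

        // Relies on Contains calling Compare(key, node.Item):
        // x is the searched key, y is the item of the visited node.
        public int Compare(T x, T y)
        {
            int num = _comparer.Compare(x, y);
            if (_reset)
            {
                // No lower bound found yet: fall back to the last visited node.
                LowerBound = y;
            }
            if (num >= 0)
            {
                // key >= node: this node is the nearest lower bound so far.
                LowerBound = y;
                _reset = false;
            }
            return num;
        }
    }

    public LowerBoundSortedSet()
        : this(Comparer<T>.Default) {}

    public LowerBoundSortedSet(IComparer<T> comparer)
        : base(new ComparerDecorator(comparer)) {
        _comparerDecorator = (ComparerDecorator)this.Comparer;
    }

    public T FindLowerBound(T key)
    {
        _comparerDecorator.Reset();
        Contains(key); // result ignored; the call only drives the comparer
        return _comparerDecorator.LowerBound;
    }
}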

You can see that finding the nearest node takes no more than a usual search, i.e. O(log N). So, this is the fastest solution for your problem: the collection is as fast as SortedList<K, V> at finding the nearest item and as fast as SortedSet<T> at adding items.

What about SortedDictionary<K, V>? It is almost the same as SortedSet<T> except for one thing: each key has a value. I hope you will be able to do the same with SortedDictionary<K, V>.
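One possible way to get dictionary-like behavior, sketched under assumptions: instead of subclassing SortedDictionary<K, V>, keep key/value pairs in a SortedSet<T> ordered by key only. `FloorMap<TKey, TValue>` and `TryGetFloor` are hypothetical names, not .NET types. Note that this sketch swaps the comparer hack for the built-in `SortedSet<T>.GetViewBetween` and `Max`; I have not benchmarked it, and the cost of creating a view may differ between runtime versions, so it is worth measuring before relying on it.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: a sorted key/value collection with a floor lookup.
public sealed class FloorMap<TKey, TValue>
{
    // Orders pairs by key only, so a probe pair with a default value matches its key.
    private sealed class KeyOnlyComparer : IComparer<KeyValuePair<TKey, TValue>>
    {
        private readonly IComparer<TKey> _keys;
        public KeyOnlyComparer(IComparer<TKey> keys) { _keys = keys; }
        public int Compare(KeyValuePair<TKey, TValue> a, KeyValuePair<TKey, TValue> b)
        {
            return _keys.Compare(a.Key, b.Key);
        }
    }

    private readonly SortedSet<KeyValuePair<TKey, TValue>> _items;

    public FloorMap() : this(Comparer<TKey>.Default) { }

    public FloorMap(IComparer<TKey> comparer)
    {
        _items = new SortedSet<KeyValuePair<TKey, TValue>>(new KeyOnlyComparer(comparer));
    }

    // Returns false if the key is already present (set semantics, no overwrite).
    public bool Add(TKey key, TValue value)
    {
        return _items.Add(new KeyValuePair<TKey, TValue>(key, value));
    }

    // Finds the entry with the largest key <= the given key, if there is one.
    public bool TryGetFloor(TKey key, out KeyValuePair<TKey, TValue> floor)
    {
        floor = default(KeyValuePair<TKey, TValue>);
        if (_items.Count == 0) return false;

        var probe = new KeyValuePair<TKey, TValue>(key, default(TValue));
        // GetViewBetween throws when lower > upper, so rule that case out first.
        if (_items.Comparer.Compare(probe, _items.Min) < 0) return false;

        floor = _items.GetViewBetween(_items.Min, probe).Max;
        return true;
    }
}
```

Unlike the decorator approach, this version reports "no lower bound" explicitly when the key is smaller than every stored key, instead of returning the minimal item.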

*Decompiled SortedSet<T>.Contains method:

public virtual bool Contains(T item)
{
  return this.FindNode(item) != null;
}

internal virtual SortedSet<T>.Node FindNode(T item)
{
  int num;
  // Go left when item < node.Item, right otherwise, until a match or a null child.
  for (SortedSet<T>.Node node = this.root; node != null; node = num < 0 ? node.Left : node.Right)
  {
    num = this.comparer.Compare(item, node.Item);
    if (num == 0)
      return node;
  }
  return (SortedSet<T>.Node) null;
}
