Most efficient way to prepend to a collection in C#

Question

I have a C# array of objects that needs to stay in order, and I'm trying to filter duplicates out of it (not the same object reference, just the same attribute values). The catch is that the duplicate that has to go is the first one, and the oldest needs to stay.

Current algorithm (semi-pseudocode, everything renamed), using IEnumerable:

        object[] filter(object[] array)
        {
            var set = new HashSet<Guid>();
            var filtered = new List<object>();

            for (int i = array.Length; i-- > 0;)
            {
                var item = array[i];

                if (!set.Contains(item.ID))
                {
                    set.Add(item.ID);
                    filtered = new List<object>(filtered.Prepend(item));
                }
            }

            return filtered.ToArray();
        }

I know it is currently O(n), but I am looking for a very efficient way of doing this. If possible, I'd like to use just arrays so I don't need to call .ToArray() and iterate again.

I could just make filtered an array of size array.Length and fill it in backwards, i.e. "filtered[array.Length-i] = item", but I don't want to have empty values.

Answer 1

Pushing to a stack can be thought of as adding to the start of a list, and popping from a stack can be thought of as removing an item from the start of a list.
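To make the analogy concrete, here is a minimal sketch (the class and method names are made up for illustration) that pushes a few integers and then reads the stack back as an array. Stack<T>.ToArray copies the elements in pop order, so the last item pushed comes out first, just as if each item had been inserted at the front of a list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo type, not part of the question or the answer.
public static class StackOrderDemo
{
    // Pushes the items in the given order and returns the stack contents.
    public static int[] PushAll(params int[] items)
    {
        var stack = new Stack<int>();
        foreach (var item in items)
        {
            stack.Push(item); // behaves like inserting at the front of a list
        }

        // ToArray returns the elements in pop order: last pushed, first out.
        return stack.ToArray();
    }
}
```

Calling StackOrderDemo.PushAll(1, 2, 3) returns the array 3, 2, 1.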

Stack<T>.Push is a constant-time operation as long as the stack has enough capacity, as the documentation says, so you can use a stack instead:

// using object[] doesn't make sense here as it doesn't have an ID property,
// so I have taken the liberty to create my interface
IHasID[] Filter(IHasID[] array)
{
    var set = new HashSet<Guid>();
    // if not many elements are expected to be filtered out, giving the stack an initial capacity might be better
    var filtered = new Stack<IHasID>(/*array.Length*/);

    for (int i = array.Length; i-- > 0;)
    {
        var item = array[i];

        // HashSet<T>.Add returns false if the ID has already been seen
        if (set.Add(item.ID))
        {
            filtered.Push(item);
        }
    }

    // ToArray creates an array in the pop order, O(n)
    // https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.stack-1.toarray?view=net-5.0#remarks
    return filtered.ToArray();
}

interface IHasID
{
    Guid ID { get; }
}
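For readers who want the array-only variant mentioned in the question, the following is a rough sketch rather than part of the answer above. It fills a buffer from the back and then copies only the filled tail, which avoids the empty slots. The names BackwardFill, KeepLast and keySelector are hypothetical, and the method is written generically so that it does not depend on IHasID.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper illustrating the "fill backwards, then trim" idea.
public static class BackwardFill
{
    // Keeps the last occurrence of each key and preserves the original order.
    public static T[] KeepLast<T, TKey>(T[] array, Func<T, TKey> keySelector)
    {
        var seen = new HashSet<TKey>();
        var buffer = new T[array.Length];
        int write = buffer.Length; // the next free slot is write - 1

        for (int i = array.Length; i-- > 0;)
        {
            if (seen.Add(keySelector(array[i])))
            {
                buffer[--write] = array[i]; // fill from the back
            }
        }

        // No duplicates were found: the buffer is already the result.
        if (write == 0)
        {
            return buffer;
        }

        // The kept items occupy buffer[write..]; copy just that tail.
        var result = new T[buffer.Length - write];
        Array.Copy(buffer, write, result, 0, result.Length);
        return result;
    }
}
```

For example, BackwardFill.KeepLast(new[] { "a1", "b1", "a2", "c1", "b2" }, s => s[0]) returns a2, c1, b2. The final Array.Copy is still O(n), so this is not asymptotically better than the stack version; it only replaces the stack with a plain array.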

Answer 2

Just use LINQ. It will be a single pass-through iterator, O(n) CPU and O(n) RAM, without any further allocations:

var result = input.Reverse().DistinctBy(x => x.YourKey).Reverse(); // the second Reverse restores the original order

A sample implementation of DistinctBy is here: "LINQ's Distinct() on a particular property".

You can also do the same thing like this, because all it does is create group iterators:

var result = input.Reverse().GroupBy(x => x.YourKey).Select(x => x.First()).Reverse(); // the second Reverse restores the original order
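As a self-contained illustration of the GroupBy variant, here is a small sketch wrapped in a hypothetical helper (LinqDedup.KeepLast and keySelector are names invented for this example). Because the first Reverse flips the sequence, the sketch reverses once more at the end so that the surviving items come back in their original order. It uses GroupBy rather than DistinctBy because Enumerable.DistinctBy is only built in from .NET 6 onward.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, shown only to make the one-liner runnable.
public static class LinqDedup
{
    // Keeps the last occurrence of each key and preserves the original order.
    public static List<T> KeepLast<T, TKey>(IEnumerable<T> input, Func<T, TKey> keySelector)
    {
        return input
            .Reverse()               // visit the last occurrences first
            .GroupBy(keySelector)    // groups are yielded in order of first appearance
            .Select(g => g.First())  // first in reversed order = last in the original
            .Reverse()               // restore the original order
            .ToList();
    }
}
```

For example, LinqDedup.KeepLast(new[] { "a1", "b1", "a2", "c1", "b2" }, s => s[0]) produces a2, c1, b2.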

2 answers

Answer 1
3, accepted, 2021-03-29 08:56:32

Answer 2
1, 2021-03-29 09:15:30
