
Remove List<T> elements that appear more than once, in place

There is a similar question posted, but I do not have the rep to ask a follow-up question in that thread. :(

If I have a List<T> that contains items that appear more than once, List.Distinct() will remove the duplicates, but one copy of each duplicated item will still remain. If I want to remove every item that occurs more than once, including the original, what would be the most efficient way to do this to the original list?

Given a List<int> called oneTime:

{ 4, 5, 7, 3, 5, 4, 2, 4 }

The desired output would be in oneTime:

{ 7, 3, 2 }

Follow-up question for @Enigmativity:

Here is a pseudo version of what my script is doing. It is done in NinjaTrader, which runs on .NET 3.5.

I will attach a general idea of what the code is supposed to do. I'd attach the actual script, but unless you use NinjaTrader it might not be of much use.

Essentially, there is a large z loop. Each time through, a series of numbers is added to 'LiTics', which I do not want to disturb. I then pass that list to the function, which returns a list of the values that occur only once. I'd like to see those numbers on each pass through the loop.

It works initially, but when I run it on various sets of data, after a few passes through the loop it starts reporting values that occur more than once. I'm not sure why.

for(int z=1; z<=10000; z +=1)//Runs many times 
{ 
    if (BarsInProgress ==0 &&CurrentBar-oBarTF1>0 &&startScript )   //Some Condition
    {
        for(double k=Low[0]; k<=High[0]; k +=TickSize)  
        {   
            LiTics.Add(k);  
            //Adds a series of numbers to this list each time through z loop
            //This is original that I do not want to disturb
        }

        LiTZ.Clear();  //Display list to show me results Clear before populating
        LiTZ=GetTZone(LiTics); //function created in thread(below)
                               //Passing the undisturbed list that is modified on every loop
        foreach (double prime in LiTZ) { Print(Times[0] +",  " +prime);  }
        //Printing to see results   
    }

}//End of bigger 'z' loop

//Function created to get values that appear ONLY once
public List<double> GetTZone(List<double> sequence) 
{  
    var result =
        sequence
            .GroupBy(x => x)
            .Where(x => !x.Skip(1).Any())
            .Select(x => x.Key)
            .ToList();
    return result;
}

A picture of the printout and what is going wrong: Screenshot.
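One possible cause, offered as an assumption because the screenshot is not reproduced here: the values are doubles built by repeatedly adding TickSize to different starting points, so two prices that print identically can differ in their last bits, and GroupBy(x => x) then treats them as distinct. A minimal sketch of a workaround groups by a rounded tick index instead of by the raw double. The names TickKeySketch and GetOnceOnly and the tickSize parameter are hypothetical, not part of NinjaTrader.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TickKeySketch
{
    // Hypothetical variant of GetTZone: two prices count as equal
    // when they round to the same whole number of ticks.
    public static List<double> GetOnceOnly(List<double> prices, double tickSize)
    {
        return prices
            .GroupBy(p => (long)Math.Round(p / tickSize)) // integer tick index as the key
            .Where(g => !g.Skip(1).Any())                 // keep groups with exactly one element
            .Select(g => g.First())                       // return the original price value
            .ToList();
    }
}
```

For example, 0.1 + 0.2 and 0.3 are different doubles, yet with a tick size of 0.1 both map to tick index 3, so the sketch treats them as one repeated value.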

So, if you can have a new list, then this is the easiest way to do it:

var source = new List<int>() { 4, 5, 7, 3, 5, 4, 2, 4 };

var result =
    source
        .GroupBy(x => x)
        .Where(x => !x.Skip(1).Any())
        .Select(x => x.Key)
        .ToList();

This gives:

{ 7, 3, 2 }

If you want to remove the values from the original source, then do this:

var duplicates =
    new HashSet<int>(
        source
            .GroupBy(x => x)
            .Where(x => x.Skip(1).Any())
            .Select(x => x.Key));

source.RemoveAll(n => duplicates.Contains(n));

Here is an extension method for the List<T> class that removes from the list all the items that appear more than once:

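using System;
using System.Collections.Generic;
using System.Runtime.InteropServices; // CollectionsMarshal

// Extension methods must live in a static class; the class name is arbitrary.
public static class ListExtensions
{
    /// <summary>
    /// Removes all the elements that have a key that appears more than once,
    /// according to a specified key selector function.
    /// </summary>
    public static int RemoveDuplicatesByKey<TSource, TKey>(this List<TSource> list,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer = default)
    {
        ArgumentNullException.ThrowIfNull(list);
        ArgumentNullException.ThrowIfNull(keySelector);
        // Count the occurrences of each key in a single pass.
        Dictionary<TKey, int> occurrences = new(list.Count, comparer);
        foreach (TSource item in list)
            CollectionsMarshal.GetValueRefOrAddDefault(
                occurrences, keySelector(item), out _)++;
        // Remove every item whose key was counted more than once.
        return list.RemoveAll(item => occurrences[keySelector(item)] > 1);
    }
}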

The occurrences of each element are counted with a Dictionary<TKey, int>, using the CollectionsMarshal.GetValueRefOrAddDefault method (.NET 6) for efficiency.

Usage example:
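CollectionsMarshal.GetValueRefOrAddDefault and ArgumentNullException.ThrowIfNull are not available on older runtimes such as the .NET 3.5 mentioned in the question. A minimal sketch of the same count-then-remove idea using only Dictionary.TryGetValue and List.RemoveAll might look like this. The names ListDedupSketch and RemoveRepeated are hypothetical, and the sketch assumes the list contains no null items, since null cannot be a dictionary key.

```csharp
using System;
using System.Collections.Generic;

public static class ListDedupSketch
{
    // Hypothetical helper: removes, in place, every item whose value
    // occurs more than once, and returns the number of items removed.
    public static int RemoveRepeated<T>(List<T> list)
    {
        if (list == null) throw new ArgumentNullException("list");

        // First pass: count how many times each value occurs.
        Dictionary<T, int> counts = new Dictionary<T, int>();
        foreach (T item in list)
        {
            int n;
            counts.TryGetValue(item, out n); // n is 0 when the key is missing
            counts[item] = n + 1;
        }

        // Second pass: drop every item that was counted more than once.
        return list.RemoveAll(delegate(T item) { return counts[item] > 1; });
    }
}
```

Called on a List<int> holding { 4, 5, 7, 3, 5, 4, 2, 4 }, RemoveRepeated leaves { 7, 3, 2 } in the list and returns 5. It costs two dictionary lookups per item while counting instead of one, which is the overhead the .NET 6 version avoids.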

List<int> list = new() {4, 5, 7, 3, 5, 4, 2, 4};
Console.WriteLine($"Before: [{String.Join(", ", list)}]");
int removedCount = list.RemoveDuplicatesByKey(x => x);
Console.WriteLine($"After: [{String.Join(", ", list)}], Removed: {removedCount}");

Output:

Before: [4, 5, 7, 3, 5, 4, 2, 4]
After: [7, 3, 2], Removed: 5

Online Demo.

I have two options for you: one uses HashSet and the other uses LINQ.

Option 1:

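Using HashSet, loop through the collection and record every value that is seen a second time, then remove those values from the list. (Toggling a single set with Add/Remove is not enough: a value that occurs an odd number of times, such as 4 in the sample data, would end up back in the set.)

HashSet<int> seen = new HashSet<int>();
HashSet<int> repeated = new HashSet<int>();

foreach (var number in list)
{
    // Add returns false when the value is already in the set.
    if (!seen.Add(number)) repeated.Add(number);
}
// Remove every occurrence of a repeated value, in place.
list.RemoveAll(n => repeated.Contains(n));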

Option 2:

Simple LINQ: group the elements and keep only the groups whose count is 1.

list = list.GroupBy(g => g)
    .Where(g => g.Count() == 1)
    .Select(g => g.Key)
    .ToList();

There is a big performance gain from using HashSet over LINQ here: LINQ (in this case) has to build the groups and then iterate over them, whereas the HashSet version makes a simple pass over the list with O(1) lookups and insertions.

Elapsed Time (Using Linq): 8808 Ticks
Elapsed Time (Using HashSet): 51 Ticks

Working Demo
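The figures above appear to come from a single run each, so part of the gap may be one-time JIT and LINQ start-up cost rather than steady-state speed (this is an assumption, not something the answer states). A minimal sketch of a fairer comparison warms each function up once and averages over many runs. The names TimingSketch, OnceOnlyWithSets, OnceOnlyWithLinq and AverageTicks are hypothetical, and the numbers it produces will vary by machine and runtime.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class TimingSketch
{
    // Set-based pass: values seen a second time are recorded as repeated,
    // then filtered out of the result.
    public static List<int> OnceOnlyWithSets(List<int> source)
    {
        var seen = new HashSet<int>();
        var repeated = new HashSet<int>();
        foreach (int n in source)
        {
            if (!seen.Add(n)) repeated.Add(n);
        }
        return source.Where(n => !repeated.Contains(n)).ToList();
    }

    // LINQ pass: keep the keys of groups that contain exactly one element.
    public static List<int> OnceOnlyWithLinq(List<int> source)
    {
        return source.GroupBy(n => n)
                     .Where(g => !g.Skip(1).Any())
                     .Select(g => g.Key)
                     .ToList();
    }

    // Runs f once untimed as a warm-up, then returns the average
    // Stopwatch ticks per call over the given number of runs.
    public static long AverageTicks(Func<List<int>, List<int>> f, List<int> source, int runs)
    {
        f(source);
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < runs; i++) f(source);
        sw.Stop();
        return sw.ElapsedTicks / runs;
    }
}
```

Calling AverageTicks(TimingSketch.OnceOnlyWithSets, list, 1000) and the same with OnceOnlyWithLinq on a realistically sized list gives a comparison that is less sensitive to first-call overhead.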
