Removing sequential repeating items from List<T> using LINQ

I'm looking for a way to prevent repeating items in a list but still preserve the order. For example

1, 2, 3, 4, 4, 4, 1, 1, 2, 3, 4, 4 

should become

1, 2, 3, 4, 1, 2, 3, 4

I've done it quite inelegantly using a for loop, checking the next item as follows

    public static List<T> RemoveSequencialRepeats<T>(List<T> input) 
    {
        var result = new List<T>();

        for (int index = 0; index < input.Count; index++)
        {
            if (index == input.Count - 1)
            {
                result.Add(input[index]);
            }
            else if (!input[index].Equals(input[index + 1]))
            {
                result.Add(input[index]);
            }
        }

        return result;
    }

Is there a more elegant way to do this, preferably with LINQ?

You can create an extension method:

public static IEnumerable<T> RemoveSequentialRepeats<T>(
      this IEnumerable<T> source)
{
    using (var iterator = source.GetEnumerator())
    {
        var comparer = EqualityComparer<T>.Default;

        if (!iterator.MoveNext())
            yield break;

        var current = iterator.Current;
        yield return current;

        while (iterator.MoveNext())
        {
            if (comparer.Equals(iterator.Current, current))
                continue;

            current = iterator.Current;
            yield return current;
        }
    }        
}

Usage:

var result = items.RemoveSequentialRepeats().ToList();

You can also use pure LINQ:

List<int> list = new List<int>{1, 2, 3, 4, 4, 4, 1, 1, 2, 3, 4, 4};
var result = list.Where((x, i) => i == 0 || x != list[i - 1]);

You could write simple LINQ:

var l = new int[] { 1, 2, 3, 4, 4, 4, 1, 1, 2, 3, 4, 4 };
var k = new Nullable<int>();
var nl = l.Where(x => { var res = x != k; k = x; return res; }).ToArray();

int[8] { 1, 2, 3, 4, 1, 2, 3, 4 }

Or a Pythonic (well, my best try) way:

l.Zip(l.Skip(1), (x, y) => new[] { x, y })
   .Where(z => z[0] != z[1]).Select(a => a[0])
   .Concat(new[] { l[l.Length - 1] }).ToArray()

int[8] { 1, 2, 3, 4, 1, 2, 3, 4 }

The simplest one (edit: I hadn't noticed that King King had already suggested this):

l.Where((x, i) => i == l.Length - 1 || x != l[i + 1]).ToArray()
int[8] { 1, 2, 3, 4, 1, 2, 3, 4 }

If you want a LINQ statement that does not rely on a captured variable inside the lambda, you'll need some construct with Aggregate, as it is the only method that carries a value along with the operation. For example, based on Zaheer Ahmed's code:

array.Aggregate(new List<string>(), 
     (items, element) => 
     {
        if (items.Count == 0 || items.Last() != element)
        {
            items.Add(element);
        }
        return items;
     });

Or you can even try to build the list without an if:

 array.Aggregate(Enumerable.Empty<string>(), 
    (items, element) => items.Concat(
       Enumerable.Repeat(element, 
           items.Count() == 0 || items.Last() != element ? 1:0 ))
    );

Note that to get reasonable performance from the Aggregate samples above you would also need to carry the last value (Last has to iterate the whole sequence on each step when the sequence is not an IList<T>, as in the Concat version), but code that carries three values {IsEmpty, LastValue, Sequence} in a Tuple looks very strange. These samples are here for entertainment purposes only.
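To make the "carry three values" idea concrete, here is a minimal sketch. It assumes C# 7 value tuples rather than the Tuple class mentioned above, and the class and method names (AggregateSketch, RemoveSequentialRepeats) are made up for illustration. The accumulator holds (IsEmpty, Last, Items), so each step compares against the carried value instead of asking the growing sequence for its last element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AggregateSketch
{
    // Hypothetical helper, not part of the original answer.
    public static List<T> RemoveSequentialRepeats<T>(IEnumerable<T> source)
    {
        var comparer = EqualityComparer<T>.Default;

        // The accumulator carries three values: whether anything has been
        // seen yet, the previous element, and the output list.
        var seed = (IsEmpty: true, Last: default(T), Items: new List<T>());

        return source.Aggregate(seed, (acc, element) =>
        {
            // Keep the element if it is the first one or differs from the previous one.
            if (acc.IsEmpty || !comparer.Equals(acc.Last, element))
            {
                acc.Items.Add(element);
            }

            // Pass the current element along as the new "last" value.
            return (false, element, acc.Items);
        }).Items;
    }
}
```

It still mutates the list inside the lambda, so it is no more "pure" than the first Aggregate sample; it only avoids re-walking the accumulated sequence.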

One more option is to Zip array with itself shifted by 1 and return elements that are not equal...

A more practical option is to build an iterator that filters values:

IEnumerable<string> NonRepeated(IEnumerable<string> values)
{
    string last = null;
    bool lastSet = false;

    foreach (var element in values)
    {
        if (!lastSet || last != element)
        {
            yield return element;
        }
        last = element;
        lastSet = true;
    }
}

If you really really hate the world, pure LINQ:

var nmbs = new int[] { 1, 2, 3, 4, 4, 4, 1, 1, 2, 3, 4, 4, 5 };
var res = nmbs
              .Take(1)
              .Concat(
                      nmbs.Skip(1)
                          .Zip(nmbs, (p, q) => new { prev = q, curr = p })
                          .Where(p => p.prev != p.curr)
                          .Select(p => p.curr));

But note that this enumerates the source (at least partially) three times: once for Take, once for the Skip(1) sequence on the left side of Zip, and once for the first argument of Zip. This method is slower than building a yield method or doing it directly in a loop.

Explanation:

  • You take the first number (.Take(1)).
  • You take all the numbers from the second one onwards (.Skip(1)) and pair them with all the numbers (.Zip(nmbs, ...)). We call curr the numbers from the first "collection" and prev the numbers from the second "collection" ((p, q) => new { prev = q, curr = p }). You then keep only the numbers that differ from the previous number (.Where(p => p.prev != p.curr)), and from these you take the curr value and discard the prev value (.Select(p => p.curr)).
  • You concatenate these two collections (.Concat(...)).

Check whether the last item of the new list differs from the current item; if it does, add the current item to the new list:

List<string> results = new List<string>();
results.Add(array.First());
foreach (var element in array)
{
    // List<T> exposes Count, not Length
    if (results[results.Count - 1] != element)
        results.Add(element);
}

Or using LINQ:

List<int> arr = new List<int>() { 1, 2, 3, 4, 4, 4, 1, 1, 2, 3, 4, 4 };
List<int> result = new List<int>() { arr.First() };
arr.Select(x =>
               {
                   // List<T> exposes Count, not Length
                   if (result[result.Count - 1] != x) result.Add(x);
                   return x;
               }).ToList();

Be sure to validate the input for null; note that First() also throws on an empty sequence.
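As a rough sketch of what that validation could look like, the loop version can be written so that a null input fails fast and an empty input simply yields an empty list. The class and method names here are hypothetical, and using EqualityComparer<T>.Default instead of != is an assumption made so the sketch works for any T.

```csharp
using System;
using System.Collections.Generic;

public static class ValidationSketch
{
    // Hypothetical helper, not part of the original answer.
    public static List<T> RemoveSequentialRepeats<T>(IEnumerable<T> source)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));

        var comparer = EqualityComparer<T>.Default;
        var results = new List<T>();

        foreach (var element in source)
        {
            // The first element is always kept; later ones only if they
            // differ from the last element already kept.
            if (results.Count == 0 || !comparer.Equals(results[results.Count - 1], element))
                results.Add(element);
        }

        return results;
    }
}
```

Checking results.Count == 0 inside the loop removes the need for First(), so there is no separate empty-sequence case to handle.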

Try this:

class Program
{
    static void Main(string[] args)
    {
        var input = "1, 2, 3, 4, 4, 4, 1, 1, 2, 3, 4, 4 ";
        var list = input.Split(',').Select(i => i.Trim());

        var result = list
            .Select((s, i) => 
                (s != list.Skip(i + 1).FirstOrDefault()) ? s : null)
            .Where(s => s != null)
            .ToList();
    }
}

Here is the code you need:

public static List<int> RemoveSequencialRepeats(List<int> input)
{
    var result = new List<int>();

    result.Add(input.First());
    // Where is evaluated lazily, so result.Last() sees items as they are added
    result.AddRange(input.Where(p_element => result.Last() != p_element));
    return result;
}

The LINQ magic is:

 result.Add(input.First());
 result.AddRange(input.Where(p_element => result.Last() != p_element));

Or you can create an extension method like this:

public static class Program
{

    static void Main(string[] args)
    {       
        List<int> numList=new List<int>(){1,2,2,2,4,5,3,2};

        numList = numList.RemoveSequentialRepeats();
    }

    public static List<T> RemoveSequentialRepeats<T>(this List<T> p_input)
    {
        var result = new List<T> { p_input.First() };

        result.AddRange(p_input.Where(p_element => !result.Last().Equals(p_element)));

        return result;
    }
}

If you feel like referencing an F# project you can write

let rec dedupe = function
  // drop x and keep y in play, so runs of three or more also collapse
  | x::y::rest when x = y -> dedupe (y::rest)
  | x::rest -> x::dedupe rest
  | _ -> []
