Remove duplicated elements from a List<String>

I would like to remove the duplicate elements from a List. Some elements of the list look like this:

Book  23
Book  22
Book  19
Notebook 22
Notebook 19
Pen 23
Pen 22
Pen 19

To get rid of duplicate elements I've done this:

List<String> nodup = dup.Distinct().ToList();

I would like to keep in the list just

Book 23
Notebook 22
Pen 23

How can I do that?

You can do something like

string firstElement = dup.Distinct().ToList().First();

and add it to another list if you want.

It's not 100% clear what you want here - however...

If you want to keep the "largest" number in the list, you could do:

List<string> noDup = dup.Select(s => s.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries))
        .Select(p => new { Name=p[0], Val=int.Parse(p[1]) })
        .GroupBy(p => p.Name)
        .Select(g => string.Join(" ", g.Key, g.Max(p => p.Val).ToString()))
        .ToList();

This would transform the List<string> by parsing the numeric portion into a number, taking the max per item, and creating the output string as you have specified.
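
If the list might contain lines that do not follow the "name number" pattern, int.Parse will throw. The following is a minimal sketch of the same split/group/max idea that skips such lines instead. The class and method names (MaxPerName, KeepHighest) are hypothetical and not part of the original answer, and silently dropping malformed lines is an assumption about the desired behaviour.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not from the original answer.
public static class MaxPerName
{
    // Returns one "name number" string per name, keeping the largest number.
    // Lines that do not split into exactly two parts, or whose second part
    // is not an integer, are skipped (an assumption, adjust as needed).
    public static List<string> KeepHighest(IEnumerable<string> lines)
    {
        return lines
            .Select(s => s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
            .Where(p => p.Length == 2)
            .Select(p =>
            {
                int val;
                bool ok = int.TryParse(p[1], out val);
                return new { Name = p[0], Val = val, Ok = ok };
            })
            .Where(x => x.Ok)
            .GroupBy(x => x.Name)
            .Select(g => g.Key + " " + g.Max(x => x.Val))
            .ToList();
    }
}
```

Calling MaxPerName.KeepHighest(dup) on the list from the question should give the same three entries as the query above, since GroupBy yields groups in the order their keys first appear.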

You can use LINQ in combination with some String operations to group all your items by name and take MAX(Number):

var q = from str in list
        let Parts = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
        let item = Parts[ 0 ]
        let num = int.Parse(Parts[ 1 ])
        group new  { Name = item, Number = num } by item into Grp
        select new {
            Name  = Grp.Key,
            Value = Grp.Max(i => i.Number).ToString()
        };

var highestGroups = q.Select(g => 
    String.Format("{0} {1}", g.Name, g.Value)).ToList();

(Same as Reed's approach, but in query syntax, which I find more readable.)

Edit: I cannot reproduce your comment that it does not work. Here is sample data:

List<String> list = new List<String>();
list.Add("Book  23");
list.Add("Book  22");
list.Add("Book 19");
list.Add("Notebook  23");
list.Add("Notebook  22");
list.Add("Notebook  19");
list.Add("Pen  23");
list.Add("Pen  22");
list.Add("Pen  19");
list.Add("sheet 3");

var q = from str in list
        let Parts = str.Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
        let item = Parts[ 0 ]
        let num = int.Parse(Parts[ 1 ])
        group new  { Name = item, Number = num } by item into Grp
        select new {
            Name  = Grp.Key,
            Value = Grp.Max(i => i.Number).ToString()
        };

var highestGroups = q.Select(g => String.Format("{0} {1}", g.Name, g.Value));
MessageBox.Show(String.Join(Environment.NewLine, highestGroups));

The result:

Book 23
Notebook 23
Pen 23
sheet 3

You may want to add a custom comparer as a parameter, as you can see in the example on MSDN.

In this example I assumed Foo is a class with two members.

class Program
{
    static void Main(string[] args)
    {
        var list = new List<Foo>()
        {
            new Foo("Book", 23),
            new Foo("Book", 22),
            new Foo("Book", 19)
        };

        foreach(var element in list.Distinct(new Comparer()))
        {
            Console.WriteLine(element.Type + " " + element.Value);
        }
    }
}

public class Foo
{
    public Foo(string type, int value)
    {
        this.Type = type;
        this.Value = value;
    }

    public string Type { get; private set; }

    public int Value { get; private set; }
}

public class Comparer : IEqualityComparer<Foo>
{
    public bool Equals(Foo x, Foo y)
    {
        if(x == null || y == null)
            return x == y;
        else
            return x.Type == y.Type;
    }

    public int GetHashCode(Foo obj)
    {
        return obj.Type.GetHashCode();
    }
}

This works on an IList, assuming that we want the first item of each type, not the one with the highest number. Be careful with other collection types (like ICollection or IEnumerable), as they do not guarantee any order. Therefore any of the Foos may remain after the Distinct.
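
The choice of which item survives can also be made explicit with GroupBy rather than left to Distinct. The sketch below is an illustration under assumptions, not part of the original answer: it uses value tuples (C# 7 or later) in place of Foo to stay self-contained, and the names PerTypeSelection, FirstPerType and HighestPerType are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helpers; (Type, Value) tuples stand in for the Foo class.
public static class PerTypeSelection
{
    // Keeps the first item encountered for each Type, in source order.
    public static List<(string Type, int Value)> FirstPerType(
        IEnumerable<(string Type, int Value)> items)
    {
        return items.GroupBy(i => i.Type)
                    .Select(g => g.First())
                    .ToList();
    }

    // Keeps the item with the highest Value for each Type.
    public static List<(string Type, int Value)> HighestPerType(
        IEnumerable<(string Type, int Value)> items)
    {
        return items.GroupBy(i => i.Type)
                    .Select(g => g.OrderByDescending(i => i.Value).First())
                    .ToList();
    }
}
```

Enumerable.GroupBy is documented to yield groups in the order their keys first appear and to keep elements within each group in source order, so FirstPerType does not depend on how Distinct happens to be implemented.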

You could also override both Equals and GetHashCode of Foo instead of using a custom IEqualityComparer. However, I would not actually recommend this for a local distinct. Consumers of your class may not recognize that two instances with the same value for Type are always equal, regardless of their Value.

A bit old-fashioned, but it should work, if I understand correctly:

    Dictionary<string,int> dict=new Dictionary<string,int>();

    // Assumes input is a string holding one key/value pair per line, separated by spaces, with no whitespace inside the key
    input=input.Replace("\r\n","\n");
    string[] lines=input.Split('\n');

    // Break into categories and find the largest number in each
    foreach(string line in lines)
    {
        // RemoveEmptyEntries copes with more than one space between key and value
        string[] parts=line.Split(new[] {' '}, StringSplitOptions.RemoveEmptyEntries);
        string key=parts[0];
        int value=Convert.ToInt32(parts[1]);

        if (!dict.ContainsKey(key))
        {
            // First time this key is seen
            dict.Add(key, value);
        }
        else if (dict[key]<value)
        {
            // Keep the larger value
            dict[key]=value;
        }
    }

    // do something with dict
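
One thing to do with the dictionary is turn it back into the "name number" strings the question asks for. This is a minimal sketch, assuming the output should be a List<string>; the names DictOutput and ToLines are hypothetical. Because Dictionary does not guarantee an enumeration order, the sketch sorts by key to make the result predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not from the original answer.
public static class DictOutput
{
    // Formats each entry as "key value", sorted by key (ordinal comparison).
    public static List<string> ToLines(Dictionary<string, int> dict)
    {
        return dict.OrderBy(kv => kv.Key, StringComparer.Ordinal)
                   .Select(kv => kv.Key + " " + kv.Value)
                   .ToList();
    }
}
```

With the dictionary built above, List<string> nodup = DictOutput.ToLines(dict); would give one entry per name.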
