
Remove elements in a list considering duplicated subelements

I need to remove elements from a single list when one or more of their subelements duplicate a subelement of another element.

Classes

public class Person
{
    public int id { get; set; }
    public string name { get; set; }
    public List<IdentificationDocument> documents { get; set; }

    public Person()
    {
        documents = new List<IdentificationDocument>();
    }
}

public class IdentificationDocument
{
    public string number { get; set; }
}

Code:

        var person1 = new Person() {id = 1, name = "Bob" };
        var person2 = new Person() {id = 2, name = "Ted" };
        var person3 = new Person() {id = 3, name = "Will_1" };
        var person4 = new Person() {id = 4, name = "Will_2" };

        person1.documents.Add(new IdentificationDocument() { number = "123" });
        person2.documents.Add(new IdentificationDocument() { number = "456" });
        person3.documents.Add(new IdentificationDocument() { number = "789" });
        person4.documents.Add(new IdentificationDocument() { number = "789" }); //duplicate

        var personList1 = new List<Person>();

        personList1.Add(person1);
        personList1.Add(person2);
        personList1.Add(person3);
        personList1.Add(person4);

        //more data for performance test
        for (int i = 0; i < 20000; i++)
        {
            var personx = new Person() { id = i, name = Guid.NewGuid().ToString() };
            personx.documents.Add(new IdentificationDocument() { number = Guid.NewGuid().ToString() });
            personx.documents.Add(new IdentificationDocument() { number = Guid.NewGuid().ToString() });
            personList1.Add(personx);
        }

        var result = //Here comes the linq query

        result.ForEach(r => Console.WriteLine(r.id + " " +r.name));

Expected result:

1 Bob
2 Ted
3 Will_1

Example

https://dotnetfiddle.net/LbPLcP

Thank you!

You can use the Enumerable.Distinct<TSource> method from LINQ. You'll need to create a custom comparer to compare using the subelement.

See How do I use a custom comparer with the Linq Distinct method?
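As a rough sketch of what such a comparer could look like (the name SharedDocumentComparer and the helpers Make and BuildSample are made up for this illustration, not part of the question): two persons are treated as equal when they share at least one document number. Since no hash code can be derived consistently from "shares any number", GetHashCode returns a constant, which forces Distinct to compare pairwise and will be slow on the 20,000-person test data. "Shares a document" is also not a transitive relation, so the outcome can depend on the order of the list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class IdentificationDocument
{
    public string number { get; set; }
}

public class Person
{
    public int id { get; set; }
    public string name { get; set; }
    public List<IdentificationDocument> documents { get; set; }

    public Person()
    {
        documents = new List<IdentificationDocument>();
    }
}

// Hypothetical comparer: persons are "equal" if they share any document number.
public class SharedDocumentComparer : IEqualityComparer<Person>
{
    public bool Equals(Person x, Person y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.documents.Select(d => d.number)
                .Intersect(y.documents.Select(d => d.number))
                .Any();
    }

    // Constant hash: every pair ends up being checked with Equals (quadratic cost).
    public int GetHashCode(Person obj)
    {
        return 0;
    }
}

public static class Program
{
    // Helper for this sketch only: builds a person with the given document numbers.
    public static Person Make(int id, string name, params string[] numbers)
    {
        var p = new Person { id = id, name = name };
        foreach (var n in numbers)
            p.documents.Add(new IdentificationDocument { number = n });
        return p;
    }

    // The four sample persons from the question.
    public static List<Person> BuildSample()
    {
        return new List<Person>
        {
            Make(1, "Bob", "123"),
            Make(2, "Ted", "456"),
            Make(3, "Will_1", "789"),
            Make(4, "Will_2", "789") // duplicate of Will_1's document
        };
    }

    public static void Main()
    {
        var result = BuildSample().Distinct(new SharedDocumentComparer()).ToList();
        result.ForEach(r => Console.WriteLine(r.id + " " + r.name));
    }
}
```

On the four sample persons this should leave Bob, Ted and Will_1, since Will_2 shares "789" with Will_1.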

Well, yes, you could use a custom comparer, but that is a lot more code than your specific example requires. If your specific example is all you need, this will work fine:

var personDocumentPairs = personList1
    .SelectMany(e => e.documents.Select(t => new {person = e, document = t}))
    .GroupBy(e => e.document.number).Select(e => e.First());
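// ToList() is needed because the question calls result.ForEach, which exists on List<T> only
var result = personDocumentPairs.Select(e => e.person).Distinct().ToList();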
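One thing to be aware of with the grouping above: a person is kept as long as they are the first holder of at least one document number, so someone holding "789" (already taken) plus a fresh "999" would still be kept. If you want the stricter rule "drop the person as soon as any of their numbers was already seen", a single pass with a HashSet is one way to do it. This is an alternative sketch, not what the query above does; RemoveSharedDocuments, Make and BuildSample are names made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class IdentificationDocument
{
    public string number { get; set; }
}

public class Person
{
    public int id { get; set; }
    public string name { get; set; }
    public List<IdentificationDocument> documents { get; set; }

    public Person()
    {
        documents = new List<IdentificationDocument>();
    }
}

public static class Program
{
    // Keeps a person only if none of their document numbers belongs to
    // a person kept earlier in the list (assumed rule: first come, first kept).
    public static List<Person> RemoveSharedDocuments(IEnumerable<Person> persons)
    {
        var seen = new HashSet<string>();
        var kept = new List<Person>();
        foreach (var p in persons)
        {
            var numbers = p.documents.Select(d => d.number).ToList();
            if (numbers.Any(n => seen.Contains(n)))
                continue; // at least one number is already taken
            foreach (var n in numbers)
                seen.Add(n);
            kept.Add(p);
        }
        return kept;
    }

    // Helper for this sketch only: builds a person with the given document numbers.
    public static Person Make(int id, string name, params string[] numbers)
    {
        var p = new Person { id = id, name = name };
        foreach (var n in numbers)
            p.documents.Add(new IdentificationDocument { number = n });
        return p;
    }

    // The four sample persons from the question.
    public static List<Person> BuildSample()
    {
        return new List<Person>
        {
            Make(1, "Bob", "123"),
            Make(2, "Ted", "456"),
            Make(3, "Will_1", "789"),
            Make(4, "Will_2", "789") // duplicate of Will_1's document
        };
    }

    public static void Main()
    {
        var result = RemoveSharedDocuments(BuildSample());
        result.ForEach(r => Console.WriteLine(r.id + " " + r.name));
    }
}
```

Each document number is looked up once, so this stays roughly linear in the total number of documents, which matters for the 20,000-person test loop.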

Along the lines of Adam's solution, the trick is to iterate over the persons and group them by their associated document numbers.

// persons whose document is already assigned to an earlier person
// yields: Will_2
var duplicate = from person in personList1
                from document in person.documents
                group person by document.number into groupings
                let counter = groupings.Count()
                where counter > 1
                from person in groupings
                    .OrderBy(p => p.id)
                    .Skip(1)
                select person;

// persons who are the first holder (lowest id) of a document
// yields: Bob, Ted, Will_1
var distinct = from person in personList1
               from document in person.documents
               group person by document.number into groupings
               from person in groupings
                   .OrderBy(p => p.id)
                   .Take(1)
               select person;

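The OrderBy is a made-up rule for deciding which person keeps an already assigned document (here the lowest id wins); your rule may differ. Note that a person with several documents lands in one group per document number, so they appear in the distinct query once per document; append .Distinct() if each person should be listed only once.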
