
C#: return a generic list of objects using LINQ

I have a generic list that looks like this:

List<PicInfo> pi = new List<PicInfo>();

PicInfo is a class that looks like this:

[ProtoContract]
public class PicInfo
{
        [ProtoMember(1)]
        public string fileName { get; set; }
        [ProtoMember(2)]
        public string completeFileName { get; set; }
        [ProtoMember(3)]
        public string filePath { get; set; }
        [ProtoMember(4)]
        public byte[] hashValue { get; set; }

        public PicInfo() { } 
}

What I'm trying to do is:

  • first, filter the list for duplicate file names and return the duplicate objects;
  • then, filter the returned list for duplicate hash values.

I can only find examples of how to do this that return anonymous types, but I need the result to be a generic list.

If someone can help me out, I'd appreciate it. Please also explain your code; it's a learning process for me.

Thanks in advance!

[EDIT]

The generic list contains objects that represent pictures. Every picture has a file name and a hash value (and some more data that is irrelevant at this point). Some pictures have the same name (duplicate file names), and I want to get a list of the pictures with duplicate file names from this generic list 'pi'.

Those pictures also have a hash value. From the pictures whose file names are identical, I want another list of those that also have identical hash values.

[/EDIT]

Something like this should work. I am not sure whether it is the best method: it is not very efficient, because for each element it iterates through the whole list again to get the count (so the work grows quadratically with the list length).

List<PicInfo> pi = new List<PicInfo>();
IEnumerable<PicInfo> filt = pi.Where(x=>pi.Count(z=>z.fileName==x.fileName)>1);
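If the repeated counting becomes a problem on large lists, one possible alternative is to count every file name once and then filter against those counts. This is only a sketch: the class and method names (DuplicateNames, FindByName) are made up for illustration, PicInfo is cut down to the two relevant properties with the protobuf attributes left out, and fileName is assumed never to be null.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's PicInfo (protobuf attributes omitted).
public class PicInfo
{
    public string fileName { get; set; }
    public byte[] hashValue { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class DuplicateNames
{
    // Returns every item whose fileName occurs more than once in the list.
    // Assumes fileName is never null (ToDictionary rejects null keys).
    public static List<PicInfo> FindByName(List<PicInfo> pi)
    {
        // First pass: count how often each file name occurs.
        Dictionary<string, int> counts = pi
            .GroupBy(x => x.fileName)
            .ToDictionary(g => g.Key, g => g.Count());

        // Second pass: keep the items whose name was counted more than once.
        return pi.Where(x => counts[x.fileName] > 1).ToList();
    }
}
```

The result is the same set of objects as the Where/Count version, in the same order, but the list is only walked twice instead of once per element.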

I hope the code isn't so complicated that it needs explaining. I always think it's best to work it out on your own anyway, but if anything is confusing then just ask and I'll explain.

If you want the second filter to be filtering for the same filename and same hash being a duplicate then you just need to extend the lambda in the Count to check against hash too.

Obviously if you just want filenames at the end then it is easy enough to do a Select to get just an enumerable list of those filenames, possibly with a Distinct if you only want them to appear once.
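As a rough illustration of that last point, the Select and Distinct could be chained onto the filter like this. The names DuplicateNameList and GetNames are hypothetical, and PicInfo is reduced to the two properties that matter here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's PicInfo (protobuf attributes omitted).
public class PicInfo
{
    public string fileName { get; set; }
    public byte[] hashValue { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class DuplicateNameList
{
    // Returns each duplicated file name once, as plain strings.
    public static List<string> GetNames(List<PicInfo> pi)
    {
        return pi
            .Where(x => pi.Count(z => z.fileName == x.fileName) > 1) // items with a duplicated name
            .Select(x => x.fileName)                                  // keep only the name
            .Distinct()                                               // each name once
            .ToList();
    }
}
```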

NB. Code written by hand so do forgive typos. May not compile first time, etc. ;-)

Edit to explain code - spoilers! ;-)

In English, what we want to do is the following:

For each item in the list, we want to select it if and only if there is more than one item in the list with the same filename.

Breaking this down: to iterate over the list and select things based on a criterion, we use the Where method. The condition of our Where method is

there is more than one item in the list with the same filename

For this we clearly need to count the list, so we use pi.Count. However, we only want to count items whose filename matches, so we pass in an expression to tell it to count only those.

The expression will work on each item of the list and return true if we want to count it and false if we don't want to.

The filename we are interested in is on x, the item we are filtering. So we want to count how many items have a filename the same as x.fileName. Thus our expression is z=>z.fileName==x.fileName . So z is our variable in this expression, and x.fileName in this context is unchanging as we iterate over z.

We then of course add our criterion of >1 to get the boolean value we want.
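If the lambdas are hard to read at first, the same logic can be spelled out with two plain loops. This is just an equivalent sketch to show what Where and Count are doing; the names LoopVersion and FindByNameWithLoops are invented for the example, and PicInfo is reduced to the relevant properties.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the question's PicInfo (protobuf attributes omitted).
public class PicInfo
{
    public string fileName { get; set; }
    public byte[] hashValue { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class LoopVersion
{
    public static List<PicInfo> FindByNameWithLoops(List<PicInfo> pi)
    {
        var result = new List<PicInfo>();
        foreach (PicInfo x in pi)            // plays the role of Where: x is the item being tested
        {
            int count = 0;
            foreach (PicInfo z in pi)        // plays the role of Count: z walks the whole list
            {
                if (z.fileName == x.fileName) count++;
            }
            if (count > 1) result.Add(x);    // the ">1" criterion
        }
        return result;
    }
}
```

Note that x is always counted once against itself, which is why the test is >1 rather than >0.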

If you wanted those that are duplicates when considering both the filename and the hash value, then you would expand the expression in the Count to compare hashValue as well. Since hashValue is a byte array, == would only compare references, so you will want to use the SequenceEqual method to compare the arrays value by value.

So your final code to get the duplicates on both values would be:

List<PicInfo> pi = new List<PicInfo>();
List<PicInfo> filt = pi.Where(x=>pi.Count(z=>z.fileName==x.fileName && z.hashValue.SequenceEqual(x.hashValue))>1).ToList();

Note that I didn't create the intermediary list and just went straight from the original list. You could go from the intermediate list but the code would be much the same if going from the original as from a filtered list.
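For completeness, here is roughly what the two-step version with an intermediate list might look like, since that is how the question was phrased. The names TwoStepFilter, SameName and SameNameAndHash are hypothetical, PicInfo is reduced to the relevant properties, and hashValue is assumed never to be null (SequenceEqual throws on a null array).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's PicInfo (protobuf attributes omitted).
public class PicInfo
{
    public string fileName { get; set; }
    public byte[] hashValue { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class TwoStepFilter
{
    // Step 1: every item whose file name occurs more than once.
    public static List<PicInfo> SameName(List<PicInfo> pi)
    {
        return pi.Where(x => pi.Count(z => z.fileName == x.fileName) > 1).ToList();
    }

    // Step 2: from the step 1 result, the items that also share their hash
    // with another item of the same name. Assumes hashValue is never null.
    public static List<PicInfo> SameNameAndHash(List<PicInfo> sameName)
    {
        return sameName
            .Where(x => sameName.Count(z => z.fileName == x.fileName
                                         && z.hashValue.SequenceEqual(x.hashValue)) > 1)
            .ToList();
    }
}
```

Calling TwoStepFilter.SameNameAndHash(TwoStepFilter.SameName(pi)) should give the same objects as the single expression, and you keep the name-only list around if you need it.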

I think you have to use the SequenceEqual method for finding duplicates (http://msdn.microsoft.com/ru-ru/library/bb348567.aspx). To filter, use

        var p = pi.GroupBy(rs => rs.fileName) // group by name
            .Where(rs => rs.Count() > 1) // keep groups with more than one element
            .Select(rs => rs.First()) // select the first element from each group
            .GroupBy(rs => Convert.ToBase64String(rs.hashValue)) // group by hash content; a byte[] key would compare by reference
            .Where(rs => rs.Count() > 1) // keep groups with more than one element
            .Select(rs => rs.First()) // select the first element from each group
            .ToList(); // materialize the result as a List<PicInfo>
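If you want every duplicate object back rather than one representative per group, a grouping on both values at once is one way to do it. This is a sketch under some assumptions: the names GroupedDuplicates and FindByNameAndHash are invented, PicInfo is reduced to the relevant properties, hashValue is assumed never to be null, and the hash bytes are turned into a Base64 string purely so that the key compares by content instead of by array reference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's PicInfo (protobuf attributes omitted).
public class PicInfo
{
    public string fileName { get; set; }
    public byte[] hashValue { get; set; }
}

// Hypothetical helper, not part of the original answers.
public static class GroupedDuplicates
{
    // Returns all items that share both file name and hash content with
    // at least one other item. Assumes hashValue is never null.
    public static List<PicInfo> FindByNameAndHash(List<PicInfo> pi)
    {
        return pi
            // Anonymous types compare by value, so this acts as a composite key.
            .GroupBy(p => new { p.fileName, Hash = Convert.ToBase64String(p.hashValue) })
            .Where(g => g.Count() > 1)   // keep only groups with duplicates
            .SelectMany(g => g)          // flatten the groups back into PicInfo objects
            .ToList();
    }
}
```

The anonymous type only exists as the grouping key; the list that comes out is still a List<PicInfo>, which is what the question asked for.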
