
Fastest way to filter a list based on a dictionary

I am stuck on the performance of filtering a list against a Dictionary and need help finding the fastest way to do it.

I have a list of strings (I also tried a HashSet and a Dictionary) that I need to filter based on the contents of another data structure, a Dictionary in this case. I need all the values that are not in the Dictionary. The list has about 300,000 items and the Dictionary about 200,000. The following code takes a very long time. How can I improve this operation?

Dictionary<string, string> dictLocalFile - this has around 200000 entries.

var finalFilesHashSet = 
    new HashSet<string>(
        FinalFilesList
       .Where(x => !dictLocalFile.Any(kvp => kvp.Key.Equals(Path.GetFileName(x)))));

You are not using the dictionary efficiently. You want to check that the dictionary does not contain a key, so change your code to use the .ContainsKey method. You could also try calling Path.GetFileName(x) in advance, outside the filtering loop, and measure its impact.

var finalFilesHashSet = new HashSet<string>(
    FinalFilesList.Where(x => !dictLocalFile.ContainsKey(Path.GetFileName(x))));
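
One possible way to measure that impact is sketched below. The class FilterTiming and its methods WithFileNames, FilterMissing, and Measure are hypothetical names introduced for illustration. Only Stopwatch, Path.GetFileName, and ContainsKey come from the framework. The sketch extracts the file names once, filters on the precomputed names, and times each step separately.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

// Hypothetical helper, not part of the original question or answer.
public static class FilterTiming
{
    // Extracts the file name of every path once, up front.
    public static List<(string FullPath, string FileName)> WithFileNames(
        IEnumerable<string> paths)
    {
        return paths.Select(p => (FullPath: p, FileName: Path.GetFileName(p))).ToList();
    }

    // Keeps the paths whose precomputed file name is not a dictionary key.
    public static HashSet<string> FilterMissing(
        IEnumerable<(string FullPath, string FileName)> items,
        Dictionary<string, string> localFiles)
    {
        return new HashSet<string>(
            items.Where(i => !localFiles.ContainsKey(i.FileName))
                 .Select(i => i.FullPath));
    }

    // Runs a step once and returns its result with the elapsed wall-clock time.
    public static (T Result, TimeSpan Elapsed) Measure<T>(Func<T> step)
    {
        var stopwatch = Stopwatch.StartNew();
        T result = step();
        stopwatch.Stop();
        return (result, stopwatch.Elapsed);
    }
}
```

Timing WithFileNames and FilterMissing separately with Measure should show how much of the total goes to Path.GetFileName and how much to the lookups. A single run is only a rough indication, so repeat the measurement before drawing conclusions.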

The code you wrote throws away the main performance advantage a dictionary gives you: lookups by key that are close to O(1). Instead, it treats the dictionary as a plain enumerable and iterates over its contents for every item in the list. The built-in ContainsKey method is essentially a hash table lookup.
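
To make the cost of the scan concrete, the sketch below counts the key comparisons a scan-based filter performs. LookupCost, ContainsByScan, and CountScanComparisons are hypothetical names used only for this illustration. ContainsByScan imitates what dictLocalFile.Any(kvp => kvp.Key.Equals(name)) does: it walks the entries one by one until it finds a match. With about 300,000 paths and 200,000 keys, paths whose names are absent each trigger a full scan, so the worst case is roughly 300,000 × 200,000 = 6 × 10^10 comparisons, against about 300,000 hash lookups with ContainsKey.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the original question or answer.
public static class LookupCost
{
    // Imitates dict.Any(kvp => kvp.Key.Equals(name)): walks the entries in
    // order and counts every key compared before a match (or the end).
    public static bool ContainsByScan(
        Dictionary<string, string> dict, string name, ref long comparisons)
    {
        foreach (var kvp in dict)
        {
            comparisons++;
            if (kvp.Key.Equals(name))
            {
                return true;
            }
        }
        return false;
    }

    // Total key comparisons a scan-based filter makes over all paths.
    public static long CountScanComparisons(
        IEnumerable<string> paths, Dictionary<string, string> dict)
    {
        long comparisons = 0;
        foreach (var path in paths)
        {
            ContainsByScan(dict, Path.GetFileName(path), ref comparisons);
        }
        return comparisons;
    }
}
```

The count grows with the product of the two collection sizes, which is why the original query slows down so sharply at this scale.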
