
Filtering a list based on a dictionary in the fastest way

I am stuck on the performance of filtering a list based on a Dictionary and need help identifying the fastest way to do it.

I have a list of strings (I also tried a HashSet and a Dictionary) that I need to filter based on the contents of another data structure, a Dictionary in this case. I need all the strings that are not in the Dictionary. The list has around 300,000 items and the Dictionary has around 200,000 entries. With the following code, the operation takes a very long time. How can I improve it?

Dictionary<string, string> dictLocalFile - this has around 200000 entries.

var finalFilesHashSet = 
    new HashSet<string>(
        FinalFilesList
       .Where(x => !dictLocalFile.Any(kvp => kvp.Key.Equals(Path.GetFileName(x)))));

You are not using the dictionary efficiently. You want to check whether the dictionary does not contain a key, so change your code to use the .ContainsKey method. You should also probably call Path.GetFileName(x) in advance, outside that loop, and measure its impact.

var finalFilesHashSet = new HashSet<string>(
    FinalFilesList.Where(x => !dictLocalFile.ContainsKey(Path.GetFileName(x))));
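
To measure the impact of Path.GetFileName separately from the lookup, something along these lines could work. This is only a sketch: the FilterDemo class, the ExtractNames and FilterMissing helpers, and the generated sample data are hypothetical and not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;

public static class FilterDemo
{
    // Hypothetical helper: pair every path with its file name, computed once per path.
    public static List<(string FullPath, string Name)> ExtractNames(IEnumerable<string> paths)
    {
        return paths.Select(p => (FullPath: p, Name: Path.GetFileName(p))).ToList();
    }

    // Hypothetical helper: keep the paths whose file name is NOT a key in the dictionary.
    public static HashSet<string> FilterMissing(
        IEnumerable<(string FullPath, string Name)> files,
        Dictionary<string, string> localFiles)
    {
        return new HashSet<string>(
            files.Where(f => !localFiles.ContainsKey(f.Name))
                 .Select(f => f.FullPath));
    }

    public static void Main()
    {
        // Made-up data, sized like the question: 300000 paths, 200000 dictionary entries.
        List<string> finalFilesList = Enumerable.Range(0, 300000)
            .Select(i => "some/dir/file" + i + ".txt")
            .ToList();
        Dictionary<string, string> dictLocalFile = Enumerable.Range(0, 200000)
            .ToDictionary(i => "file" + i + ".txt", i => "value");

        // Time the file name extraction on its own.
        Stopwatch sw = Stopwatch.StartNew();
        var names = ExtractNames(finalFilesList);
        long extractMs = sw.ElapsedMilliseconds;

        // Time the dictionary lookups on their own.
        sw.Restart();
        HashSet<string> finalFilesHashSet = FilterMissing(names, dictLocalFile);
        long lookupMs = sw.ElapsedMilliseconds;

        Console.WriteLine("GetFileName: " + extractMs + " ms, lookups: " + lookupMs + " ms");
        Console.WriteLine("Files not in dictionary: " + finalFilesHashSet.Count);
    }
}
```

Note that in the original query Path.GetFileName(x) sits inside the inner lambda, so it runs once per dictionary entry for every list item; in the ContainsKey version it already runs only once per list item.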

The code you wrote with the dictionary basically eliminates the performance advantage a dictionary gives you: O(1) get operations. Instead, you treat it as an enumerable and iterate through the dictionary's contents. The built-in method is essentially a hash table lookup.
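
The difference between the two approaches can be illustrated roughly as follows. This is a simplified sketch, not the actual LINQ or Dictionary implementation; the LookupComparison class and both helper methods are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LookupComparison
{
    // Roughly what dict.Any(kvp => kvp.Key.Equals(name)) amounts to:
    // walk the entries one by one until a match is found, O(n) per call.
    public static bool ContainsByScan(Dictionary<string, string> dict, string name)
    {
        foreach (KeyValuePair<string, string> kvp in dict)
        {
            if (kvp.Key.Equals(name))
            {
                return true;
            }
        }
        return false;
    }

    // Hash table lookup: O(1) on average, independent of the dictionary size.
    public static bool ContainsByHash(Dictionary<string, string> dict, string name)
    {
        return dict.ContainsKey(name);
    }

    public static void Main()
    {
        var dict = new Dictionary<string, string>
        {
            ["a.txt"] = "1",
            ["b.txt"] = "2"
        };

        // Both approaches give the same answer; only the cost differs.
        Console.WriteLine(ContainsByScan(dict, "b.txt") == ContainsByHash(dict, "b.txt"));
        Console.WriteLine(ContainsByScan(dict, "c.txt") == ContainsByHash(dict, "c.txt"));
    }
}
```

With the sizes from the question, every list item that is missing from the dictionary forces a full scan of about 200,000 entries, so the scan version can approach 300,000 × 200,000 = 6 × 10^10 key comparisons, while the hash version performs about 300,000 lookups.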
