Compare two lists that contain a lot of objects (3rd part) "those objects have different types"

How could I speed up this LINQ query?

It takes a long time, and when I put a lot of objects in the lists I get an out-of-memory exception.

List<DirectoryInfo> directoriesThatWillBeCreated = new List<DirectoryInfo>();
// some code to fill the list
// ..
// ..

List<FileInfo> FilesThatWillBeCopied = new List<FileInfo>();
// some code to fill the list
//....

directoriesThatWillBeCreated = (from a in FilesThatWillBeCopied
                                from b in directoriesThatWillBeCreated
                                where a.FullName.Contains(b.FullName)
                                select b).ToList();

I hope I can do something like the previous solution, but I don't know how to do that when dealing with different types of objects. Do I have to create a new class, convert all the FileInfo and DirectoryInfo objects to that class, and then perform the query? Moreover, the FileInfo and DirectoryInfo classes are sealed, so I cannot inherit from them; I would have to create a new class, and that will not be too efficient. At least it would be more efficient than this query, because this query takes forever.

It's slow because the code does a linear search of the directory list for each file. Try this:

var dirlist = FilesThatWillBeCopied
    .Select(f => Directory.GetParent(f.FullName))
    .GroupBy(d => d.FullName)
    .Select(g => g.First())   // keep one DirectoryInfo per distinct parent path
    .ToList();

You may need to play with the syntax a little, but hopefully you see the point.

One thing you could do is change the Contains to a StartsWith. StartsWith will fail faster when the strings do not match.

directoriesThatWillBeCreated = (from a in FilesThatWillBeCopied
                                from b in directoriesThatWillBeCreated
                                where a.FullName.StartsWith(b.FullName)
                                select b).ToList();

This isn't a complete solution, though. If FilesThatWillBeCopied has M items and directoriesThatWillBeCreated has N items, your query still performs M×N string comparisons.

Another Option

Another optimization to try: iterate through directoriesThatWillBeCreated first, then select those that match any FileInfo in FilesThatWillBeCopied. By checking whether any file matches, you can stop testing files as soon as a match is found. That could be done like this (warning, notepad code follows):

directoriesThatWillBeCreated = directoriesThatWillBeCreated
    // Where keeps the matching directories; Any stops at the first matching file
    .Where(b => FilesThatWillBeCopied
        .Any(a => a.FullName.StartsWith(b.FullName)))
    .ToList();

I would suggest using HashSet<DirectoryInfo> for comparisons, but unfortunately DirectoryInfo doesn't implement proper equality comparisons, so strings will have to do. (Another option would be to implement your own IEqualityComparer<DirectoryInfo>, which is the comparer interface HashSet<T> accepts.) Also, you should use StringComparer.InvariantCultureIgnoreCase on the names unless you are sure that both collections use the same case.

var dirs = new HashSet<string>(StringComparer.InvariantCultureIgnoreCase);
// fill dirs

var files = new List<FileInfo>();
// fill files

var result = new HashSet<string>(StringComparer.InvariantCultureIgnoreCase);

foreach (var file in files)
{
    var dir = file.Directory;
    while (dir != null && !result.Contains(dir.FullName))
    {
        if (dirs.Contains(dir.FullName))
            result.Add(dir.FullName);
        dir = dir.Parent;
    }
}

This solution doesn't use LINQ at all, but that's often the case when you're after performance and the most straightforward LINQ solution is too slow.
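For the custom-comparer option mentioned above, here is a minimal sketch of what such a comparer might look like. The class name DirectoryInfoFullNameComparer is made up for illustration, and the sketch assumes that two directories count as equal when their FullName strings match ignoring case, which may not hold on case-sensitive file systems.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical comparer (the name is illustrative, not from the original answer).
// Assumption: two DirectoryInfo objects are "equal" when their FullName
// strings match, ignoring case.
public sealed class DirectoryInfoFullNameComparer : IEqualityComparer<DirectoryInfo>
{
    private static readonly StringComparer Names =
        StringComparer.InvariantCultureIgnoreCase;

    public bool Equals(DirectoryInfo x, DirectoryInfo y)
    {
        if (ReferenceEquals(x, y)) return true;      // same instance, or both null
        if (x is null || y is null) return false;    // exactly one is null
        return Names.Equals(x.FullName, y.FullName);
    }

    public int GetHashCode(DirectoryInfo obj)
    {
        // Must hash with the same comparer used in Equals so that
        // equal paths land in the same bucket.
        return obj is null ? 0 : Names.GetHashCode(obj.FullName);
    }
}
```

With something like this, new HashSet<DirectoryInfo>(new DirectoryInfoFullNameComparer()) could stand in for the HashSet<string> collections, at the cost of keeping the DirectoryInfo objects alive in the set.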
