Delete part filename duplicates in C#

I have a directory filled with files whose names are partly duplicated,

e.g.

"afilename[a].txt"

"afilename[f].txt"

"afilename[j].txt"

I would like to delete all files that contain "afilename" except the first one, so that I am left with "afilename[a].txt".

I have the following code:

var duplicateNames = files.GroupBy(file => file.Name)
                          .Where(group => group.Count() > 1)
                          .Select(group => group.Key);

But is it possible to compare the filenames only up to the "[" and then return a list of the full file names, skipping the first file found?

Many thanks.

---------- Solution I used was: ----------

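using System;
using System.IO;
using System.Linq;

public class Program
{
    public static void Main()
    {
        // get all files
        string directoryPath = "C:\\temp\\dups";
        var allFiles = Directory.GetFiles(directoryPath, "*.mp3", SearchOption.AllDirectories);

        // only paths containing '[' take part; without this filter,
        // Except(toKeep) would also delete every file that has no '['
        var candidates = allFiles.Where(file => file.Contains('[')).ToList();

        // keep the smallest path in each group that shares the text before '['
        var toKeep = candidates
                 .GroupBy(file => file.Remove(file.IndexOf('[')))
                 .Select(group => group.Min())
                 .ToHashSet();

        var whatToDelete = candidates.Except(toKeep);

        foreach (var fileToDelete in whatToDelete)
        {
            System.IO.File.Delete(fileToDelete);
        }
    }
}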

Sure. Let's take only the files with [ in the name, group them by the part of the string before it, order each group, skip the first one, and expand the groups back into a list of names that should be removed:

// files is assumed here to be a sequence of file name strings
var duplicateNames = files.Where(file => file.Contains('['))
                      .GroupBy(file => file.Remove(file.IndexOf('[')))
                      .Where(group => group.Count() > 1)
                      .SelectMany(group => group.OrderBy(file => file).Skip(1));

The only thing you might need to think about is that filenames on Windows are not case sensitive, but string comparisons in C# are by default, so aFileName[a].txt and aFILEname[b].txt are probably a case of "b should be deleted", but this won't pick them up. Perhaps lowercase the result of the Remove when you group, and when you OrderBy.
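A minimal sketch of the case-insensitive variant follows. Instead of lowercasing, it passes StringComparer.OrdinalIgnoreCase to both GroupBy and OrderBy, which has a similar effect. The class and method names (DuplicateFinder, FindDuplicates) and the sample file names are made up for illustration, and the input is assumed to be plain file name strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateFinder
{
    // Hypothetical helper: returns the names that would be deleted, comparing
    // the text before '[' and ordering the names without regard to case.
    public static List<string> FindDuplicates(IEnumerable<string> files)
    {
        return files
            .Where(file => file.Contains("["))
            // Group on the text before '[' and ignore case when comparing keys
            .GroupBy(file => file.Remove(file.IndexOf('[')), StringComparer.OrdinalIgnoreCase)
            // Order each group ignoring case, then drop the first (kept) name
            .SelectMany(group => group.OrderBy(file => file, StringComparer.OrdinalIgnoreCase).Skip(1))
            .ToList();
    }

    public static void Main()
    {
        var files = new[] { "aFileName[a].txt", "aFILEname[b].txt", "other[x].txt", "plain.txt" };
        foreach (var name in FindDuplicates(files))
        {
            Console.WriteLine(name); // prints aFILEname[b].txt
        }
    }
}
```

A comparer avoids building lowercased copies of every name and leaves the returned names in their original casing, which matters if they are later passed to a delete call on a case-sensitive file system.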

Note: you've said "I want to remove all except the first", but you haven't directly said what you consider to be first. If your lists are already in the order you want, then you could ditch the ordering before the Skip, because LINQ generally preserves order, but I'd advise you to impose some order so that you can be sure of what will be deleted, rather than rely on some innate ordering of the input.
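To illustrate why an explicit order is safer, here is a small sketch that runs the same query with and without OrderBy on an input that is deliberately out of order. The helper names (SkipFirstAsGiven, SkipFirstOrdered) are hypothetical, and ordinal string order is only one possible definition of "first".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderingDemo
{
    // Hypothetical helper: relies on the input order to decide which name is "first".
    public static List<string> SkipFirstAsGiven(IEnumerable<string> files) =>
        files.Where(f => f.Contains("["))
             .GroupBy(f => f.Remove(f.IndexOf('[')))
             .SelectMany(g => g.Skip(1))
             .ToList();

    // Hypothetical helper: sorts each group ordinally, so the result does not
    // depend on the order in which the names arrive.
    public static List<string> SkipFirstOrdered(IEnumerable<string> files) =>
        files.Where(f => f.Contains("["))
             .GroupBy(f => f.Remove(f.IndexOf('[')))
             .SelectMany(g => g.OrderBy(f => f, StringComparer.Ordinal).Skip(1))
             .ToList();

    public static void Main()
    {
        var shuffled = new[] { "afilename[j].txt", "afilename[a].txt", "afilename[f].txt" };

        // Input order decides: the [j] file is kept and the [a] file would be deleted.
        Console.WriteLine(string.Join(", ", SkipFirstAsGiven(shuffled)));
        // prints afilename[a].txt, afilename[f].txt

        // Explicit order decides: the [a] file is kept.
        Console.WriteLine(string.Join(", ", SkipFirstOrdered(shuffled)));
        // prints afilename[f].txt, afilename[j].txt
    }
}
```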

Another approach, incidentally, might be easier:

// files is assumed here to be a sequence of file name strings
var toKeep = files.Where(file => file.Contains('['))
                 .GroupBy(file => file.Remove(file.IndexOf('[')))
                 .Select(group => group.Min())
                 .ToHashSet();

This will generate a hashset of the files you want to keep. You could then either do files.Except(toKeep) to generate the list to remove, or just loop over files, issuing delete commands for any file not in toKeep. The same case-sensitivity considerations apply to Min() as they do to GroupBy, in this latter example just like the first.
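The loop-and-delete variant could look roughly like the sketch below. It is written under several assumptions: all files sit in a single directory, only names containing [ are candidates for deletion (otherwise files without a bracket would be removed too, since they never enter the keep set), and case is ignored throughout. Min() is swapped for OrderBy(...).First() so that a comparer can be supplied. KeepSetDemo, BuildKeepSet, and the temporary directory are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class KeepSetDemo
{
    // Hypothetical helper: for each group sharing the text before '[',
    // keep the smallest name, ignoring case.
    public static HashSet<string> BuildKeepSet(IEnumerable<string> names)
    {
        var ignoreCase = StringComparer.OrdinalIgnoreCase;
        return new HashSet<string>(
            names.Where(n => n.Contains("["))
                 .GroupBy(n => n.Remove(n.IndexOf('[')), ignoreCase)
                 .Select(g => g.OrderBy(n => n, ignoreCase).First()),
            ignoreCase);
    }

    public static void Main()
    {
        // Work in a throwaway directory so nothing real is deleted.
        string dir = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(dir);
        foreach (var n in new[] { "afilename[a].txt", "afilename[f].txt", "afilename[j].txt", "notes.txt" })
        {
            File.WriteAllText(Path.Combine(dir, n), "");
        }

        var names = Directory.GetFiles(dir).Select(p => Path.GetFileName(p)).ToList();
        var toKeep = BuildKeepSet(names);

        // Only bracketed names are candidates; notes.txt is left alone.
        foreach (var n in names.Where(n => n.Contains("[") && !toKeep.Contains(n)))
        {
            File.Delete(Path.Combine(dir, n));
        }

        // List what survived: afilename[a].txt and notes.txt
        foreach (var path in Directory.GetFiles(dir).OrderBy(p => p, StringComparer.Ordinal))
        {
            Console.WriteLine(Path.GetFileName(path));
        }

        Directory.Delete(dir, true);
    }
}
```

When the files are spread over subdirectories, the group key would also need to include the directory part of the path, so that equal prefixes in different folders are not treated as duplicates of each other.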
