Sanitizing a file path in C# without compromising the drive letter

I need to process some file paths in C# that potentially contain illegal characters, for example:

C:\path\something\output_at_13:26:43.txt

In that path, the colons in the timestamp make the filename invalid, and I want to replace them with another safe character.

I've searched for solutions here on SO, but they all seem to be based on something like:

path = string.Join("_", path.Split(Path.GetInvalidFileNameChars()));

or similar solutions. These solutions however are not good, because they screw up the drive letter, and I obtain an output of:

C_\path\something\output_at_13_26_43.txt

I tried using Path.GetInvalidPathChars(), but it still doesn't work, because it doesn't include the : among the illegal characters, so it doesn't replace the ones in the filename.

So, after figuring that out, I tried doing this:

string dir = Path.GetDirectoryName(path);
string file = Path.GetFileName(path);
file = string.Join(replacement, file.Split(Path.GetInvalidFileNameChars()));
dir = string.Join(replacement, dir.Split(Path.GetInvalidPathChars()));

path = Path.Combine(dir, file);

but this is not good either, because the colons in the filename seem to interfere with the Path.GetFileName() logic: it only returns the piece after the last :, so I'm losing pieces of the path.

How do I do this "properly" without hacky solutions?

You can write a simple sanitizer that iterates over each character and knows when to expect the colon as a drive separator. This one accepts a single letter A-Z (in either case) at the very start of the string, followed directly by a ":", as a drive prefix. It also detects path separators and does not escape them. It does not skip whitespace at the beginning of the input string, so if your input data might come with leading whitespace, you will have to trim it first or modify the sanitizer accordingly:

using System.IO;
using System.Linq;
using System.Text;

enum ParserState {
    PossibleDriveLetter,          // at the very first character
    PossibleDriveLetterSeparator, // first character was a letter; a ':' may follow
    Path                          // anywhere else in the string
}

static string SanitizeFileName(string input) {
    // Fetch the invalid characters once instead of once per input character.
    char[] invalid = Path.GetInvalidFileNameChars();
    StringBuilder output = new StringBuilder(input.Length);
    ParserState state = ParserState.PossibleDriveLetter;
    foreach (char current in input) {
        if (((current >= 'a') && (current <= 'z')) || ((current >= 'A') && (current <= 'Z'))) {
            output.Append(current);
            // Only a letter in first position can be a drive letter.
            state = (state == ParserState.PossibleDriveLetter)
                ? ParserState.PossibleDriveLetterSeparator
                : ParserState.Path;
        }
        else if ((current == Path.DirectorySeparatorChar) ||
            (current == Path.AltDirectorySeparatorChar) ||
            ((current == ':') && (state == ParserState.PossibleDriveLetterSeparator)) ||
            !invalid.Contains(current)) {
            // Separators, the drive colon and ordinary characters pass through.
            output.Append(current);
            state = ParserState.Path;
        }
        else {
            // Anything else is invalid in a file name and gets replaced.
            output.Append('_');
            state = ParserState.Path;
        }
    }
    return output.ToString();
}

You can try it out here.

You definitely should make sure that you only receive valid filenames.

If you can't, and you're certain your directory names will be valid, you could split the path at the last backslash (assuming Windows) and reassemble the string:

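// requires: using System.IO;
public static string SanitizePath(string path)
{
    // Split just after the last backslash so the separator stays with the
    // directory part; otherwise it would be replaced too, because '\' is an
    // invalid file name character on Windows. The index is 0 when the path
    // contains no backslash, so the whole string is treated as the file name.
    var split = path.LastIndexOf('\\') + 1;

    var dir = path.Substring(0, split);
    var file = path.Substring(split);

    foreach (var invalid in Path.GetInvalidFileNameChars())
    {
        file = file.Replace(invalid, '_');
    }

    return dir + file;
}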
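
If the directory names can't be trusted either, one option is to skip the OS-dependent Path helpers and apply the Windows rules by hand. The following is only a minimal sketch under stated assumptions, not a definitive implementation: the class name PathSanitizer, the method name SanitizeWindowsPath and the hard-coded reserved set are made up for illustration. The set is the usual Windows list (< > : " | ? * plus control characters below 32). A leading drive prefix such as C: and both separator styles are kept; every other reserved character is replaced, in directory names as well as in the file name:

```csharp
using System;
using System.Linq;

public static class PathSanitizer
{
    // Assumption: Windows naming rules, hard-coded so the result does not
    // depend on the OS the code happens to run on. '\' and '/' are left out
    // on purpose because they are kept as path separators.
    private static readonly char[] Reserved = { '<', '>', ':', '"', '|', '?', '*' };

    private static bool IsInvalid(char c) =>
        c < 32 || Array.IndexOf(Reserved, c) >= 0;

    private static bool IsAsciiLetter(char c) =>
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');

    public static string SanitizeWindowsPath(string path, char replacement = '_')
    {
        if (string.IsNullOrEmpty(path)) return path;

        // Keep a leading "X:" drive prefix untouched.
        string prefix = "";
        if (path.Length >= 2 && IsAsciiLetter(path[0]) && path[1] == ':')
        {
            prefix = path.Substring(0, 2);
            path = path.Substring(2);
        }

        // Replace every reserved character in the remainder of the path.
        var cleaned = path.Select(c => IsInvalid(c) ? replacement : c);
        return prefix + new string(cleaned.ToArray());
    }
}
```

With the path from the question, PathSanitizer.SanitizeWindowsPath(@"C:\path\something\output_at_13:26:43.txt") gives C:\path\something\output_at_13_26_43.txt. Note that an input like a:b.txt is read as a drive-relative path and left alone, and that reserved device names (CON, NUL and so on) and trailing dots or spaces are not handled here.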
