
C# - How to use Regex to replace NULL character?

I have the following string:

< \0\"\0E\0x\0t\0e\0n\0s\0i\0b\0i\0l\0i\0t\0y\0,\0v\0e\0r\0s\0i\0o\0n\0=\0\\\0\"\07\0.\00\0.\03\03\00\00\0.\00\0\\\0\"\0,\0p\0u\0b\0l\0i\0c\0K\0e\0y\0T\0o\0k\0e\0n\0=\0\\\0\"\0B\00\03\0F\05\0F\07\0F\01\01\0D\05\00\0A\03\0A\0\\\0\"\0,\0f\0i\0l\0e\0V\0e\0r\0s\0i\0o\0n\0=\0\\\0\"\07\0.\00\0.\09\04\06\06\0.\01\0\\\0\"\0,\0c\0u\0l\0t\0u\0r\0e\0=\0\\\0\"\0n\0e\0u\0t\0r\0a\0l\0\\\0\"\0\"\0=\0h\0e\0x\0(\07\0)\0:\07\08\0,\0\\\0"

In Notepad++ it looks something like this: [image]

I would like to replace all "NULL" instances using Regex, but I can't seem to get the correct search pattern. This is my code:

        FileInfo file = new FileInfo(path);
        string line;
        using (StreamReader reader = new StreamReader(file.FullName))
        {
            while ((line = reader.ReadLine()) != null)
            {
                Regex rgx = new Regex(@"^[\00|\0]");
                line = rgx.Replace(line, "");

                System.Console.WriteLine(line);
                CurrentLine++;
            }
        }

However, this does not appear to be replacing any text. What would the correct search pattern for this be?

You don't need a regex for this; you can use String.Replace():

line = line.Replace("\u0000", "");

The problem with your regex is the ^ character, which anchors the match to the start of the string, so the pattern only looks for a NULL character at the very beginning of each line. Remove it and your code will work just fine.
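To see the effect of the anchor, here is a minimal sketch. The class name `AnchorDemo`, its helper methods, and the sample string are made up for illustration and are not part of the question's code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnchorDemo
{
    // Pattern from the question: anchored to the start of the input.
    public static string StripAnchored(string s) => Regex.Replace(s, @"^[\00|\0]", "");

    // Same pattern without the ^ anchor: matches anywhere in the input.
    public static string StripAll(string s) => Regex.Replace(s, @"[\00|\0]", "");

    public static void Main()
    {
        // Hypothetical sample: letters separated by null characters.
        string sample = "E\0x\0t";

        // The string starts with 'E', so the anchored pattern matches nothing.
        Console.WriteLine(StripAnchored(sample).Length); // 5

        // Without the anchor both null characters are removed.
        Console.WriteLine(StripAll(sample));             // Ext
    }
}
```

Because the lines in the question start with a visible character rather than a null character, the anchored pattern never matches, which would explain why no text appeared to be replaced.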

If you only want to replace the null character, can't you just use String.Replace?

line = line.Replace("\0", "");

You already got your code working thanks to the accepted answer, and someone else already pointed out that regex was not needed for this in the first place. This answer is about improving your regex pattern.

There are several ways to specify special characters in .NET regex patterns, as shown in the documentation.

Here are the documented ways to specify a null character:

  • @"\00" - ASCII octal 0 (2 digits)
  • @"\000" - ASCII octal 0 (3 digits)
  • @"\x00" - ASCII hexadecimal 0
  • @"\u0000" - UTF-16 hexadecimal 0

These undocumented methods also seem to work, based on my testing:

  • @"\0" (regex testing tools like regex101.com flag it as a pattern error)
  • "\0" (mixing actual special characters into your pattern seems like bad practice to me)

So the full pattern in your code could have just been @"\x00" or one of the other options above.
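As a quick check that the documented escapes (two- and three-digit octal, `\x00`, and the four-digit `\u0000` form) behave the same, the sketch below runs each one over the same input. The class name `NullPatternDemo`, the `RemoveNulls` helper, and the sample string are assumptions made for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NullPatternDemo
{
    // Documented escapes for the null character in .NET regex patterns.
    public static readonly string[] Patterns = { @"\00", @"\000", @"\x00", @"\u0000" };

    // Removes every match of the given pattern from the input.
    public static string RemoveNulls(string input, string pattern) =>
        Regex.Replace(input, pattern, "");

    public static void Main()
    {
        // Hypothetical sample: "ver" with a null character after each letter.
        string sample = "v\0e\0r\0";

        foreach (string pattern in Patterns)
        {
            // Each pattern should leave only the visible letters.
            Console.WriteLine($"{pattern} -> {RemoveNulls(sample, pattern)}");
        }
    }
}
```

Note that the patterns are verbatim strings (`@"..."`), so the backslash escape is interpreted by the regex engine rather than by the C# compiler.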

Here is an explanation of your actual pattern: @"[\00|\0]". I removed the ^ since it's already been discussed.

  • [] is a character set, so it'll match any character inside the brackets
  • \00 is the null character
  • | is just a literal | character. Maybe you were trying to use it to mean "or", but it doesn't mean that when inside brackets.
  • \0 is the null character, again

So @"[\00|\0]" means "match one (null or | or null)."
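One practical consequence is that the character set also strips literal `|` characters from the input. The sketch below illustrates this; the class name `CharClassDemo`, its methods, and the sample string are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Character set from the question (without ^): matches null OR a literal '|'.
    public static string StripWithSet(string s) => Regex.Replace(s, @"[\00|\0]", "");

    // Plain null escape: matches only the null character.
    public static string StripNullOnly(string s) => Regex.Replace(s, @"\x00", "");

    public static void Main()
    {
        // Hypothetical sample containing both a '|' and a null character.
        string sample = "a|b\0c";

        Console.WriteLine(StripWithSet(sample));  // abc  (the '|' is removed too)
        Console.WriteLine(StripNullOnly(sample)); // a|bc (only the null is removed)
    }
}
```

If the input can legitimately contain `|`, the simpler `@"\x00"` pattern (or `String.Replace`) avoids deleting it by accident.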
