
Remove invalid characters from string using Regex in C#

I found several posts on this topic, but the solutions mentioned there don't work in my case.

Consider the following code:

    static void Main(string[] args)
    {
        string rgs = "^[ -~]*(?:\r?\n[ -~]*)*$";

        string TestStrNoMatch = "One\tTwo\r\nThree Ö";
        string TestStrMatch = "OneTwo\r\nThree ";

        Regex rgx = new Regex(rgs);

        bool Match = rgx.IsMatch(TestStrNoMatch); // false

        Match = rgx.IsMatch(TestStrMatch); // true

        string result = Regex.Replace(TestStrNoMatch, rgs, "");

        // result is the same as TestStrNoMatch
    }

The expected result is for \t and Ö to be removed, but this is not happening. The value of result is exactly the same as TestStrNoMatch.

CLARIFICATION : The regex I use in my example only allows characters between space and ~ (English letters, numbers and some special characters) and new line in Windows and Unix format. I want to remove everything else.

Your regex needs to match the characters you want to remove in order for Regex.Replace to work. Your pattern is anchored with ^ and $ and describes an entire valid string, so it does not match TestStrNoMatch at all, and nothing gets replaced. It's unclear what specifically you want to remove, but here's an example:

The pattern (\t)|(Ö) matches the tab and Ö characters, so

    string sample = "ab\tcefÖ";
    string pattern = "(\\t)|(Ö)";
    string result = Regex.Replace(sample, pattern, "");
    System.Console.WriteLine("SAMPLE : " + sample);
    System.Console.WriteLine("RESULT : " + result);

Results in

    SAMPLE : ab     cefÖ
    RESULT : abcef

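If you explain exactly what you want to remove, I can point you towards a more representative regex pattern. E.g., to remove every character outside the range from space to ~ (tabs included, since a tab falls outside that range), you could use [^ -~] . Note that this also strips \r and \n; to keep line breaks, use [^ -~\r\n] instead.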
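As a minimal sketch of the negated character class applied to the test string from the question, assuming line breaks should be kept as the clarification states. The class name RegexCleanup and the helper RemoveInvalid are hypothetical names used only for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexCleanup
{
    // Hypothetical helper: deletes every character that is NOT in the
    // range space..~ and is not a carriage return or line feed.
    public static string RemoveInvalid(string input)
    {
        return Regex.Replace(input, @"[^ -~\r\n]", "");
    }

    public static void Main()
    {
        string testStrNoMatch = "One\tTwo\r\nThree Ö";
        string cleaned = RemoveInvalid(testStrNoMatch);

        // The tab and Ö are removed; the \r\n pair is preserved.
        Console.WriteLine(cleaned == "OneTwo\r\nThree "); // True
    }
}
```

The key difference from the pattern in the question is that this one matches the individual unwanted characters rather than describing the whole valid string, so Regex.Replace has something to remove.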

Why not just do this instead of using Regex? Better readability in my opinion.

    // Requires: using System.Linq;
    string text = "abcdef";
    char[] invalidChars = { 'a', 'b', 'c' }; // Your invalid characters here

    // Only rebuild the string if it actually contains an invalid character
    if (text.IndexOfAny(invalidChars) != -1)
    {
        text = new String(text.Where(c => !invalidChars.Contains(c)).ToArray());
    }

output: "def"
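The same idea can be adapted to the question's rule, where the set of invalid characters is too large to list and it is easier to state which characters are allowed. This is only a sketch: the names LinqCleanup and KeepAllowed are hypothetical, and the allowed set (space to ~ plus \r and \n) is taken from the clarification in the question:

```csharp
using System;
using System.Linq;

public static class LinqCleanup
{
    // Hypothetical helper: keeps characters from space to ~ plus CR and LF,
    // and drops everything else.
    public static string KeepAllowed(string input)
    {
        return new string(input
            .Where(c => (c >= ' ' && c <= '~') || c == '\r' || c == '\n')
            .ToArray());
    }

    public static void Main()
    {
        string cleaned = KeepAllowed("One\tTwo\r\nThree Ö");
        Console.WriteLine(cleaned == "OneTwo\r\nThree "); // True
    }
}
```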
