
Regex.Replace with large strings and backslashes

I have written a utility which opens a text-based file, loads it as a string and performs a find / replace using Regex.Replace.

It does this on many files: the user points it at a folder and enters a find string and a replace string, and every file in the folder that contains the find string has it replaced.

This works great until the find string contains a backslash, at which point it fails.

Quite simply:

newFileContent = Regex.Replace(fileContent, @findString, @replaceString, RegexOptions.IgnoreCase);

fileContent = the contents of a text-based file; it will contain carriage returns

findString = user entered string to find

replaceString = user entered string to replace the found string with

I've tried adding some logic to counteract the backslash, as below, but this fails with "Illegal \ at end of pattern".

 if (culture.CompareInfo.IndexOf(findString, @"\") >= 0)
     {
      Regex.Replace(findString, @"\", @"\\");
     }

What do I need to do to successfully handle backslashes so they can be part of the find / replace logic?

Entire code block below.

//open reader
                using (var reader = new StreamReader(f,Encoding.Default)) 
                {
                    //read file
                    var fileContent = reader.ReadToEnd();

                    Globals.AppendTextToLine(string.Format(" replacing string"));

                    //culture find replace
                    var culture = new CultureInfo("en-gb", false);
                    //ensure nothing has changed
                    if (culture.CompareInfo.IndexOf(fileContent, findString, CompareOptions.IgnoreCase) >= 0)
                    {

                        //if find or replace string contains backslashes
                        if (culture.CompareInfo.IndexOf(findString, @"\") >= 0)
                        {
                            Regex.Replace(findString, @"\", @"\\");
                        }

                        //perform replace in new string
                        if (MainWindow.Main.chkIgnoreCase.IsChecked != null && (bool) MainWindow.Main.chkIgnoreCase.IsChecked)                        
                            newFileContent = Regex.Replace(fileContent, @findString, @replaceString, RegexOptions.IgnoreCase);
                        else
                            newFileContent = Regex.Replace(fileContent, @findString, @replaceString);

                        result[i].Result = true;
                        Globals.AppendTextToLine(string.Format(" success!"));
                    }
                    else
                    {
                        Globals.AppendTextToLine(string.Format(" failure!!"));
                        break;
                    }
                }

You should be using Regex.Escape when you pass the user-input into the Replace method.

The documentation describes it as follows: "Escapes a minimal set of characters (\, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space) by replacing them with their escape codes. This instructs the regular expression engine to interpret these characters literally rather than as metacharacters."

For example:

newFileContent = Regex.Replace(fileContent,
                               Regex.Escape(findString),
                               replaceString,
                               RegexOptions.IgnoreCase);
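A minimal sketch of how this could be wrapped up, assuming a hypothetical helper named ReplaceLiteral (it is not part of the original code). It also doubles any `$` in the replacement text: Regex.Escape only applies to patterns, while in a .NET replacement string `$` is the one character that starts a substitution such as `$1`.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralReplaceSketch
{
    // Hypothetical helper: treats both user-entered strings as literal text.
    public static string ReplaceLiteral(string content, string find, string replace, bool ignoreCase)
    {
        // Escape regex metacharacters (including backslashes) in the search text.
        string pattern = Regex.Escape(find);

        // In a replacement string only '$' is special, so double it to keep it literal.
        string replacement = replace.Replace("$", "$$");

        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;
        return Regex.Replace(content, pattern, replacement, options);
    }

    public static void Main()
    {
        string content = @"path=C:\Temp\logs";
        string result = ReplaceLiteral(content, @"c:\temp", @"D:\Data", true);
        Console.WriteLine(result); // path=D:\Data\logs
    }
}
```

Note that the return value has to be assigned: strings are immutable, so calling Regex.Replace without using its result (as in the backslash check in the question) changes nothing.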

Your fundamental issue is that you're letting your user enter an arbitrary regexp, and so it is interpreted as a regexp.

Either your goal is just to replace literal strings, in which case use String.Replace, or you want to allow the user to enter a regexp, in which case just accept that the user will need to escape their special characters with \.

Since \ is the regexp escape character (as well as the C# one, but you seem to be dealing with that with @), "\" on its own is an illegal regexp: there is nothing after it to escape.
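To see this failure in isolation, here is a small sketch (the class and method names are illustrative only) that tries to build a regex from a lone backslash and from an escaped one. It relies on the Regex constructor throwing ArgumentException, or a subclass of it, for a pattern it cannot parse.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LoneBackslashSketch
{
    // Returns true if the text can be parsed as a regex pattern.
    public static bool IsValidPattern(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Thrown (directly or as a subclass) when the pattern cannot be parsed.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(IsValidPattern(@"\"));  // False: nothing follows the escape character
        Console.WriteLine(IsValidPattern(@"\\")); // True: an escaped backslash
    }
}
```

A check like this could also be used to validate a user-entered pattern before running it over a whole folder of files.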

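If you really want a regexp that replaces every \ with \\, then it is:

findString = Regex.Replace(findString, @"\\", @"\\");

That is, the pattern \\ matches a single backslash, and the replacement is two literal backslashes, because a backslash is not an escape character in a .NET replacement string. The result must be assigned back, since strings are immutable.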

But you've still got [].?* etc to worry about.

My strong advice is a checkbox: the user selects whether they are entering a regexp or a string literal for replacement, and you then call Regex.Replace or String.Replace accordingly.
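A rough sketch of that switch, assuming a hypothetical helper named Apply whose useRegex parameter stands in for the checkbox. For the case-insensitive literal case it falls back to Regex.Replace with an escaped pattern and a MatchEvaluator, so that the replacement text is inserted as-is; that fallback is one possible choice, not the only one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FindReplaceSketch
{
    // Hypothetical helper; 'useRegex' stands in for the suggested checkbox.
    public static string Apply(string content, string find, string replace,
                               bool useRegex, bool ignoreCase)
    {
        RegexOptions options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;

        if (useRegex)
        {
            // The user supplies a real pattern and is responsible for escaping it.
            return Regex.Replace(content, find, replace, options);
        }

        if (!ignoreCase)
        {
            // Plain, case-sensitive literal replacement.
            return content.Replace(find, replace);
        }

        // Case-insensitive literal replacement: escape the pattern, and use a
        // MatchEvaluator so the replacement is inserted without '$' processing.
        return Regex.Replace(content, Regex.Escape(find), m => replace, options);
    }

    public static void Main()
    {
        // Literal mode: backslashes and '$' are taken as plain text.
        Console.WriteLine(Apply(@"C:\Temp\a.txt", @"c:\temp", @"D:\$1", false, true)); // D:\$1\a.txt

        // Regex mode: '.' matches any character.
        Console.WriteLine(Apply("a.b", ".", "x", true, false)); // xxx
    }
}
```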
