
Regex.Replace with large strings and backslashes

I have written a utility which opens a text-based file, loads it as a string and performs a find/replace using Regex.Replace.

It does this across many files: the user points it at a folder and enters a find string and a replace string, and every file in the folder that contains the find string has it replaced.

This works great until I try it with a backslash, at which point it falls over.

Quite simply:

newFileContent = Regex.Replace(fileContent, @findString, @replaceString, RegexOptions.IgnoreCase);

fileContent = the contents of a text-based file. It will contain carriage returns.

findString = user-entered string to find

replaceString = user-entered string to replace the found string with

I've tried adding some logic to counteract the backslash, as below, but this fails with "illegal \ at end of pattern".

 if (culture.CompareInfo.IndexOf(findString, @"\") >= 0)
     {
      Regex.Replace(findString, @"\", @"\\");
     }

What do I need to do to handle backslashes successfully, so they can be part of the find/replace logic?

Entire code block below.

//open reader
                using (var reader = new StreamReader(f,Encoding.Default)) 
                {
                    //read file
                    var fileContent = reader.ReadToEnd();

                    Globals.AppendTextToLine(string.Format(" replacing string"));

                    //culture find replace
                    var culture = new CultureInfo("en-gb", false);
                    //ensure nothing has changed
                    if (culture.CompareInfo.IndexOf(fileContent, findString, CompareOptions.IgnoreCase) >= 0)
                    {

                        //if find or replace string contains backslahes
                        if (culture.CompareInfo.IndexOf(findString, @"\") >= 0)
                        {
                            Regex.Replace(findString, @"\", @"\\");
                        }

                        //perform replace in new string
                        if (MainWindow.Main.chkIgnoreCase.IsChecked != null && (bool) MainWindow.Main.chkIgnoreCase.IsChecked)                        
                            newFileContent = Regex.Replace(fileContent, @findString, @replaceString, RegexOptions.IgnoreCase);
                        else
                            newFileContent = Regex.Replace(fileContent, @findString, @replaceString);

                        result[i].Result = true;
                        Globals.AppendTextToLine(string.Format(" success!"));
                    }
                    else
                    {
                        Globals.AppendTextToLine(string.Format(" failure!!"));
                        break;
                    }
                }

You should be using Regex.Escape when you pass the user input into the Replace method.

Escapes a minimal set of characters (\, *, +, ?, |, {, [, (, ), ^, $, ., #, and white space) by replacing them with their escape codes. This instructs the regular expression engine to interpret these characters literally rather than as metacharacters.

For example:

newFileContent = Regex.Replace(fileContent,
                               Regex.Escape(findString),
                               replaceString,
                               RegexOptions.IgnoreCase);
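One caveat the example above leaves open: Regex.Escape only covers the pattern. In .NET the replacement string has its own substitution syntax in which `$` is special (`$1`, `$&` and so on), while backslashes there are taken literally. The following is a minimal sketch, assuming the replace string should also be treated as plain text; `LiteralReplace` and the sample paths are hypothetical names for illustration, not part of the question's code.

```csharp
using System;
using System.Text.RegularExpressions;

static class LiteralReplaceDemo
{
    // Hypothetical helper: treats both the find and the replace string as literal text.
    public static string LiteralReplace(string input, string find, string replace, bool ignoreCase)
    {
        var options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;

        // Escape regex metacharacters (including backslashes) in the pattern.
        string pattern = Regex.Escape(find);

        // In a .NET replacement pattern, "$$" stands for a single literal "$".
        string replacement = replace.Replace("$", "$$");

        return Regex.Replace(input, pattern, replacement, options);
    }

    static void Main()
    {
        string content = @"path=C:\Temp\old.txt";
        string result = LiteralReplace(content, @"c:\temp\OLD.txt", @"D:\New\$1.txt", true);
        Console.WriteLine(result); // prints: path=D:\New\$1.txt
    }
}
```

Without the `$$` step, a replace string such as `$1` would be read as a group reference rather than as the characters the user typed.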

Your fundamental issue is that you're letting your user enter an arbitrary regexp and thus, well, it's interpreted as a regexp...

Either your goal is just to replace literal strings, in which case use String.Replace, or you want to allow the user to enter a regexp, in which case just accept that the user will need to \-escape their special characters.

Since \ is the regexp escape character (as well as the C# one, but you seem to be dealing with that with @), "\" on its own is an illegal regexp: there is nothing after it to escape.

If you really want a regexp to replace every \ with \\, then it is:

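findString = Regex.Replace(findString, @"\\", @"\\"); // the pattern \\ matches one backslash; the replacement is taken literally (only $ is special there), so @"\\" inserts two. The result must be assigned, because strings are immutable.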

But you've still got [].?* etc. to worry about.
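To make the point concrete, here is a small sketch showing that a find string ending in a backslash is rejected as a pattern, while the same string passed through Regex.Escape compiles and matches itself. `IsValidPattern` and the sample input are hypothetical, added only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Returns true if the text can be parsed as a .NET regex pattern.
    public static bool IsValidPattern(string pattern)
    {
        try
        {
            new Regex(pattern);
            return true;
        }
        catch (ArgumentException)
        {
            // Regex parse errors surface as ArgumentException (or a subclass of it).
            return false;
        }
    }

    static void Main()
    {
        // Hypothetical user-entered find string that ends in a backslash.
        string userInput = @"C:\temp\";

        Console.WriteLine(IsValidPattern(userInput));                         // False
        Console.WriteLine(IsValidPattern(Regex.Escape(userInput)));           // True
        Console.WriteLine(Regex.IsMatch(userInput, Regex.Escape(userInput))); // True
    }
}
```

Regex.Escape handles the backslash and the other metacharacters in one pass, so there is no need to patch up backslashes by hand.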

My strong advice is to add a checkbox so the user can select whether they are entering a regexp or a string literal, and then call String.Replace or Regex.Replace accordingly.
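The checkbox idea could be sketched roughly as follows. This is only an illustration under assumptions: `FindReplace.Apply` and its `treatAsRegex` flag are hypothetical names standing in for whatever the UI provides. The two-argument String.Replace is case-sensitive, so the sketch falls back to an escaped regex when a literal search must ignore case.

```csharp
using System;
using System.Text.RegularExpressions;

static class FindReplace
{
    // Hypothetical helper: treatAsRegex would come from the suggested checkbox.
    public static string Apply(string content, string find, string replace,
                               bool treatAsRegex, bool ignoreCase)
    {
        var options = ignoreCase ? RegexOptions.IgnoreCase : RegexOptions.None;

        if (treatAsRegex)
        {
            // The user has opted in to regex syntax and is responsible for escaping.
            return Regex.Replace(content, find, replace, options);
        }

        if (!ignoreCase)
        {
            // Plain literal, case-sensitive replacement; no regex involved.
            return content.Replace(find, replace);
        }

        // Literal but case-insensitive: escape the pattern and any "$" in the replacement.
        return Regex.Replace(content, Regex.Escape(find), replace.Replace("$", "$$"), options);
    }

    static void Main()
    {
        Console.WriteLine(Apply(@"dir=C:\Temp", @"C:\Temp", @"D:\Data", false, false)); // prints: dir=D:\Data
        Console.WriteLine(Apply(@"dir=c:\temp", @"C:\Temp", @"D:\Data", false, true));  // prints: dir=D:\Data
        Console.WriteLine(Apply("id=42", @"\d+", "N", true, false));                    // prints: id=N
    }
}
```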
