How to improve the performance of the regular expression?

I have a problem. I need to replace some text inside a string:

var str = "<tr class=\"fieldType\"><td><a href=\"#\" onclick=\"javascript:removeNestedForm(this,&#39;tr.fieldType&#39;,&#39;.mark-for-delete&#39;,false);return false;\">Удалить</a><input data-val=\"true\" data-val-number=\"The field Id must be a number.\" data-val-required=\"The Id field is required.\" id=\"FieldTypes[0]_635503397304941429__Id\" name=\"nestedObject.Id\" type=\"hidden\" value=\"0\" /><input data-val=\"true\" data-val-number=\"The field DocTypeId must be a number.\" data-val-required=\"The DocTypeId field is required.\" id=\"FieldTypes[0]_635503397304941429__DocTypeId\" name=\"nestedObject.DocTypeId\" type=\"hidden\" value=\"0\" /><input class=\"mark-for-delete\" data-val=\"true\" data-val-required=\"The IsDel field is required.\" id=\"FieldTypes[0]_635503397304941429__IsDel\" name=\"nestedObject.IsDel\" type=\"hidden\" value=\"False\" /><input data-val=\"true\" data-val-required=\"The CanDel field is required.\" id=\"FieldTypes[0]_635503397304941429__CanDel\" name=\"nestedObject.CanDel\" type=\"hidden\" value=\"True\" />    </td>    <td>        <select id=\"FieldTypes[0]_635503397304941429__Convertion\" name=\"nestedObject.Convertion\" style=\"width:99%;\"><option value=\"int\">int</option><option value=\"string\">string</option></select></td><td><input data-val=\"true\" data-val-required=\"Название не может быть пустым\" id=\"FieldTypes[0]_635503397304941429__Name\" name=\"nestedObject.Name\" style=\"width:99%;\" type=\"text\" value=\"\"></td><td><input id=\"FieldTypes[0]_635503397304941429__Description\" name=\"nestedObject.Description\" style=\"width:99%;\" type=\"text\" value=\"\" /></td></tr>";

I use this method:

private static string ReplaceAttribute(string source, string name, string found, string replaced)
{
    string pattern = string.Format(@"({0}=[\\""]*(\w*[._\[\]]?)*)({1})", name, found);
    string replacement = "$1" + replaced;

    var theRegex = new Regex(pattern, RegexOptions.Compiled | RegexOptions.Singleline);
    var result = theRegex.Replace(source, replacement);

    return result;
}

My code takes a long time to run:

strPartial = ReplaceAttribute(strPartial, "id", propertyNameFake, collectionProperty + "_" + ticks + "_");
strPartial = ReplaceAttribute(strPartial, "name", propertyNameFake, collectionProperty + "[" + ticks + "]");
strPartial = ReplaceAttribute(strPartial, "data-valmsg-for", propertyNameFake, collectionProperty + "[" + ticks + "]");

How can I improve the performance of the regular expression? Thanks.

You are looking for backslashes and quotation marks, but there are no backslashes before the quotation marks in the actual string; they exist only in the C# string literal. Just look for an optional quotation mark, i.e. ""? instead of [\\""]* . (Note also that \\ in an @-delimited string stays \\ in the string; it does not become \ .)
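To make the escaping point concrete, here is a small sketch that prints the characters the regex engine actually receives from each verbatim literal. The class and method names are made up for illustration.

```csharp
using System;

public static class VerbatimStringDemo
{
    // In a verbatim (@"...") literal, "" stands for one quotation mark and
    // backslashes are not escape characters, so they are kept as written.
    public static string QuoteClass() => @"[\\""]*";

    // The simpler alternative: a single optional quotation mark.
    public static string OptionalQuote() => @"""?";

    public static void Main()
    {
        Console.WriteLine(QuoteClass());        // the pattern text seen by the regex engine
        Console.WriteLine(QuoteClass().Length); // number of characters in that pattern
        Console.WriteLine(OptionalQuote());
    }
}
```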

But here comes the real speedup: you have conditional (optional) parts nested inside each other, i.e. optional alphanumerics followed by an optional separator, repeated zero or more times: (\w*[._\[\]]?)* . Instead you should just use a character set with those characters: [\w\._\[\]]* .
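Putting both suggestions together, the method from the question might look like the sketch below. It is not the exact code that was benchmarked. It additionally assumes that name and found are literal text (so they are passed through Regex.Escape) and uses a MatchEvaluator instead of a "$1" replacement string; both choices go beyond the original method, and the class name and sample HTML are made up.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AttributeReplacer
{
    // Sketch of ReplaceAttribute using an optional quote and a flat character class.
    // Assumption: name and found are literal text, so both are escaped.
    public static string ReplaceAttribute(string source, string name, string found, string replaced)
    {
        string pattern = string.Format(
            @"({0}=""?[\w._\[\]]*)({1})",
            Regex.Escape(name),
            Regex.Escape(found));

        // The evaluator keeps group 1 and appends the new text verbatim, so a "$"
        // or a leading digit in 'replaced' is not read as a substitution token.
        return Regex.Replace(source, pattern, m => m.Groups[1].Value + replaced);
    }

    public static void Main()
    {
        // Shortened, hypothetical input in the style of the question's markup.
        string html = "<input id=\"nestedObject_Id\" name=\"nestedObject.Id\" />";
        html = ReplaceAttribute(html, "id", "nestedObject", "FieldTypes_42_");
        html = ReplaceAttribute(html, "name", "nestedObject", "FieldTypes[42]");
        Console.WriteLine(html);
    }
}
```

The static Regex.Replace overload is used here because the pattern changes on every call; whether RegexOptions.Compiled pays off for such one-off patterns is worth measuring rather than assuming.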

When the string is matched, the conditional (optional) parts start by matching as much as possible, then backtrack to find the longest match for which the rest of the pattern also matches. With nested conditionals there will be a huge amount of backtracking.
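The nested form and the flat set accept exactly the same runs of characters (any mix of word characters and the separators), so only the amount of backtracking differs. A rough way to observe this is sketched below: it times both forms on an attribute value that does not contain the searched text, with a match timeout as a safety net. The names, the sample input, and the two-second timeout are assumptions for illustration, and actual timings depend on the .NET version.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class BacktrackingDemo
{
    // Nested optional parts, as in the question's pattern.
    public const string Nested = @"id=""?(\w*[._\[\]]?)*nestedObject";
    // Flat character set, as suggested in the answer.
    public const string Flat = @"id=""?[\w._\[\]]*nestedObject";

    // Returns true/false for a completed match attempt,
    // or null if the engine gave up after the timeout.
    public static bool? TryIsMatch(string pattern, string input, TimeSpan timeout)
    {
        try
        {
            return new Regex(pattern, RegexOptions.None, timeout).IsMatch(input);
        }
        catch (RegexMatchTimeoutException)
        {
            return null;
        }
    }

    public static void Main()
    {
        // The value does not contain "nestedObject", so the engine has to
        // exhaust every way of splitting the run before it can report failure.
        string input = "id=\"FieldTypes[0]_635503397304941429__DocTypeId\"";

        foreach (string pattern in new[] { Nested, Flat })
        {
            Stopwatch sw = Stopwatch.StartNew();
            bool? matched = TryIsMatch(pattern, input, TimeSpan.FromSeconds(2));
            sw.Stop();
            Console.WriteLine("{0}: {1} after {2} ms",
                pattern, matched?.ToString() ?? "timeout", sw.ElapsedMilliseconds);
        }
    }
}
```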

When I tested the changes with your example string, the code ran about 600 times faster (11 ms instead of 6240 ms).
