I'm writing a C# program to update the starting comment (commonly the license header) of Java source files. The following snippet does the job.
foreach (string r in allfiles)
{
    // GC.Collect();
    string thefile = System.IO.File.ReadAllText(r);
    var pattern = @"/\*(?s:.*?)\*/[\s\S]*?package";
    Regex regex1 = new Regex(pattern /*,RegexOptions.Compiled */);
    var replaced = regex1.Replace(thefile, newheader + "package");
    System.IO.File.WriteAllText(r, replaced);
}
The problem is that after a few hundred source files have been processed, the process hangs at .Replace
It's not a matter of garbage collection, since forcing it doesn't solve the issue. It also doesn't matter whether RegexOptions.Compiled is used or not.
I'm quite sure it is caused by an issue in the pattern, since the hang appears on certain files; if I remove them from processing, the job continues to the end of the one thousand source files. But if I process these files alone, it works, and it also works if I use an online testing tool such as http://regexstorm.net/tester or https://www.myregextester.com/index.php
Please let me know if there is a way to better optimize the search pattern for finding the first Java comment in a file.
Thank you in advance.
Your regex contains two bottlenecks, both related to lazy dot matching: (?s:.*?) and [\s\S]*? (. in singleline mode and [\s\S] are synonyms). The backtracking buffer can quickly be overrun when such a regex is run against big files.
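To find out which files trigger the excessive backtracking instead of letting the process hang, one option is to give the regex a match timeout. The following is only a sketch: the class name RegexGuard, the method name TryReplace, and the timeout value you pass in are illustrative assumptions, not part of the original code.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexGuard
{
    // Hypothetical helper: runs Regex.Replace with a match timeout.
    // Returns false (and hands back the input unchanged) if the engine
    // gives up because the timeout elapsed.
    public static bool TryReplace(string input, string pattern, string replacement,
                                  TimeSpan timeout, out string result)
    {
        var regex = new Regex(pattern, RegexOptions.None, timeout);
        try
        {
            result = regex.Replace(input, replacement);
            return true;
        }
        catch (RegexMatchTimeoutException)
        {
            // The pattern backtracked for longer than the allowed time.
            result = input;
            return false;
        }
    }
}
```

Files for which TryReplace returns false can be logged and inspected separately while the rest of the batch continues.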
The common technique is to unroll/unwrap the construct with the negated character class and a quantified group.
You may use
@"/\*[^*]*(?:\*(?!/)[^*]*)*\*/\s*package"
The regex breakdown:
/\* - literal /*
[^*]* - 0 or more characters other than *
(?:\*(?!/)[^*]*)* - the unrolled variant of (?s:.*?), matching 0 or more sequences of...
    \*(?!/) - a * symbol not followed by a /
    [^*]* - 0 or more symbols other than *
\*/ - a literal sequence of */
\s* - 0 or more whitespace characters
package - literal letter sequence package
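As a minimal sketch of how the unrolled pattern could be plugged into the loop from the question: the class name LicenseHeader, the method name ReplaceHeader, and the 2-second timeout are assumptions made for illustration. The regex is built once instead of once per file, and the count argument of 1 limits the replacement to the first match, since only the leading comment should change.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LicenseHeader
{
    // Unrolled pattern from the answer: a /* ... */ comment, optional
    // whitespace, then the word "package". Built once and reused.
    private static readonly Regex HeaderPattern = new Regex(
        @"/\*[^*]*(?:\*(?!/)[^*]*)*\*/\s*package",
        RegexOptions.None,
        TimeSpan.FromSeconds(2)); // assumed safety timeout, adjust as needed

    // Hypothetical helper: replaces the first header comment in a Java source text.
    public static string ReplaceHeader(string source, string newHeader)
    {
        // A MatchEvaluator is used so that any '$' inside newHeader is
        // inserted literally rather than read as a substitution token.
        return HeaderPattern.Replace(source, m => newHeader + "package", 1);
    }
}
```

Inside the original loop this would be called as LicenseHeader.ReplaceHeader(thefile, newheader). As in the question's code, the whitespace between the old comment and package is consumed by the match, so newheader should end with a line break.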