
RegEx for VB.net

I have a txt file with the following content:

$NETS  
P3V3_AUX_LGATE;  PQ6.8 PU37.2   
U335_PIN1;  R3328.1 U335.1  
$END  

It needs to be updated to this format and saved to another txt file:

$NETS  
'P3V3_AUX_LGATE';  PQ6.8 PU37.2  
'U335_PIN1';  R3328.1 U335.1  
$END

NOTE: the file may contain up to 10,000 lines.

My current solution is to read the txt file line by line, detect the ";" and the newline character, and make the changes.

Right now I have a variable that holds ALL the lines. Is there another way, something like a replace via RegEx, to make the changes without looping through each line, so that I can readily print the result?

As a follow-up question, which one is more efficient?

You could probably find all the matches using something like \w+; but I don't know how you'd do a replace on that using Regex.Replace to add the ' characters while keeping the original match.
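For what it's worth, Regex.Replace can keep the original text: in a .NET replacement pattern, ${1} refers to the first capture group. A minimal sketch, assuming the net name is always the run of word characters directly before the semicolon (the module name and sample line are only for illustration):

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuoteDemo
    Sub Main()
        Dim line As String = "P3V3_AUX_LGATE;  PQ6.8 PU37.2"
        ' (\w+) captures the name; ${1} re-inserts it between the quotes.
        Dim quoted As String = Regex.Replace(line, "(\w+);", "'${1}';")
        Console.WriteLine(quoted)
    End Sub
End Module
```

The text after the semicolon is not part of the match, so it is left untouched.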

However, if you already have it in one variable, you don't have to read the file again. Either you could make your code find all the ; characters and then find the previous newline for each, or you could use String.Split on newlines to split the variable you've already got into lines. If you want to get it back into one variable, you can just use String.Join.

Personally I'd normally use the String.Split method (and String.Join if needed), since I think that makes the code easy to read.
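The String.Split / String.Join route could look roughly like the sketch below. It assumes the text uses CRLF line endings and that only the part before the first ";" on a line needs quoting; QuoteName is a hypothetical helper, not something from the question.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module SplitJoinDemo
    ' Quotes the text before the first ";" on a line.
    ' Lines without ";" (or starting with it) are returned unchanged.
    Function QuoteName(line As String) As String
        Dim pos As Integer = line.IndexOf(";"c)
        If pos <= 0 Then Return line
        Return "'" & line.Substring(0, pos) & "'" & line.Substring(pos)
    End Function

    Sub Main()
        ' Stand-in for the variable that already holds all the lines.
        Dim allText As String = "$NETS" & vbCrLf &
            "P3V3_AUX_LGATE;  PQ6.8 PU37.2" & vbCrLf &
            "U335_PIN1;  R3328.1 U335.1" & vbCrLf &
            "$END"

        ' Split into lines, fix each line, then join back into one string.
        Dim lines As String() = allText.Split({vbCrLf}, StringSplitOptions.None)
        For i As Integer = 0 To lines.Length - 1
            lines(i) = QuoteName(lines(i))
        Next
        Dim result As String = String.Join(vbCrLf, lines)

        Console.WriteLine(result)
    End Sub
End Module
```

This still loops over the lines, but it works on the string already in memory rather than re-reading the file.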

Try

ResultString = Regex.Replace(SubjectString, "^([^;\r\n]+);", "'$1';", RegexOptions.Multiline)

on your multiline string.

This will find any string (length one or more) at the start of a line up to the first semicolon, if there is one, and replace it with its quoted equivalent.

It should be more efficient than looping through the string line by line as you're doing now, but if you're in doubt, you'd have to profile it.
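A rough way to try the replacement end to end and time it is sketched below. The file names are placeholders, and a single Stopwatch run is only a crude measurement, not a proper profile.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.IO
Imports System.Text.RegularExpressions

Module RegexFileDemo
    Sub Main()
        ' Hypothetical file names; substitute your own paths.
        Dim subjectString As String = File.ReadAllText("nets_in.txt")

        Dim timer As Stopwatch = Stopwatch.StartNew()
        ' Same call as above: quote everything from line start up to the first ";".
        Dim resultString As String = Regex.Replace(subjectString, "^([^;\r\n]+);", "'$1';", RegexOptions.Multiline)
        timer.Stop()

        File.WriteAllText("nets_out.txt", resultString)
        Console.WriteLine("Regex.Replace took " & timer.ElapsedMilliseconds & " ms")
    End Sub
End Module
```

Timing the line-by-line version the same way on the same input gives a like-for-like comparison.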

I would say yes, this can be done with regular expressions. Make sure you have the "multiline" option turned on, and craft your regular expression using some capture groups to ease the work.

I can, however, say this will NOT be the optimal approach. Given the number of lines you could be processing, it seems smarter resource-wise to use a streaming approach instead of the in-memory approach.
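A streaming version might look like the following sketch in VB.NET. It assumes placeholder file names and that a line needs quoting only when it contains a ";" after at least one character; File.ReadLines yields one line at a time, so the whole file is never held in memory.

```vbnet
Imports System.IO

Module StreamingDemo
    Sub Main()
        ' Hypothetical file names; substitute your own paths.
        Using writer As New StreamWriter("nets_out.txt")
            For Each line As String In File.ReadLines("nets_in.txt")
                Dim output As String = line
                Dim pos As Integer = line.IndexOf(";"c)
                If pos > 0 Then
                    ' Wrap the text before the first ";" in single quotes.
                    output = "'" & line.Substring(0, pos) & "'" & line.Substring(pos)
                End If
                writer.WriteLine(output)
            Next
        End Using
    End Sub
End Module
```

Lines such as $NETS and $END contain no ";" and are copied through unchanged.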

Taking the Regex approach (this took 15 minutes, so please don't think it is an optimal solution; it just proves it would work):

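    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        // Sample input from the question; real code would read this from the file.
        private const string ExampleFileContent =
            "$NETS\r\nP3V3_AUX_LGATE;  PQ6.8 PU37.2\r\nU335_PIN1;  R3328.1 U335.1\r\n$END\r\n";

        // Proof of concept: hard-wired to exactly two entries and CRLF line endings.
        private static readonly Regex matcher = new Regex(@"^\$NETS\r\n(?<entrytitle>.[^;]*);\s*(?<entryrest>.*)\r\n(?<entrytitle2>.[^;]*);\s*(?<entryrest2>.*)\r\n\$END\r\n", RegexOptions.Compiled | RegexOptions.Multiline);

        static void Main(string[] args)
        {
            string newString = matcher.Replace(ExampleFileContent, new MatchEvaluator(evaluator));
            Console.Write(newString);
        }

        // Rebuilds the block with each entry title wrapped in single quotes.
        static string evaluator(Match m)
        {
            return String.Format("$NETS\r\n'{0}'; {1}\r\n'{2}'; {3}\r\n$END\r\n",
                                  m.Groups["entrytitle"].Value, m.Groups["entryrest"].Value,
                                  m.Groups["entrytitle2"].Value, m.Groups["entryrest2"].Value);
        }
    }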

Hope this helps.
