
Using regex to split pattern

I am new to using patterns in regular expressions. I have read a couple of articles on the Microsoft site and thought I understood them, but I ran into this scenario and do not know why it does not give the results I expected.

I would like to split MyCmd into a list of strings: print, a, +, b, ;. As far as I understand, a normal split does not keep the delimiters, so I tried using a regex with the pattern defined below. Basically, I want to split the string into a list or queue and keep the delimiters ;,+-*/{}[].

 string MyCmd = "print a+b;";
 private string MyDelim = @"\b[\s;,\+\-\*\/%=\<\>\(\)\{\}\[\]]\w+";
 myStuff = new Queue<string>(Regex.Split(MyCmd,MyDelim));

But so far, the code above does not yield the expected results.

What is wrong with my pattern?

I believe you can use:

var MyCmd = "print a+b;";
var MyDelim = @"([][\s;,+*/%=<>(){}-])";
var myStuff = Regex.Split(MyCmd,MyDelim).Where(p=> !string.IsNullOrWhiteSpace(p)).ToList();

Output: print, a, +, b, ;


Note that the regex ([][\s;,+*/%=<>(){}-]) is enclosed in (...): that capturing group makes sure the captured values (the delimiters) are also added to the resulting array.

You need the .Where(p=> !string.IsNullOrWhiteSpace(p)) to get rid of the empty and whitespace-only values that Regex.Split returns.
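To see what the capturing group and the filter each contribute, here is a minimal sketch. The class and method names (SplitDemo, RawSplit, Tokens) are made up for illustration, and the pattern is an assumed equivalent of the one above with the square brackets escaped, which some readers find easier to parse.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Assumed equivalent of the answer's pattern, with [ and ] escaped.
    // The outer (...) is the capturing group that keeps the delimiters.
    private const string Delimiters = @"([\[\]\s;,+*/%=<>(){}-])";

    // Raw Regex.Split output: text pieces and captured delimiters, in order.
    public static string[] RawSplit(string input) =>
        Regex.Split(input, Delimiters);

    // Same output with empty and whitespace-only pieces dropped.
    public static string[] Tokens(string input) =>
        RawSplit(input).Where(p => !string.IsNullOrWhiteSpace(p)).ToArray();

    public static void Main()
    {
        // Raw pieces: "print", " ", "a", "+", "b", ";", ""
        Console.WriteLine(string.Join("|", RawSplit("print a+b;")));

        // Filtered tokens: print|a|+|b|;
        Console.WriteLine(string.Join("|", Tokens("print a+b;")));
    }
}
```

The trailing empty string in the raw output comes from the final ; being a delimiter with nothing after it, which is why the filter is needed.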

I removed the excessive escaping in your regex so that it looks "lean and mean".

The reason your regex does not work is that @"\b[\s;,\+\-\*\/%=\<\>\(\)\{\}\[\]]\w+" matches a space or symbol from the character class right after a \b word boundary (which requires a word character to appear before it) and then matches one or more word characters. So each match swallows the word that follows the delimiter, and since there is no capturing group, all the matched text disappears from the resulting array.
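The behaviour described above can be checked with a short sketch that runs the question's pattern unchanged. The class and method names (OriginalPatternDemo, MatchedText, SplitResult) are hypothetical helpers, not part of the original code.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class OriginalPatternDemo
{
    // The pattern from the question, unchanged.
    private const string Original = @"\b[\s;,\+\-\*\/%=\<\>\(\)\{\}\[\]]\w+";

    // The substrings the pattern actually matches (and Split therefore removes).
    public static string[] MatchedText(string input) =>
        Regex.Matches(input, Original).Cast<Match>().Select(m => m.Value).ToArray();

    // What Regex.Split leaves behind once those matches are cut out.
    public static string[] SplitResult(string input) =>
        Regex.Split(input, Original);

    public static void Main()
    {
        // Matches: [ a] and [+b]; the final ";" is not matched because no word follows it.
        Console.WriteLine(string.Join("", MatchedText("print a+b;").Select(m => "[" + m + "]")));

        // Split pieces: [print], [] and [;]
        Console.WriteLine(string.Join("", SplitResult("print a+b;").Select(p => "[" + p + "]")));
    }
}
```

Both a and b are consumed by the trailing \w+, so they never reach the result, and the empty piece appears because the two matches sit directly next to each other.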

Regex.Split is the way to go. If you prefer Regex.Matches, you can do it like this:

string MyCmd = @"print  a+b;";
string MyDelim = @"([^;,+\-*/{}\[\]\)\(\s]+|[;,+\-*/{}\[\]\)\(])";
var myStuff = Regex.Matches(MyCmd, MyDelim).Cast<Match>().Select(m => m.Value).ToList();
