C# regexp \w with +, * doesn't seem to work properly

I have the following piece of code:

String source = "There will will be";
Regex r = new Regex(@"There \w+ be");
Console.WriteLine(r.Match(source).Value);

It prints nothing. If the source is "There will be" instead, I do see the output. Could anyone explain this to me?

And an extension to the question: how do I create a regex that matches 1 or 2 words? (That's just an example; I'm writing a kind of parser and need to create my own wildcard that behaves that way.) I've already tried a few combinations, but everything fails. One of my attempts:

@"\w+\s{1,2}"

I think it's wrong because {1,2} tells the regexp to repeat the whitespace 1 or 2 times, not the whole \w+\s. Do you know how to fix it, or how to do it in a different way?

The reason is that the pattern cannot match this input. \w matches any alphanumeric character or the underscore (so essentially A through Z, 0 through 9 and _). Spaces, however, belong to their own class (represented by \s), so \w+ stops at the space between the two words.

To fix this, you can make the regular expression match both kinds of character by using a character class, written with [], from which the matching algorithm can pick any element:

There [\w\s]+? be

Note that I also added a ? to make this a non-greedy match, which tries to match as small a part of the text as possible (otherwise the match could run past the first "be" and stop at a later one).
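A small sketch to check the behaviour described above. The class and method names (GreedyVsLazyDemo, FirstMatch) and the longer sample sentence are made up for illustration; only Regex.Match and the patterns come from the discussion.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyVsLazyDemo
{
    // Returns the matched text; Match.Value is an empty string when nothing matches.
    public static string FirstMatch(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        const string twoWords = "There will will be";
        // Hypothetical longer input containing a second " be" further on.
        const string longer = "There will will be or not to be";

        // \w+ cannot cross the space between the two words: prints an empty line.
        Console.WriteLine(FirstMatch(@"There \w+ be", twoWords));

        // Lazy character class stops at the first " be": prints "There will will be".
        Console.WriteLine(FirstMatch(@"There [\w\s]+? be", longer));

        // Greedy character class runs on to the last " be": prints the whole sentence.
        Console.WriteLine(FirstMatch(@"There [\w\s]+ be", longer));
    }
}
```

On an input with only one "be" the greedy and lazy versions return the same text; the difference only shows up when a later "be" exists.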


As for the follow-up question, just use a non-capturing group, so that the {1,2} quantifier applies to the whole \w+\s sequence (this also saves some processing time and memory compared to a capturing group):

(?:\w+\s){1,2}
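One way to try this out is to drop the group into the original pattern in place of \w+ followed by the space. This is only a sketch: the combined pattern and the names BoundedWordsDemo and IsMatch are assumptions for illustration, not something stated in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundedWordsDemo
{
    // Assumed combination: "There ", then one or two "word + whitespace" units, then "be".
    private static readonly Regex Pattern = new Regex(@"There (?:\w+\s){1,2}be");

    public static bool IsMatch(string input)
    {
        return Pattern.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsMatch("There be"));                // False: zero words in between
        Console.WriteLine(IsMatch("There will be"));           // True: one word
        Console.WriteLine(IsMatch("There will will be"));      // True: two words
        Console.WriteLine(IsMatch("There will will will be")); // False: three words
    }
}
```

Because each repetition of the group consumes its own trailing whitespace, there is no separate space before "be" in the combined pattern.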

\w matches any word character (a-z, A-Z, 0-9 and underscore). "There will will be" would require \w+ to match "will will". \w can't match the space, thus the regex doesn't match.

That's because a space is not matched by \w+. Try using either of the following:

@"There \w+ \w+ be"

or

@"There [\w\s]+ be"

\w matches all alphanumeric characters and the underscore. In your example it would have to match "will will", which contains a space, and therefore it does not match. Your expression will, however, match "There will be" with only one "will".

\w matches word characters, so the space between the two "will" strings prevents the match. You might want to replace it with @"There \w+(?:\s+\w+)* be" instead.
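A quick sketch of how that last pattern behaves. The names AnyWordsDemo and FirstMatch are hypothetical, and the {0,1} variant is my own adaptation to the "1 or 2 words" follow-up, not part of the suggestion above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnyWordsDemo
{
    // One word, then any number of further "whitespace + word" units.
    public const string AnyCount = @"There \w+(?:\s+\w+)* be";

    // Assumed variant: at most one further word, i.e. one or two words in total.
    public const string OneOrTwo = @"There \w+(?:\s+\w+){0,1} be";

    // Returns the matched text, or an empty string when nothing matches.
    public static string FirstMatch(string pattern, string input)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // Prints "There will will be".
        Console.WriteLine(FirstMatch(AnyCount, "There will will be"));

        // Prints "There will will will be": the * places no upper bound on the word count.
        Console.WriteLine(FirstMatch(AnyCount, "There will will will be"));

        // Prints an empty line: three words exceed the {0,1} bound.
        Console.WriteLine(FirstMatch(OneOrTwo, "There will will will be"));
    }
}
```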
