C# - Regex Match whole words
I need to match all the whole words containing a given string.
string s = @"ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";
Regex r = new Regex("(?<TM>[!\..]*TEST.*)", ...);
MatchCollection mc = r.Matches(s);
I need the result to be:
MYTESTING
YOUTESTED
TESTING
But I get:
TESTING
TESTED
.TESTING
How do I achieve this with regular expressions?
Edit: Extended sample string.
If you are looking for all words that include 'TEST', you should use
@"(?<TM>\w*TEST\w*)"
\w matches word characters. With RegexOptions.ECMAScript it is short for [A-Za-z0-9_]; by default .NET also matches Unicode letters and digits.
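To check what \w actually covers in .NET, here is a minimal sketch. The class and method names (WordClassDemo, IsAllWordChars) are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WordClassDemo
{
    // Hypothetical helper: true when the whole input consists only of \w characters
    // under the given options.
    public static bool IsAllWordChars(string input, RegexOptions options)
    {
        return Regex.IsMatch(input, @"^\w+$", options);
    }

    public static void Main()
    {
        // ASCII letters, a digit and an underscore are word characters in both modes.
        Console.WriteLine(IsAllWordChars("TEST_1", RegexOptions.None));          // True
        // \u00C9 is an accented capital E: a word character by default in .NET ...
        Console.WriteLine(IsAllWordChars("T\u00C9ST", RegexOptions.None));       // True
        // ... but not when \w is restricted to [A-Za-z0-9_] via ECMAScript mode.
        Console.WriteLine(IsAllWordChars("T\u00C9ST", RegexOptions.ECMAScript)); // False
    }
}
```

If the input is known to be plain ASCII, as in the question, the two modes behave the same.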
Keep it simple: why not try \w*TEST\w* as the match pattern?
I get the results you are expecting with the following:
string s = @"ABC.MYTESTING
XYZ.YOUTESTED
ANY.TESTING";
var m = Regex.Matches(s, @"(\w*TEST\w*)", RegexOptions.IgnoreCase);
Try using \b. It's the regex anchor for a word boundary. If you wanted to match both words you could use:
/\b[a-z]+\b/i
BTW, .NET doesn't need the surrounding /, and the i is just a case-insensitive match flag.
.NET alternative:
var re = new Regex(@"\b[a-z]+\b", RegexOptions.IgnoreCase);
I think you can achieve it using groups.
string s = @"ABC.TESTING
XYZ.TESTED";
Regex r = new Regex(@"(?<TM>[!\..]*(?<test>TEST.*))", RegexOptions.Multiline);
var mc = r.Matches(s);
foreach (Match match in mc)
{
    Console.WriteLine(match.Groups["test"]);
}
Works exactly like you want.
BTW, your regular expression pattern should be a verbatim string (@"")
Regex r = new Regex(@"(?<TM>[^.]*TEST.*)", RegexOptions.IgnoreCase);
First, as @manojlds said, you should use verbatim strings for regexes whenever possible. Otherwise you'll have to use two backslashes in most of your regex escape sequences, not just one (e.g. [!\\..]*).
Second, if you want to match anything but a dot, that part of the regex should be [^.]*. ^ is the metacharacter that inverts the character class, not !, and . has no special meaning in that context, so it doesn't need to be escaped. But you should probably use \w* instead, or even [A-Z]*, depending on what exactly you mean by "word". [!\..] matches ! or . only.
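The difference between these character classes can be seen by running each pattern against lines from the question. This is only a sketch; the names CharClassDemo and FirstMatch are hypothetical helpers, not something from the original posts.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Hypothetical helper: returns the text of the first match, or "" if there is none.
    public static string FirstMatch(string input, string pattern)
    {
        return Regex.Match(input, pattern).Value;
    }

    public static void Main()
    {
        // [!\..] only accepts '!' or '.', so letters before TEST are never consumed
        // and a preceding dot is pulled into the match.
        Console.WriteLine(FirstMatch("ABC.MYTESTING", @"[!\..]*TEST.*")); // TESTING
        Console.WriteLine(FirstMatch("ANY.TESTING", @"[!\..]*TEST.*"));   // .TESTING

        // [^.] accepts anything except a dot, so the prefix "MY" is included.
        Console.WriteLine(FirstMatch("ABC.MYTESTING", @"[^.]*TEST.*"));   // MYTESTING

        // \w* restricts both sides to word characters.
        Console.WriteLine(FirstMatch("ABC.MYTESTING", @"\w*TEST\w*"));    // MYTESTING
    }
}
```

The second line of output reproduces the unwanted ".TESTING" result reported in the question.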
Regex r = new Regex(@"(?<TM>[A-Z]*TEST[A-Z]*)", RegexOptions.IgnoreCase);
That way you don't need to bother with word boundaries, though they don't hurt:
Regex r = new Regex(@"(?<TM>\b[A-Z]*TEST[A-Z]*\b)", RegexOptions.IgnoreCase);
Finally, if you're always taking the whole match anyway, you don't need to use a capturing group:
Regex r = new Regex(@"\b[A-Z]*TEST[A-Z]*\b", RegexOptions.IgnoreCase);
The matched text will be available via Match's Value property.
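Putting the pieces together, a minimal sketch of the whole lookup might look like the following. The sample string uses \n line breaks as an assumption about the question's input, and WholeWordDemo and FindWords are illustrative names only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WholeWordDemo
{
    // Hypothetical helper: collects every whole word that contains "TEST",
    // reading each result from Match.Value instead of a capturing group.
    public static List<string> FindWords(string text)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(text, @"\b[A-Z]*TEST[A-Z]*\b", RegexOptions.IgnoreCase))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        // Assumed input: the question's sample with \n between lines.
        string s = "ABC.MYTESTING\nXYZ.YOUTESTED\nANY.TESTING";
        foreach (string word in FindWords(s))
        {
            Console.WriteLine(word);
        }
        // Prints:
        // MYTESTING
        // YOUTESTED
        // TESTING
    }
}
```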