What regex can capture 2 exact 'words' in a phrase?
I'm trying to capture a constant sequence of words in a string. That constant is:
For the sake of example, let's say I'm looking for "Bob 1" in the following strings:
Hello, I'm Bob 1 --> Should capture Bob 1
Hello, I'm Bob 11 --> Should capture nothing (Bob 1 is not at the end or followed by a separator)
Hey, it's Bob-1 over there --> Should capture Bob-1
Hey, it's Bob - 1 over there --> Should capture nothing (Bob should be followed only by one separator not 3 like here)
Bob.1 --> Should capture Bob.1
Bob_1 rules! --> Should capture Bob_1
I have a regex that mostly works:
/Bob[\s._-]1[\s._-]/ig
In the second character list I don't know how to add the end of the string to the possible characters. As a result, only the last line in the live demo below, which should be a match, isn't captured.
I use PCRE (in PHP).
I'm not using PHP, but the following matches for me:
\bBob[\s.\-_]1\b
It makes use of \b, which matches at a word boundary. I found that I had to escape the dash inside the square brackets, which isn't something you are doing, but that may be a difference between the regex engines we are using.
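To check the pattern above against the six sample strings from the question, here is a minimal sketch. It assumes Python's re module rather than PCRE (the constructs used here, \b, \s and the character class, should behave the same way in both, but that is an assumption), and the names SAMPLES and capture are hypothetical helpers introduced only for this illustration.

```python
import re

# Pattern from the answer above; the escaped dash is kept as written.
PATTERN = re.compile(r"\bBob[\s.\-_]1\b")

# Sample strings from the question, paired with the capture the asker expects.
SAMPLES = [
    ("Hello, I'm Bob 1", "Bob 1"),
    ("Hello, I'm Bob 11", None),
    ("Hey, it's Bob-1 over there", "Bob-1"),
    ("Hey, it's Bob - 1 over there", None),
    ("Bob.1", "Bob.1"),
    ("Bob_1 rules!", "Bob_1"),
]


def capture(pattern, text):
    """Return the matched text, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


results = [capture(PATTERN, text) for text, _ in SAMPLES]
for (text, expected), got in zip(SAMPLES, results):
    print(f"{text!r}: got {got!r}, expected {expected!r}")
```

One caveat: the underscore counts as a word character, so \b does not see a boundary between 1 and a following _. A string such as "Bob 1_x" is therefore not matched by this pattern, although the pattern in the question would accept it.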
This works:
https://regex101.com/r/ezikuP/2
(?<!\S)Bob[\s._-]1(?![^\s._-])
(?<! \S ) # Whitespace boundary
Bob # Word 1
[\s._-] # Special separator
1 # Word 2
(?! [^\s._-] ) # Special separator boundary
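The commented breakdown above maps naturally onto a free-spacing pattern. The following is a rough sketch using Python's re.VERBOSE flag as a stand-in for PCRE's x modifier (an assumption; the asker uses PHP), with SAMPLES and capture as hypothetical helper names.

```python
import re

# Free-spacing version of the pattern above; whitespace and # comments
# inside the pattern are ignored because of re.VERBOSE.
PATTERN = re.compile(
    r"""
    (?<!\S)        # whitespace boundary: start of string or preceded by whitespace
    Bob            # word 1
    [\s._-]        # exactly one separator
    1              # word 2
    (?![^\s._-])   # the next character, if any, must be a separator
    """,
    re.VERBOSE,
)

# Sample strings from the question, paired with the capture the asker expects.
SAMPLES = [
    ("Hello, I'm Bob 1", "Bob 1"),
    ("Hello, I'm Bob 11", None),
    ("Hey, it's Bob-1 over there", "Bob-1"),
    ("Hey, it's Bob - 1 over there", None),
    ("Bob.1", "Bob.1"),
    ("Bob_1 rules!", "Bob_1"),
]


def capture(pattern, text):
    """Return the matched text, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


results = [capture(PATTERN, text) for text, _ in SAMPLES]
for (text, expected), got in zip(SAMPLES, results):
    print(f"{text!r}: got {got!r}, expected {expected!r}")
```

Because the trailing check is a negative lookahead, it succeeds at the end of the string as well as before a separator, and the separator itself is not part of the match. The leading (?<!\S) is stricter than \b: it rejects "HelloBob 1" but also anything preceded by punctuation, such as "(Bob 1".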
Which ends in only the last line in the live demo below that should be a match and that isn't captured.
For that you need a positive lookahead.
Regex: Bob[\s._-]1(?=[\s._-])
(?=[\s._-]) only checks that the next character belongs to the given character class; it won't match/capture it.
In the second list I don't know how to add the end of the string in the possible characters.
You can use this regex with the anchor $ to assert the end of the string:
/\bBob[\s._-]1(?:[\s._-]|$)/m
Or, if you don't want to match the character after the second word, use a lookahead:
/\bBob[\s._-]1(?=[\s._-]|$)/m
([\s._-]|$) asserts the presence of one of the given characters (whitespace, dot, underscore, hyphen) or the end of the line ($).
It is safer to add \b before Bob to match the exact word Bob and avoid matching HelloBob.
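The difference between the two variants above, and the effect of the |$ alternative, can be sketched as follows. This assumes Python's re module with re.MULTILINE in place of PCRE's m modifier; the names CONSUMING, LOOKAHEAD, NO_ANCHOR and capture are hypothetical and exist only for this illustration.

```python
import re

# Variant that consumes the trailing separator (or matches at end of line).
CONSUMING = re.compile(r"\bBob[\s._-]1(?:[\s._-]|$)", re.MULTILINE)
# Variant that only asserts the trailing separator or end of line.
LOOKAHEAD = re.compile(r"\bBob[\s._-]1(?=[\s._-]|$)", re.MULTILINE)
# For comparison: a lookahead without the $ alternative.
NO_ANCHOR = re.compile(r"Bob[\s._-]1(?=[\s._-])")


def capture(pattern, text):
    """Return the matched text, or None when nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None


text = "Hey, it's Bob-1 over there"
consumed = capture(CONSUMING, text)   # trailing space is part of the match
asserted = capture(LOOKAHEAD, text)   # trailing space is left out

at_end = capture(LOOKAHEAD, "Hello, I'm Bob 1")            # end of string is accepted
at_end_no_anchor = capture(NO_ANCHOR, "Hello, I'm Bob 1")  # needs a following character
too_long = capture(LOOKAHEAD, "Hello, I'm Bob 11")
glued = capture(LOOKAHEAD, "HelloBob 1")                   # rejected by the leading \b

print(repr(consumed), repr(asserted), repr(at_end))
print(repr(at_end_no_anchor), repr(too_long), repr(glued))
```

If the captured text should be exactly the two words and their separator, the lookahead variant is the more convenient one, since the consuming variant includes the trailing separator in the match whenever one is present.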