Ruby regular expressions: negative matching
I was wondering whether it is possible to use negative matching on whole words, so that something like [^(<em>.*?<\/em>)] would match everything except the text between (and including) <em>...</em>.
I was thinking about using negative lookahead, but I don't think it will work, as I need to check for the opening <em> as well.
Of course, I could just use the positive regex and then subtract the matches from the original text, but I'm looking for a more 'elegant' solution.
Thanks for any help.
String#split works as a negative match. It returns an array of the parts of the string that do not match the regex.
'XXXXXXXX<em>YYYYYYY</em>ZZZZZZZZ'.split(%r|<em>.*?</em>|)
# => ['XXXXXXXX', 'ZZZZZZZZ']
And if you want it back as a single string, just call join.
'XXXXXXXX<em>YYYYYYY</em>ZZZZZZZZ'.split(%r|<em>.*?</em>|).join
# => 'XXXXXXXXZZZZZZZZ'
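As a rough sketch of how this behaves with more than one <em> element, the snippet below uses a made-up sample string (`text` and its contents are assumptions for illustration, not from the question). It also shows `String#gsub` with an empty replacement, which should give the same string as split followed by join.

```ruby
# Hypothetical sample input with two <em> elements.
text = 'one <em>two</em> three <em>four</em> five'
pattern = %r|<em>.*?</em>|

# split returns the pieces of text that lie outside each match.
parts = text.split(pattern)

# Joining the pieces gives the text with every <em>...</em> removed.
joined = parts.join

# gsub with an empty replacement deletes the matches directly.
stripped = text.gsub(pattern, '')

p parts
p joined == stripped
```

Note that split drops trailing empty strings, so the array may not contain an entry for text after a match that ends the string; the joined result is unaffected.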
The whole point of lookaround is that it doesn't consume any of the input. If you want to match everything but a pattern, you want to match the prefix and the suffix of that pattern. To match the suffix, you need to consume (and throw away) the pattern that you don't want. But negative lookahead doesn't consume, so it cannot skip over the unwanted part.
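To illustrate the difference, here is a minimal sketch using the sample string from the answer above. The variable names and both regexes are assumptions written for this illustration; they are one possible way to show the point, not the only one.

```ruby
text = 'XXXXXXXX<em>YYYYYYY</em>ZZZZZZZZ'

# Negative lookahead alone: each character is accepted unless "<em>" starts there.
# The lookahead asserts but never consumes, so the scan resumes one character
# after "<" and the unwanted element leaks into the second match.
lookahead_only = text.scan(/(?:(?!<em>).)+/)

# Consuming approach: capture the text before each element, then consume the
# element itself (or the end of the string) outside the capture group.
consumed = text.scan(%r{(.*?)(?:<em>.*?</em>|\z)}m).flatten.reject(&:empty?)

p lookahead_only
p consumed
```

The second pattern gives the prefix and suffix only because the unwanted element is actually matched and discarded, which is what String#split does internally with far less effort.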