Ruby regular expressions: negative matching

I was wondering whether it is possible to use negative matching on whole words, so that something like [^(<em>.*?<\\/em>)] would match everything except the text between (and including) <em>...</em>.

I was thinking about using negative lookahead, but I don't think this will work, as I need to check for the opening <em> as well.

Of course, I could just use the positive regex and then subtract the matches from the original text, but I'm looking for a more 'elegant' solution.

Thanks for any help.

String#split works as a negative match: it returns an array of the parts of the string that do not match the regex.

'XXXXXXXX<em>YYYYYYY</em>ZZZZZZZZ'.split(%r|<em>.*?</em>|)
# => ['XXXXXXXX', 'ZZZZZZZZ']

And if you want it back as a string, just call join.

'XXXXXXXX<em>YYYYYYY</em>ZZZZZZZZ'.split(%r|<em>.*?</em>|).join
# => 'XXXXXXXXZZZZZZZZ'
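
As a variation on the snippets above, here is a minimal sketch of the same idea on a slightly messier input. The sample string and the constant name EM_TAG are made up for illustration; the /m flag is an assumption that the tag content may span several lines. It also shows that String#gsub with an empty replacement gives the same string as split followed by join.

```ruby
# Hypothetical pattern name; /m lets `.` match newlines inside the tag.
EM_TAG = %r|<em>.*?</em>|m

# Made-up sample input with two tags, one of them spanning a line break.
text = "before <em>first</em> middle <em>sec\nond</em> after"

# split keeps only the pieces that lie between the matches.
parts = text.split(EM_TAG)
# => ["before ", " middle ", " after"]

# Removing the matches directly yields the same string as split + join.
stripped = text.gsub(EM_TAG, '')
# => "before  middle  after"

puts parts.inspect
puts stripped
```

If only the joined string is needed, gsub avoids building the intermediate array; split is the better fit when the individual non-matching pieces matter.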

The thing about lookaround is that it doesn't consume any of the input. If you want to match everything but a pattern, you want to match the prefix and the suffix of that pattern. To match the suffix, you probably want to consume (and throw away) the pattern that you don't want. But negative lookahead doesn't consume.
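
To make the point about consumption concrete, here is a small sketch using the sample string from the earlier answer. The variable names naive and kept are made up for illustration, and the alternation-based pattern is just one possible way to consume the unwanted part, not the only one.

```ruby
text = 'XXXXXXXX<em>YYYYYYY</em>ZZZZZZZZ'

# Negative lookahead alone: it only refuses to match at the position where
# "<em>" starts. Nothing is consumed, so the scan resumes one character
# later and happily matches "em>", the tag content and the closing tag.
naive = text.scan(/(?!<em>)./m).join
# => "XXXXXXXXem>YYYYYYY</em>ZZZZZZZZ"

# Consuming the unwanted pattern: the first alternative swallows a whole
# <em>...</em> element (leaving the capture group nil), the second one
# captures a run of characters that does not start a new <em> tag.
kept = text.scan(/<em>.*?<\/em>|((?:(?!<em>).)+)/m).flatten.compact
# => ["XXXXXXXX", "ZZZZZZZZ"]

puts naive
puts kept.inspect
```

The second pattern still uses a lookahead, but only to stop the wanted run before the next tag; the tag itself is removed by actually matching it.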
